OCI AI Services
Oracle University Podcast - A podcast by Oracle Corporation - Tuesdays
Categories:
Listen to Lois Houston and Nikita Abraham, along with Senior Principal Product Manager Wes Prichard, as they explore the five core components of OCI AI services: language, speech, vision, document understanding, and anomaly detection, to help you make better sense of all that unstructured data around you. Oracle MyLearn: https://mylearn.oracle.com/ou/learning-path/become-an-oci-ai-foundations-associate-2023/127177 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X (formerly Twitter): https://twitter.com/Oracle_Edu Special thanks to Arijit Ghosh, David Wright, Himanshu Raj, and the OU Studio Team for helping us create this episode. -------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we’ll bring you foundational training on the most popular Oracle technologies. Let’s get started! 00:26 Nikita: Welcome to the Oracle University Podcast! I’m Nikita Abraham, Principal Technical Editor with Oracle University, and with me is Lois Houston, Director of Innovation Programs. Lois: Hi there! In our last episode, we spoke about OCI AI Portfolio, including AI and ML services, and the OCI AI infrastructure. Nikita: Yeah, and in today’s episode, we’re going to continue down a similar path and take a closer look at OCI AI services. 00:55 Lois: With us today is Senior Principal Product Manager, Wes Prichard. Hi Wes! It’s lovely to have you here with us. Hemant gave us a broad overview of the various OCI AI services last week, but we’re really hoping to get into each of them with you. So, let’s jump right in and start with the OCI Language service. What can you tell us about it? Wes: OCI Language analyzes unstructured text for you. It provides models trained on industry data to perform language analysis with no data science experience needed. 01:27 Nikita: What kind of big things can it do? Wes: It has five main capabilities. First, it detects the language of the text. It recognizes 75 languages, from Afrikaans to Welsh. It identifies entities, things like names, places, dates, emails, currency, organizations, phone numbers--14 types in all. It identifies the sentiment of the text, and not just one sentiment for the entire block of text, but the different sentiments for different aspects. 01:56 Nikita: What do you mean by that, Wes? Wes: So let's say you read a restaurant review that said, the food was great, but the service sucked. You'll get food with a positive sentiment and service with a negative sentiment. And it also analyzes the sentiment for every sentence. Lois: Ah, that’s smart. Ok, so we covered three capabilities. What else? Wes: It identifies key phrases in the text that represent the important ideas or subjects. And it classifies the general topic of the text from a list of 600 categories and subcategories. 02:27 Lois: Ok, and then there’s the OCI Speech service... Wes: OCI Speech is very straightforward. It locks the data in audio tracks by converting speech to text. Developers can use Oracle's time-tested acoustic language models to provide highly accurate transcription for audio or video files across multiple languages. OCI Speech automatically transcribes audio and video files into text using advanced deep learning techniques. There's no data science experience required. It processes data directly in object storage. And it generates timestamped, grammatically accurate transcriptions. 03:01 Nikita: What are some of the main features of OCI Speech? Wes: OCI Speech supports multiple languages, specifically English, Spanish, and Portuguese, with more coming in the future. It has batching support where multiple files can be submitted with a single call. It has blazing fast processing. It can transcribe hours of audio in less than 10 minutes. It does this by chunking up your audio into smaller segments, and transcribing each segment, and then joining them all back together into a single file. It provides a confidence score, both per word and per transcription. It punctuates transcriptions to make the text more readable and to allow downstream systems to process the text with less friction. And it has SRT file support. 03:45 Lois: SRT? What’s that? Wes: SRT is the most popular closed caption output file format. And with this SRT support, users can add closed captions to their video. OCI Speech makes transcribed text more readable to resemble how humans write. This is called normalization. And the service will normalize things like addresses, times, numbers, URLs, and more. It also does profanity filtering, where it can either remove, mask, or tag profanity and output text, where removing replaces the word with asterisks, and masking does the same thing, but it retains the first letter, and tagging will leave the word in place, but it provides tagging in the output data. 04:29 Nikita: And what about OCI Vision? What are its capabilities? Wes: Vision is a computed vision service that works on images, and it provides two main capabilities-- image analysis and document AI. Image analysis analyzes photographic images. Object detection is the feature that detects objects inside an image using a bounding box and assigning a label to each object with an accuracy percentage. Object detection also locates and extracts text that appears in the scene, like on a sign. Image classification will assign classification labels to the image by identifying the major features in the scene. One of the most powerful capabilities of image analysis is that, in addition to pretrained models, users can retrain the models with their own unique data to fit their specific needs. 05:20 Lois: So object detection and image classification are features of image analysis. I think I got it! So then what’s document AI? Wes: It's used for working with document images. You can use it to understand PDFs or document image types, like JPEG, PNG, and Tiff, or photographs containing textual information. 05:40 Lois: And what are its most important features? Wes: The features of document AI are text recognition, also known as OCR or optical character recognition. And this extracts text from images, including non-trivial scenarios, like handwritten texts, plus tilted, shaded, or rotated documents. Document classification classifies documents into 10 different types based on visual appearance, high-level features, and extracted keywords. This is useful when you need to process a document, based on its classification, like an invoice, a receipt, or a resume. Language detection analyzes the visual features of text to determine the language rather than relying on the text itself. Table extraction identifies tables in docs and extracts their content in tabular form. Key value extraction finds values for 13 common fields and line items in receipts, things like merchant name and transaction date. 06:41 Want to get the inside scoop on Oracle University? Head over to the Oracle University Learning Community. Attend exclusive events. Read up on the latest news. Get first-hand access to new products. Read the OU Learning Blog. Participate in Challenges. And stay up-to-date with upcoming certification opportunities. Visit mylearn.oracle.com to get started. 07:06 Nikita: Welcome back! Wes, I want to ask you about OCI Anomaly Detection. We discussed it a bit last week and it seems like such an intelligent and efficient service. Wes: Oracle Cloud Infrastructure Anomaly Detection identifies anomalies in time series data. Equipment sensors generate time series data, but all kinds of business metrics are also time-based. The unique feature of this service is that it finds anomalies, not just in a single signal, but across many signals at once. That's important because machines often generate multiple signals at once and the signals are often related. 07:42 Nikita: Ok you need to give us an example of this! Wes: Think of a pump that has an output pressure, a flow rate, an RPM, and an electrical current draw. When a pump's going to fail, anomalies may appear across several of those signals but at different times. OCI Anomaly Detection helps you to identify anomalies in a multivariate data set by taking advantage of the interrelationship among signals. The service contains algorithms for both multi-signal, as in multivariate, single signal, as in univariate anomaly detection, and it automatically determines which algorithm to use based on the training data provided. The multivariate algorithm is called MSET-2, which stands for Multivariate State Estimation technique, and it's unique to Oracle. 08:28 Lois: And the 2? Wes: The 2 in the name refers to the patented enhancements by Oracle labs that automatically identify and fix data quality issues resulting in fewer false alarms and more accurate results. Now unlike some of the other AI services, OCI Anomaly Detection is always trained on the customer's data. It's trained using actual historical data with no anomalies, and there can be as many different trained models as needed for different sets of signals. 08:57 Nikita: So where would one use a service like this? Wes: One of the most obvious applications of this service is for predictive maintenance. Early warning of a problem provides the opportunity to deploy maintenance resources and schedule downtime to minimize disruption to the business. 09:12 Lois: How would you train an OCI Anomaly Detection model? Wes: It's a simple four-step process to prepare a model that can be used for anomaly detection. The first step is to obtain training data from the system to be monitored. The data must contain no anomalies and should cover the normal range of values that would be experienced in a full business cycle. Second, the training data file is uploaded to an object storage bucket. Third, a data set is created for the training data. So a data set in this context is an object in the OCI Anomaly Detection service to manage data used for training and testing models. And fourth, the model is trained. A wizard in the user interface steps the user through the required inputs, such as the training data set and some training parameters like the target false alarm probability. 10:02 Lois: How would this service know about the data and whether the trained model is univariate or multivariate? Wes: When training OCI Anomaly Detection models, the user does not need to specify whether the intended model is for multivariate or univariate data. It does this detection automatically. For example, if a model is trained with 10 signals and 5 of those signals are determined to be correlated enough for multivariate anomaly detection, it will create an internal multivariate model for those signals. If the other five signals are not correlated with each other, it will create an internal univariate model for each one. From the user's perspective, the result will be a single OCI anomaly detection model for the 10 signals. But internally, the signals are treated differently based on the training. A user can also train a model on a single signal and it will result in a univariate model. 10:55 Lois: What does this OCI Anomaly Detection model training entail? How does it ensure that it does not have any false alarms? Wes: Training a model requires a single data file with no anomalies that should cover a complete business cycle, which means it should represent all the normal variations in the signal. During training, OCI Anomaly Detection will use a portion of the data for training and another portion for automated testing. The fraction used for each is specified when the model is trained. When model training is complete, it's best practice to do another test of the model with a data set containing anomalies to see if the anomalies are detected and if there are any false alarms. Based on the outcome, the user may want to retrain the model and specify a different false alarm probability, also called F-A-P or FAP. The FAP is the probability that the model would produce a false alarm. The false alarm probability can be thought of as the sensitivity of the model. The lower the false alarm probability, the less likelihood of it reporting a false alarm, but the less sensitive it will be to detecting anomalies. Selecting the right FAP is a business decision based on the need for sensitive detections balanced by the ability to tolerate false alarms. Once a model has been trained and the user is satisfied with its detection performance, it can then be used for inferencing. 12:23 Nikita: Inferencing? Is that what I think it is? Wes: New data is submitted to the model and OCI Anomaly Detection will respond with anomalies that are detected. The input data must contain the same signals that the model was trained on. So, for example, if the model was trained on signals A, B, C, and D, then for detection inferencing, the same four signals must be provided. No more, no less. 12:46 Lois: Where can I find the features of OCI Anomaly Detection that you mentioned? Wes: The training and inferencing features of OCI Anomaly Detection can be accessed through the OCI console. However, a human-driven interface is not efficient for most business scenarios. In most cases, automating the detection of anomalies through software is preferred to be able to process hundreds or thousands of signals using many trained models. The service provides multiple software interfaces for this purpose. Each trained model is accessible through a REST API and an HTTP endpoint. Additionally, programming language-specific SDKs are available for multiple languages, including Python. Using the Python SDK, data scientists can work with OCI Anomaly Detection for both training and inferencing in an OCI Data Science notebook. 13:37 Nikita: How can a data scientist take advantage of these capabilities? Wes: Well, you can write code against the REST API or use any of the various language SDKs. But for data scientists working in OCI Data Science, it makes sense to use Python. 13:51 Lois: That’s exciting! What does it take to use the Python SDK in a notebook… to be able to use the AI services? Wes: You can use a Notebook session in OCI Data Science to invoke the SDK for any of the AI services. This might be useful to generate new features for a custom model or simply as a way to consume the service using a familiar Python interface. But before you can invoke the SDK, you have to prepare the data science notebook session by supplying it with an API Signing Key. Signing Key is unique to a particular user and tenancy and authenticates that user to OCI when invoking the SDK. So therefore, you want to make sure you safeguard your Signing Key and never share it with another user. 14:34 Nikita: And where would I get my API Signing Key? Wes: You can obtain an API Signing Key from your user profile in the OCI Console. Then you save that key as a file to your local machine. The API Signing Key also provides commands to be added to a config file that the SDK expects to find in the environment, where the SDK code is executing. The config file then references the key file. Once these files are prepared on your local machine, you can upload them to the Notebook session, where you will execute SDK code for the AI service. The API Signing Key and config file can be reused with any of your notebook sessions, and the same files also work for all of the AI services. So, the files only need to be created once for each user and tenancy combination. 15:27 Lois: Thank you so much, Wes, for this really insightful discussion. To learn more about the topics covered today, you can visit mylearn.oracle.com and search for the Oracle Cloud Infrastructure AI Foundations course. Nikita: And remember, that course prepares you for the Oracle Cloud Infrastructure AI Foundations Associate certification that you can take for free! So, don’t wait too long to check it out. Join us next week for another episode of the Oracle University Podcast. Until then, this is Nikita Abraham… Lois Houston: And Lois Houston, signing off! 16:03 That’s all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We’d also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.