Metrics are available to analyze the custom
model's performance.
Create the Dataset
Vision custom models are intended for users without a
data science background. By creating a dataset and instructing Vision to train a model on it, you can have a
custom model ready for your scenario.
The key to building a useful custom model is preparing and training it with a good
dataset. Vision supports the following dataset
format:
Collect a dataset that's representative of the problem and space you intend to
apply the trained model to. While data from other domains might work, a dataset
generated from the same intended devices, environments, and conditions of use
outperforms any other.
Data labeling is the process of identifying properties of records, such as
documents, text, and images, and annotating them with labels that identify those
properties. The caption of an image and the identification of an object in an image are both
examples of data labels. You can use Oracle Cloud Infrastructure Data Labeling
to do the data labeling. For more information, see the Data Labeling service guide. Here is an
outline of the steps to take:
Collect enough images to match the distribution of the intended
application.
When choosing how many images you need for your dataset, use as many images
as you can in your training dataset. For each label to be detected, provide
at least 10 images, and ideally 50 or more. The more images you provide, the
better the detection robustness and accuracy. Robustness is the ability to
generalize to new conditions, such as view angle or background.
Collect a variety of images that capture different camera angles, lighting
conditions, backgrounds, and so on.
Provide enough perspectives in the images, because the model uses not only
the annotations to learn what is correct, but also the background to learn what is
wrong. For example, provide views from different sides of the detected object, under
different lighting conditions, from different image capture devices, and so on.
Label all instances of the objects that occur in the sourced dataset.
Keep the labels consistent: if you label many apples together as one apple, do so
consistently in each image. Don't leave space between an object and its
bounding box. The bounding boxes must closely match the objects they label.
Important
Verify each annotation, because annotation quality is important for
the model's performance.
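The tight-bounding-box rule above can be sanity-checked programmatically before training. The following sketch is illustrative only: the tuple layout for a box is a simplified assumption, not the Data Labeling export format. It flags boxes that fall outside the image bounds or have no area.

```python
# Sketch: sanity-check bounding-box annotations before training.
# The (x_min, y_min, x_max, y_max) tuple layout is a simplified
# assumption, not an official Data Labeling export format.

def invalid_boxes(width, height, boxes):
    """Return indices of boxes that fall outside the image or have no area."""
    bad = []
    for i, (x0, y0, x1, y1) in enumerate(boxes):
        out_of_bounds = x0 < 0 or y0 < 0 or x1 > width or y1 > height
        degenerate = x1 <= x0 or y1 <= y0  # zero or negative width/height
        if out_of_bounds or degenerate:
            bad.append(i)
    return bad

boxes = [
    (10, 10, 120, 90),   # fine
    (-5, 0, 60, 40),     # extends past the left edge
    (30, 30, 30, 80),    # zero width
]
print(invalid_boxes(640, 480, boxes))  # -> [1, 2]
```

A check like this catches annotation-tool export glitches early, before they cost you a full training run.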
Building a Custom Model 🔗
Build custom models in Vision to extract insights
from images without needing data scientists.
You need the following before building a custom model:
A paid tenancy account in Oracle Cloud Infrastructure.
Familiarity with Oracle Cloud Infrastructure Object Storage.
Using the Console, you can create a Vision project and train an image classification or
object detection model.
Create a project.
From the Vision home page, under
Custom Models, select
Projects.
Select Create project.
Select the compartment for the project.
Enter a Name and description for the project.
Avoid entering confidential information.
Select Create project.
In the list of projects, select the name of the project that you created.
On the project details page, select Create Model.
Select the Model type to train: Image classification or
Object detection.
Select the training data.
If you don't have any annotated images, select Create a new
dataset.
You're taken to OCI Data Labeling, where
you can create a dataset and add labels or draw bounding boxes over
the image content. For more information, see Creating a Dataset and the
section on labeling images in the
Data Labeling documentation.
If you have an existing annotated dataset, select Choose
existing dataset and then select the data source:
If you annotated the dataset in Data Labeling, select Data
labeling service and then select the dataset.
If you annotated the images by using a third-party tool, select
Object storage and then select the bucket that
contains the images.
Select Next.
Enter a display name for the custom model.
(Optional)
Give the model a description to help you find it.
Select the Training duration.
Recommended training: Vision automatically selects the training
duration to create the best model. The training might take up to 24
hours.
Quick training: This option produces a model
that's not fully optimized but is available in about an hour.
Custom: This option lets you set your own
maximum training duration (in hours).
Select Next.
Review the information you provided in the previous steps. To make any changes,
select Previous.
When you want to start training the custom model, select Create and
train.
Use the create command and required parameters to
create a
project:
Command
oci ai-vision project create [OPTIONS]
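A minimal sketch of a project-create request follows. The field names are assumptions meant to illustrate the shape of the request; confirm them with `oci ai-vision project create --help` and the API reference. OCI create commands generally accept a full request body from a JSON file via `--from-json file://...`, which the sketch builds.

```python
import json

# Sketch only: field names should be confirmed against the Vision API
# reference before use. The compartment OCID below is a placeholder.
project_payload = {
    "compartmentId": "ocid1.compartment.oc1..example",
    "displayName": "defect-detection",
    "description": "Custom model project for surface-defect images",
}

with open("create_project.json", "w") as f:
    json.dump(project_payload, f, indent=2)

# Then, assuming --from-json is supported for this command:
#   oci ai-vision project create --from-json file://create_project.json
print(sorted(project_payload))  # -> ['compartmentId', 'description', 'displayName']
```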
Use
the create command and required parameters to
create a
model:
Command
oci ai-vision model create [OPTIONS]
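The model-create request mirrors the console choices described earlier: a model type, a training dataset, and an optional maximum training duration. The sketch below builds such a request body; the field names and enum values are assumptions based on those console options, so check them against the Vision API reference before use.

```python
import json

# Sketch only: field names and enum values are assumptions based on the
# console options; confirm them against the Vision API reference.
# All OCIDs below are placeholders.
model_payload = {
    "compartmentId": "ocid1.compartment.oc1..example",
    "projectId": "ocid1.aivisionproject.oc1..example",
    "displayName": "shelf-object-detector",
    "modelType": "OBJECT_DETECTION",  # or IMAGE_CLASSIFICATION
    "trainingDataset": {
        # Dataset annotated in the Data Labeling service
        "datasetType": "DATA_SCIENCE_LABELING",
        "datasetId": "ocid1.datalabelingdataset.oc1..example",
    },
    # Omit for recommended training; set it for a custom maximum duration.
    "maxTrainingDurationInHours": 4,
}

with open("create_model.json", "w") as f:
    json.dump(model_payload, f, indent=2)

# Then, assuming --from-json is supported for this command:
#   oci ai-vision model create --from-json file://create_model.json
print(model_payload["modelType"])  # -> OBJECT_DETECTION
```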
For a complete list of flags and variable options for CLI commands, see the CLI Command Reference.
First, run the CreateProject operation to create a
project.
Then run the CreateModel operation to create a
model.
Train the Custom Model 🔗
After creating your dataset, you can train your custom model.
Train your model using one of Vision's custom
model training modes. The training modes are:
Recommended training: Vision
automatically selects the training duration to create the best model. The
training might take up to 24 hours.
Quick training: This option produces a model that's not fully optimized
but is available in about an hour.
Custom duration: This option lets you set your own maximum training
duration.
The best training duration depends on the complexity of your detection problem, the typical
number of objects in an image, the resolution, and other factors. Consider these needs,
and allocate more time as the training complexity increases. The minimum recommended
training time is 30 minutes. A longer training time gives greater accuracy,
but with diminishing returns over time. Use the quick training mode to get an
idea of the smallest amount of time it takes to get a model that provides reasonable
performance. Use the recommended mode to get a base optimized model. If you want a
better result, increase the training time.
Call the Custom Model 🔗
Custom models are called the same way as pretrained
models.
You can call the custom model to analyze images as a single request or as a batch
request. You must have completed these steps first:
Open the navigation menu and click Analytics & AI. Under AI
Services, click Vision.
On the Vision page, click Video Analysis.
Select the compartment where you want to store the results.
Select the location of the video:
Demo
Local file
Object storage
(Optional) If you selected Demo, click
Analyze demo video to start the
analysis.
(Optional) If you selected Local file:
Select a bucket from the list. If the bucket is in a different
compartment, click Change
compartment.
(Optional) Enter a prefix in the Add prefix text
field.
Drag the video file to the Select file area, or
click select one... and browse to the video.
Click Upload and analyze. The
Pre-Authenticated URL for video dialog box is
displayed.
(Optional) Copy the URL.
Click Close.
If you selected Object storage, enter the
video URL and click Analyze.
The analyzeVideo API is invoked, and the model immediately analyzes the
video. The status of the job is displayed.
The Results area has tabs for Label detection, Object
detection, Text detection, and Face detection, each with confidence scores, and
for the request and response JSON.
(Optional)
To stop the running job, click Cancel.
(Optional)
To change the output location, click Change output
location.
(Optional)
To select what is analyzed, click Video analysis
capabilities, and select as appropriate from:
Label detection
Object detection
Text detection
Face detection
(Optional)
To generate code for video inferencing, click Code for video
inferencing.
(Optional)
To analyze videos again, click Video job tracker, and
select Recently uploaded videos from the menu.
Click the video you want to analyze.
Click Analyze.
To see the status of a video analysis job, click Video job
tracker, and select Get job status from
the menu.
Enter the job OCID.
Click Get job status.
(Optional) To stop the running job, click Cancel.
(Optional) To get the status of another job, click Get
another video job status.
(Optional) To get the JSON response, click Fetch response
data.
(Optional) To remove a job status, click
Remove.
Use the analyze-video command and required
parameters to analyze the
video:
Command
oci ai-vision analyze-video [OPTIONS]
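The request can select which of the capabilities listed earlier to run. The sketch below builds such a request body; the feature names mirror the console capabilities, but the exact request schema is an assumption, so check it against the Vision API reference.

```python
import json

# Sketch only: the feature names mirror the console capabilities
# (label, object, text, and face detection); the exact request schema
# is an assumption. The Object Storage names below are placeholders.
video_request = {
    "features": [
        {"featureType": "LABEL_DETECTION"},
        {"featureType": "OBJECT_DETECTION"},
        {"featureType": "TEXT_DETECTION"},
        {"featureType": "FACE_DETECTION"},
    ],
    "video": {
        "source": "OBJECT_STORAGE",
        "namespaceName": "mynamespace",
        "bucketName": "videos",
        "objectName": "store-walkthrough.mp4",
    },
}

with open("analyze_video.json", "w") as f:
    json.dump(video_request, f, indent=2)

# Then, assuming --from-json is supported for this command:
#   oci ai-vision analyze-video --from-json file://analyze_video.json
print(len(video_request["features"]))  # -> 4, one entry per capability
```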
For
a complete list of flags and variable options for CLI commands, see the CLI Command Reference.
Run the AnalyzeVideo operation to analyze a
video.
Custom Model Metrics 🔗
The following metrics are provided for custom models in Vision.
mAP@0.5 score
The mean Average Precision (mAP) score with an intersection-over-union threshold of 0.5 is
provided only for custom object detection models. It's calculated by taking the mean of
the Average Precision over all classes. It ranges from 0.0 to 1.0, where 1.0 is the best result.
Precision
The fraction of relevant instances among the retrieved instances.
Recall
The fraction of relevant instances that were retrieved.
Threshold
The decision threshold to make a class prediction for the metrics.
Total images
The total number of images used for training and testing.
Test images
The number of images from the dataset that were used for testing and not used for
training.
Training duration
The length of time in hours that the model was trained.
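To make these definitions concrete, the sketch below computes precision and recall from true-positive, false-positive, and false-negative counts, and the intersection over union (IoU) that the 0.5 threshold in mAP@0.5 refers to. The box tuple format is an illustrative assumption.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision(tp, fp):
    """Fraction of retrieved instances that are relevant."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Fraction of relevant instances that were retrieved."""
    return tp / (tp + fn)

# A predicted box counts as a true positive for mAP@0.5 only if it
# overlaps a same-class ground-truth box with IoU >= 0.5.
print(round(iou((0, 0, 10, 10), (5, 0, 15, 10)), 3))  # -> 0.333
print(precision(8, 2))  # -> 0.8
print(round(recall(8, 4), 3))  # -> 0.667
```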