Prediction Services

ML Applications provides a secure way to expose prediction services on stable prediction endpoints while handling authentication and authorization. ML Applications supports both IAM and OAuth authentication.

Prediction endpoints remain stable and unaffected by backend changes. Users can find the exact endpoint URL in their ML Application Instance details. To construct a request manually, you need only the following:
Region
The region where the ML Application Instance was provisioned.
Instance ID
The OCID of the ML Application Instance.
Use case name
The prediction use case, which must match the display name of the model deployment used by the ML Application Instance.
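As a purely illustrative sketch, the three pieces above combine into an endpoint URL. The host and path template below are hypothetical placeholders, not a documented format; always copy the real URL from the ML Application Instance details:

```python
# All values are illustrative placeholders; copy the actual endpoint URL
# from the ML Application Instance details rather than constructing it.
region = "us-ashburn-1"
instance_id = "ocid1.datasciencemlappinstance.oc1..exampleuniqueid"
use_case = "summarization"  # must match the model deployment's display name

# Hypothetical URL shape -- the real template is shown in the instance details.
endpoint = f"https://prediction.{region}.example.com/{instance_id}/predict/{use_case}"
print(endpoint)
```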

Client applications don't interact directly with model deployments used in the implementation. Instead, prediction requests are routed based on the instance ID and the specified use case, ensuring that:

  • requests are dispatched to the correct model deployment.
  • each prediction request, when processed by the model deployment, runs under the identity of the ML Application Instance (datasciencemlappinstance Resource Principal).

Model deployments serve as the implementation backend for prediction services, offering:

  • High availability using a load-balanced infrastructure for reliability.
  • Autoscaling that adjusts capacity automatically based on demand.

For more details on model deployments, see the Model Deployments chapter.

You can test your prediction services using the OCI CLI in the Cloud Shell:
oci raw-request \
 --http-method POST \
 --target-uri "<Copy_URI_from_the_ML_Application_instance_view_detail>" \
 --request-body '{"data":"<your payload>"}'
Note

You must grant ML Applications permission to dispatch prediction requests to your backing model deployments.
For example, you can define the following policy:
allow any-user to {DATA_SCIENCE_MODEL_DEPLOYMENT_PREDICT}
  in compartment id <COMPARTMENT>
  where all {request.principal.type = 'datasciencemlapp', request.principal.compartment.id = target.compartment.id}
Note

Because the OCI SDK doesn't provide built-in support for prediction calls, implement standard resiliency patterns in your client application yourself, such as client-side retries with reasonable timeouts and circuit breakers.
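As a minimal sketch of such a client-side retry, the helper below is illustrative and not part of any OCI SDK; the request callable is an assumption standing in for however you issue the signed prediction call:

```python
import time

def call_with_retries(send_request, max_attempts=3, base_delay=1.0,
                      retryable=(TimeoutError, ConnectionError)):
    """Invoke send_request(), retrying transient failures with exponential backoff.

    send_request is any zero-argument callable that performs the signed
    prediction call (for example, a signed HTTP POST to the prediction
    endpoint) and raises on transient transport errors.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send_request()
        except retryable:
            if attempt == max_attempts:
                raise  # exhausted all attempts; surface the error to the caller
            time.sleep(base_delay * (2 ** (attempt - 1)))  # 1s, 2s, 4s, ...
```

Pair this with a per-request timeout inside the callable so a hung connection cannot stall all attempts, and consider a circuit breaker if many clients share one endpoint.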