Prediction Services
ML Applications provides a secure way to expose prediction services on stable prediction endpoints while handling authentication and authorization. ML Applications supports both IAM and OAuth authentication.
Prediction endpoints remain stable and unaffected by backend changes. Users can find the
exact endpoint URL in their ML Application Instance details. To construct a request manually,
you need only the following:
- Region: the region where the ML Application Instance was provisioned.
- Instance ID: the OCID of the ML Application Instance.
- Use case name: the prediction use case, which must match the display name of the model deployment used by the ML Application Instance.
Client applications don't interact directly with model deployments used in the implementation. Instead, prediction requests are routed based on the instance ID and the specified use case, ensuring that:
- requests are dispatched to the correct model deployment.
- each prediction request, when processed by the model deployment, runs under the identity of the ML Application Instance (the datasciencemlappinstance Resource Principal).
Model deployments serve as the implementation backend for prediction services, offering:
- High availability through load-balanced infrastructure.
- Autoscaling that adjusts capacity automatically based on demand.
For more details on model deployments, see the Model Deployments chapter.
You can test your prediction services using the oci CLI in Cloud Shell:

oci raw-request \
  --http-method POST \
  --target-uri "<Copy_URI_from_the_ML_Application_instance_view_detail>" \
  --request-body '{"data":"<your payload>"}'
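If you call the endpoint repeatedly, the raw-request invocation can be wrapped in a small shell function so the endpoint URI and payload are passed as parameters. This is a minimal sketch: the `predict` function name is illustrative, not part of the oci CLI, and the URI must still be copied from the instance detail view.

```shell
# Minimal sketch: wrap the prediction call so the endpoint URI (copied from
# the ML Application Instance detail view) and the payload are parameters.
# The function name "predict" is illustrative, not part of the oci CLI.
predict() {
  local uri=$1
  local payload=$2
  oci raw-request \
    --http-method POST \
    --target-uri "$uri" \
    --request-body "$payload"
}
```

Usage: predict "<Copy_URI_from_the_ML_Application_instance_view_detail>" '{"data":"example"}'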
Note
Grant ML Applications permission to dispatch prediction requests to your backend model deployments.
For example, you can define the policy as follows:
allow any-user to {DATA_SCIENCE_MODEL_DEPLOYMENT_PREDICT}
in compartment id <COMPARTMENT>
where all {request.principal.type = 'datasciencemlapp', request.principal.compartment.id = target.compartment.id}
Note
Because the OCI SDK doesn't provide built-in support for prediction calls, you must implement standard resiliency patterns yourself, such as client-side retries with reasonable timeouts and circuit breakers — patterns the OCI SDK otherwise provides by default.
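Since retries aren't provided for you, a basic client-side retry with backoff can be sketched in shell. The `retry` helper below is illustrative, not part of the oci CLI; the attempt count and delay are assumptions you should tune for your workload.

```shell
# Minimal sketch of client-side retries with a fixed delay between attempts.
# "retry" is an illustrative helper, not part of the oci CLI.
retry() {
  local max_attempts=$1
  local delay=$2
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "command failed after $max_attempts attempts" >&2
      return 1
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example: retry the prediction call up to 3 times, 2 seconds apart.
# retry 3 2 oci raw-request \
#   --http-method POST \
#   --target-uri "<Copy_URI_from_the_ML_Application_instance_view_detail>" \
#   --request-body '{"data":"<your payload>"}'
```

A fixed delay keeps the sketch short; production clients typically use exponential backoff with jitter so that many clients retrying at once don't overload the model deployment.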