Predict Endpoint
The /predict endpoint of a model deployment lets clients submit input data and receive the complete prediction result in a single response. This endpoint is suitable for scenarios where the entire prediction output is required immediately.
HTTP Status Code | Error Code | Description | Retry
---|---|---|---
200 | None | Success. | None
404 | NotAuthorizedOrNotFound | Model deployment not found or authorization failed. | No
405 | MethodNotAllowed | Method not allowed. | No
411 | LengthRequired | Missing content length header. | No
413 | PayloadTooLarge | The payload size limit is 10 MB. | No
429 | TooManyRequests | Too many requests. | Yes, with backoff
500 | InternalServerError | Internal server error. | Yes, with backoff
503 | ServiceUnavailable | Model server unavailable. | Yes, with backoff
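Statuses marked "Yes, with backoff" can be retried client-side. A minimal sketch with exponential backoff, assuming endpoint, input_data, and auth are set up as in the SDK example below (predict_with_backoff and RETRYABLE are illustrative names, not part of the service API):

import time
import requests

# Status codes the table above marks as retryable.
RETRYABLE = {429, 500, 503}

def predict_with_backoff(endpoint, input_data, auth, max_attempts=5):
    """POST a JSON payload to /predict, retrying retryable statuses with exponential backoff."""
    response = None
    for attempt in range(max_attempts):
        response = requests.post(
            endpoint,
            data=input_data,
            auth=auth,
            headers={"Content-Type": "application/json"},
        )
        if response.status_code not in RETRYABLE:
            break
        time.sleep(2 ** attempt)  # back off 1s, 2s, 4s, ... before retrying
    return response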
Invoking with the OCI Python SDK
import json

import oci
import requests
from oci.signer import Signer

# Model deployment endpoint. Here we assume that the notebook region is the
# same as the region where the model deployment occurs. Alternatively, open
# the details page of your model deployment in the OCI Console: under
# "Invoke Your Model", you will find the HTTP endpoint of your model.
endpoint = "<your-model-deployment-uri>"

# Your payload, as a JSON string:
input_data = "<your-json-payload-str>"

using_rps = True  # set to False to authenticate with an OCI config file and API key
if using_rps:
    # Use a resource principal (available, for example, in notebook sessions).
    auth = oci.auth.signers.get_resource_principals_signer()
else:
    # Use an OCI config file and API key.
    config = oci.config.from_file("~/.oci/config")  # replace with the location of your OCI config file
    auth = Signer(
        tenancy=config["tenancy"],
        user=config["user"],
        fingerprint=config["fingerprint"],
        private_key_file_location=config["key_file"],
        pass_phrase=config["pass_phrase"],
    )

# POST the request to the model endpoint:
response = requests.post(
    endpoint,
    data=input_data,
    auth=auth,
    headers={"Content-Type": "application/json"},
)

# Check the response status; a successful call returns HTTP 200.
assert response.status_code == 200, "Request made to the model predict endpoint was unsuccessful"

# Print the model predictions, assuming the model returns a JSON object.
print(response.json())
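The payload schema is defined by the score.py handler deployed with your model, not by the /predict endpoint itself. As a purely hypothetical example, a handler that expects a top-level "data" field could be invoked like this, reusing endpoint and auth from above:

# Hypothetical payload; the expected fields depend entirely on your model's score.py.
input_data = json.dumps({"data": [[5.1, 3.5, 1.4, 0.2]]})
response = requests.post(
    endpoint,
    data=input_data,
    auth=auth,
    headers={"Content-Type": "application/json"},
)
print(response.json())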
Invoking with the OCI CLI
You can invoke a model deployment from the OCI CLI. The CLI is included in the OCI Cloud Shell environment and is preauthenticated. This example invokes a model deployment with the CLI:
oci raw-request --http-method POST --target-uri <model-deployment-url>/predict --request-body '{"data": "data"}'
# Enable a realm-specific endpoint: https://docs.oracle.com/iaas/tools/oci-cli/3.56.0/oci_cli_docs/oci.html#cmdoption-realm-specific-endpoint
export OCI_REALM_SPECIFIC_SERVICE_ENDPOINT_TEMPLATE_ENABLED=true
oci model-deployment inference-result predict --model-deployment-id <model-deployment-ocid> --request-body '{"data": "data"}'