Model Deployments

Troubleshoot your model deployments.

Debugging a Model Deployment Failure

After creating a new deployment or updating an existing deployment, you might see a failure. These steps show how to debug the issue:

  1. On your project's home page, select Model Deployments.
  2. Select the model deployment name, or select the Actions menu for the model deployment and select View Details. Next, check the work requests.
  3. Under Resources, select Work Request.

    The work requests appear at the bottom of the page.

  4. On the Work Requests Information page, select Log Messages.
  5. If any failures occur in the creation steps, under Resources, select Error Messages.
  6. If the work request shows success, then review the OCI predict logs to identify any errors.

    Logs are attached to the model deployment when it's created.

  7. If logs are attached, select the predict log name to see the log.
  8. Select Explore with Log Search.
  9. Change the filter time to widen the search period. A scripted alternative for inspecting work requests and searching the predict log with the OCI SDK is sketched after these steps.
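
If you prefer to script these checks, the same information is available through the OCI Python SDK. The following is a minimal sketch, assuming a configured ~/.oci/config profile; the work request, compartment, log group, and predict log OCIDs are placeholders you must replace with your own values.

    import oci
    from datetime import datetime, timedelta, timezone

    config = oci.config.from_file()  # default ~/.oci/config profile
    ds_client = oci.data_science.DataScienceClient(config)

    # Placeholder OCID of the model deployment work request (replace with yours).
    work_request_id = "ocid1.datascienceworkrequest.oc1..<unique_id>"

    # Work request log messages (step 4).
    for entry in ds_client.list_work_request_logs(work_request_id).data:
        print(entry.timestamp, entry.message)

    # Work request error messages (step 5).
    for err in ds_client.list_work_request_errors(work_request_id).data:
        print(err.timestamp, err.message)

    # Search the attached predict log over a wider time window (steps 6-9).
    search_client = oci.loggingsearch.LogSearchClient(config)
    now = datetime.now(timezone.utc)
    details = oci.loggingsearch.models.SearchLogsDetails(
        time_start=now - timedelta(hours=6),
        time_end=now,
        # Placeholder OCIDs for the compartment, log group, and predict log.
        search_query='search "<compartment_ocid>/<log_group_ocid>/<predict_log_ocid>"',
    )
    for result in search_client.search_logs(details, limit=50).data.results:
        print(result.data)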

Conda Environment Path isn't Accessible

Ensure that the conda environment path is valid, and that you have configured the appropriate policy for a published conda environment. The conda environment path must remain valid and accessible throughout the lifecycle of the model deployment to ensure availability and proper functioning of the deployed model.
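
One quick way to confirm the path is still reachable is to issue a HEAD request against the underlying Object Storage object. The sketch below is illustrative only, assuming a published conda path of the form oci://<bucket>@<namespace>/<object_path> and the OCI Python SDK; the check_conda_path helper is a hypothetical name, not part of any Oracle library.

    import oci

    def check_conda_path(conda_uri):
        """Illustrative check that an oci://<bucket>@<namespace>/<object> path resolves."""
        # Parse oci://<bucket>@<namespace>/<object_path> (simple split, not exhaustive).
        bucket, rest = conda_uri[len("oci://"):].split("@", 1)
        namespace, object_name = rest.split("/", 1)

        client = oci.object_storage.ObjectStorageClient(oci.config.from_file())
        # Raises oci.exceptions.ServiceError (404/403) if the object is missing or
        # the caller lacks read permission on the bucket.
        client.head_object(namespace, bucket, object_name)
        print("Conda environment object is reachable:", object_name)

    # Placeholder path; replace with the published conda environment path you configured.
    check_conda_path("oci://<bucket_name>@<namespace>/conda_environments/<path_to_env>")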

Error Occurred when Starting the Web Server

This error generally occurs when the scoring code has issues or is missing required dependencies. Enable the model deployment predict logs to help you debug the error.
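
Before redeploying, it can also help to exercise the model artifact locally in the same conda environment. The snippet below is a rough smoke test, assuming the standard score.py contract with load_model() and predict(); the artifact path and sample payload are placeholders.

    import json
    import sys

    # Placeholder path to the unzipped model artifact directory that contains score.py.
    sys.path.insert(0, "/path/to/model_artifact")

    import score  # fails fast here if a dependency is missing from the conda environment

    model = score.load_model()
    sample = {"data": [[1.0, 2.0, 3.0]]}  # replace with input your model actually expects
    print(json.dumps(score.predict(sample, model), indent=2))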

Failure When Invoking a Model Deployment

When a model deployment is in an active lifecycleState, the predict or streaming endpoint can be invoked. The prediction response can return a failure for many reasons. Use these suggestions to try to resolve these errors:

  1. Ensure that the input passed in the request is in a valid JSON format and matches the input expected by the model (a minimal invocation sketch follows this list).

  2. Review the attached access logs for errors.

  3. Ensure that the user has the correct access rights.

  4. Ensure that the score.py file doesn't contain errors.

  5. If predictions return different results (sometimes succeeding, sometimes failing) for the same input, the allocated resources might not be enough to serve the model prediction. Edit the model deployment to increase the load balancer bandwidth and the Compute core count so that it can serve more requests in parallel.
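
As a baseline for checking items 1 to 3, the request below invokes the predict endpoint with a request-signing client. It's a minimal sketch, assuming API-key authentication from ~/.oci/config; the endpoint URI and payload are placeholders, and resource principal or other signers work similarly.

    import oci
    import requests

    config = oci.config.from_file()  # default ~/.oci/config profile
    signer = oci.signer.Signer(
        tenancy=config["tenancy"],
        user=config["user"],
        fingerprint=config["fingerprint"],
        private_key_file_location=config["key_file"],
    )

    # Placeholder URI copied from the model deployment details page.
    endpoint = "https://modeldeployment.<region>.oci.customer-oci.com/<model_deployment_ocid>/predict"
    payload = {"data": [[1.0, 2.0, 3.0]]}  # must be valid JSON that the model expects

    response = requests.post(endpoint, json=payload, auth=signer, timeout=30)
    print(response.status_code, response.text)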

Too Many Requests (Status 429)

If you're getting this error when calling the inference endpoint, it means the requests are getting throttled.

The solution depends on the error type:

Load Balancer bandwidth limit exceeded
Edit the Model Deployment to increase its Load Balancer bandwidth. You can estimate the bandwidth using the expected number of requests per second and the combined size of the request and response payload per request.
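
For example, a rough back-of-the-envelope estimate might look like the following; the numbers and conversion are illustrative assumptions, not an official sizing formula.

    # Illustrative sizing arithmetic; the numbers are assumptions, not guidance.
    requests_per_second = 50
    payload_kb = 25  # combined request + response size per call, in KB
    mbps = requests_per_second * payload_kb * 8 / 1000  # KB/s -> Mbps (approximate)
    print(f"Estimated load balancer bandwidth: {mbps:.0f} Mbps")  # ~10 Mbps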
Tenancy request-rate limit exceeded
Each prediction endpoint allows a specific number of requests per time interval (minutes or seconds) per tenant by default. For more information, see the Invoking a Model Deployment documentation. Open a support ticket from the OCI Console to request a limit increase.
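
While waiting for a limit increase, a common client-side mitigation is to retry throttled calls with exponential backoff. The helper below is a sketch under that assumption; predict_with_backoff is a hypothetical name, and the endpoint, payload, and signer are the same placeholders used in the invocation example earlier.

    import time
    import requests

    def predict_with_backoff(endpoint, payload, signer, max_retries=5):
        """Hypothetical helper: retry HTTP 429 responses with exponential backoff."""
        response = None
        for attempt in range(max_retries):
            response = requests.post(endpoint, json=payload, auth=signer, timeout=30)
            if response.status_code != 429:
                return response
            time.sleep(2 ** attempt)  # wait 1s, 2s, 4s, ... between retries
        return response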