Invoking a Model Deployment using a Private Endpoint

A model deployment configured with a private endpoint is only accessible through a private network. It can't be accessed through a public endpoint.

For more information on creating a private endpoint, see Creating a Private Endpoint.
Note

This feature is only available in the OC1 realm. For other realms, create a service request.

Complete the following steps in the Console to ensure that the application can access the private endpoint:

  1. Configure the virtual cloud network (VCN) and subnet.

    The private endpoint connection is at the VCN level, so even if the VCN has many subnets, you need to create only one private endpoint for that VCN. Ensure that the security rules meet your requirements.

  2. (Optional) Configure network security groups (NSGs).
  3. Ensure that the subnet gives access to the private endpoint resource by setting up an ingress security rule, as shown in the sketch after this list.
  4. Ensure that the subnet has available IP addresses.

    If no IP addresses are available in the specified subnet, then the work request for creating the private endpoint fails. For more information, see Private Endpoint Creation Failure.

    When the private endpoint resource is reachable from the application, predict requests can be sent to the model deployment through the private endpoint URL.
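
As a minimal sketch of the ingress rule in step 3, the rule can be added to the security list of the private endpoint's subnet with the OCI CLI. The source CIDR 10.0.1.0/24 (the application's subnet), port 443 (assuming the model deployment is invoked over HTTPS), and the security list OCID are placeholders for illustration. Save the rule as ingress-rules.json:

[
  {
    "protocol": "6",
    "source": "10.0.1.0/24",
    "sourceType": "CIDR_BLOCK",
    "isStateless": false,
    "tcpOptions": {"destinationPortRange": {"min": 443, "max": 443}}
  }
]

Then apply it to the security list. Note that the update command replaces the existing ingress rules, so include any rules you want to keep in the file:

oci network security-list update --security-list-id <security-list-ocid> --ingress-security-rules file://ingress-rules.json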

To invoke a model deployment through a private endpoint from the CLI, use the following example commands with the required parameters. If a notebook session instance is used to access a private model deployment, create it with the custom networking network type so that it resides on the same VCN and subnet as the private endpoint resource. For more information, see Creating a Notebook Session.
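
Before invoking the model deployment, you can optionally check from that instance that the private endpoint hostname resolves to a private IP address in the configured subnet. The hostname below is a placeholder for the host part of your private endpoint URL:

nslookup <private-endpoint-fqdn>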

Run the following commands from a notebook session instance or a Cloud Shell instance that has access to the same VCN and subnet as the private endpoint resource:

oci model-deployment inference-result --endpoint <private-endpoint-url> predict --model-deployment-id <model-deployment-ocid> --request-body '{"data": "data"}'
oci model-deployment inference-result --endpoint <private-endpoint-url> predict-with-response-stream --file '-' --model-deployment-id <model-deployment-ocid> --request-body '{"data": "data"}'
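
As a usage sketch with the same placeholders, the streamed variant writes the response to stdout because of --file '-', so it can be redirected to a file:

oci model-deployment inference-result --endpoint <private-endpoint-url> predict-with-response-stream --file '-' --model-deployment-id <model-deployment-ocid> --request-body '{"data": "data"}' > response.txt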