Creating an Endpoint in Generative AI

Create an endpoint for a custom or pretrained model on a hosting dedicated AI cluster in OCI Generative AI.

For rules about creating endpoints for the models hosted on clusters, see Adding Endpoints to Hosting Clusters.

In the navigation bar of the Console, select a region with Generative AI, for example, US Midwest (Chicago) or UK South (London). See which models are offered in your region.
Open the navigation menu and select Analytics & AI. Under AI Services, select Generative AI.
Select the compartment that contains the custom model that you want to add an endpoint to.
Perform one of the following actions:
- To create an endpoint for a custom model with the model name and version pre-populated:
  1. Select Custom models.
  2. Select the name of the custom model that you want to add an endpoint for.
  3. Find the foundational base model for the custom model. You select the base model when you match the model to a cluster in the following steps.
  4. Under Resources, select Endpoints.
  5. Select Create endpoint.
- To create an endpoint for a ready-to-use pretrained foundational model or a custom model:
  1. Select Endpoints.
  2. Select Create endpoint.
(Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you don't enter a name, the system generates a name that you can change later.

The generated name has the format generativeaiendpoint<timestamp>, for example, generativeaiendpoint20240531235319.
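The naming rules above can be checked before you submit the form. The following Python sketch is illustrative only: the function names are hypothetical, the ASCII-only character set is an assumption (the rules don't say whether non-ASCII letters are accepted), and the YYYYMMDDHHMMSS timestamp layout is inferred from the sample name rather than documented.

```python
import re
from datetime import datetime

# A letter or underscore first, then up to 254 more letters, digits,
# hyphens, or underscores (1 to 255 characters in total).
# Assumption: only ASCII letters and digits are allowed.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if the name follows the documented naming rules."""
    return NAME_PATTERN.fullmatch(name) is not None


def default_endpoint_name(now: datetime) -> str:
    """Build a name shaped like the system-generated one.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from the
    sample name generativeaiendpoint20240531235319.
    """
    return "generativeaiendpoint" + now.strftime("%Y%m%d%H%M%S")


if __name__ == "__main__":
    print(is_valid_endpoint_name("my-endpoint_1"))
    print(is_valid_endpoint_name("1-endpoint"))
    print(default_endpoint_name(datetime(2024, 5, 31, 23, 53, 19)))
```

A name that starts with a digit or a hyphen fails the check, as does an empty string or a name longer than 255 characters.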
If the model name and version aren't already selected, select the model name and version that you want to create an endpoint for.
Tip
- If the model is in a different compartment than the current compartment, select Change compartment and select the compartment that hosts the model. We recommend that you create the endpoint in the same compartment as the model.
- If the custom model that you're looking for isn't listed, select Cancel. Then under Generative AI, select Custom models and ensure that the custom model is in an active state.
Select a hosting dedicated AI cluster by performing one of the following actions:
- If you already have a cluster, select a Dedicated AI cluster from the drop-down list. If you just created a cluster, wait for that cluster to become active. Ensure that the base model that's associated with this cluster matches the base model of the custom model.
- To create a cluster, in the Dedicated AI cluster drop-down list, select Create new dedicated AI cluster and perform the following steps:
  1. (Optional) Enter a name and description.
  2. Select a Base model that matches the base model of the model that you want to host.
  3. Add 1 model replica to the endpoint. A new cluster needs at least one unit to host an endpoint, and on an existing cluster you can use that same unit to host new endpoints. Each instance hosts all the active endpoints, so increasing the instance count on a cluster increases the number of requests per minute (RPM) supported for all active endpoints hosted on that cluster.
  4. Read the commitment unit hours for the hosting dedicated AI cluster and select the checkbox to agree to the commitment.
  5. Select Create and wait for the cluster to become active.
  6. From the Dedicated AI cluster drop-down list, select the dedicated AI cluster that you created.
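The matching rule in this step (the cluster must be active, and its base model must match the base model of the model being hosted) can be expressed as a small filter. This is a minimal sketch and not the OCI API: the Cluster class, its field names, the "ACTIVE" and "CREATING" state strings, and the base model names are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Cluster:
    """Hypothetical stand-in for a hosting dedicated AI cluster."""
    name: str
    base_model: str
    state: str  # assumed state strings, for example "ACTIVE" or "CREATING"


def eligible_clusters(clusters: List[Cluster], model_base_model: str) -> List[Cluster]:
    """Keep clusters that are active and share the model's base model."""
    return [
        c for c in clusters
        if c.state == "ACTIVE" and c.base_model == model_base_model
    ]


if __name__ == "__main__":
    clusters = [
        Cluster("cluster-1", "base-model-a", "ACTIVE"),
        Cluster("cluster-2", "base-model-b", "ACTIVE"),
        Cluster("cluster-3", "base-model-a", "CREATING"),
    ]
    for cluster in eligible_clusters(clusters, "base-model-a"):
        print(cluster.name)
```

In this example only cluster-1 qualifies: cluster-2 is associated with a different base model, and cluster-3 isn't active yet.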
Select whether to enable the following guardrails.
- Content moderation
  - Off: Don't apply content moderation. The output can include explicit content.
  - Block: Help identify content that needs moderation and apply content moderation.
  - Inform: Don't apply content moderation, but aim to inform the user if the model detects content that needs moderation.
- Prompt injection (PI) protection
  - Off: Don't apply PI protection and allow unrestricted input.
  - Block: Help identify and protect against prompt injection.
  - Inform: Don't apply PI protection, but aim to inform the user if the model detects content that needs PI protection.
- Personally identifiable information (PII) protection
  - Off: Don't apply PII protection. Instead, output content without data exposure restrictions.
  - Block: Help identify and protect PII, for example, by helping remove personal data from responses.
  - Inform: Don't apply PII protection, but aim to inform the user if the model detects content that needs PII protection.
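The three guardrails share the same three modes, which makes them easy to model as one enumeration plus a record of the choice made for each guardrail. The sketch below is only a way to reason about the options. The class, field, and function names are hypothetical and don't correspond to the OCI API or SDK.

```python
from dataclasses import dataclass
from enum import Enum


class GuardrailMode(Enum):
    """The three modes offered for every guardrail."""
    OFF = "Off"
    BLOCK = "Block"
    INFORM = "Inform"


@dataclass(frozen=True)
class Guardrails:
    """Hypothetical record of the mode chosen for each guardrail."""
    content_moderation: GuardrailMode
    prompt_injection: GuardrailMode
    pii: GuardrailMode


def applies_protection(mode: GuardrailMode) -> bool:
    """Per the descriptions above, only Block applies the protection."""
    return mode is GuardrailMode.BLOCK


def informs_user(mode: GuardrailMode) -> bool:
    """Inform doesn't apply the protection but aims to flag the content."""
    return mode is GuardrailMode.INFORM


if __name__ == "__main__":
    choice = Guardrails(
        content_moderation=GuardrailMode.BLOCK,
        prompt_injection=GuardrailMode.INFORM,
        pii=GuardrailMode.OFF,
    )
    for name in ("content_moderation", "prompt_injection", "pii"):
        mode = getattr(choice, name)
        print(name, mode.value, applies_protection(mode), informs_user(mode))
```

The fields have no default values on purpose, because the steps above don't state which mode is preselected in the Console.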
(Optional) Select Show advanced options and assign tags to the endpoint.
Select Create endpoint.
You're directed to the endpoint details page where you can track the state of the endpoint.
After the endpoint is active, select View in playground and start using the model from this endpoint.
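If you track the endpoint state from a script instead of the details page, the waiting step amounts to a polling loop. The sketch below is generic and makes several assumptions: get_state is a hypothetical callable that you supply (for example, a wrapper around whatever client you use), the state strings "ACTIVE" and "FAILED" are assumed names, and the interval and timeout values are arbitrary.

```python
import time
from typing import Callable


def wait_until_active(
    get_state: Callable[[], str],
    interval_seconds: float = 10.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state until the endpoint is active or has failed.

    Returns True when the state is "ACTIVE" and False when it is "FAILED"
    (both state names are assumptions). Raises TimeoutError if neither
    state is reached within roughly timeout_seconds of waiting.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state == "ACTIVE":
            return True
        if state == "FAILED":
            return False
        if waited >= timeout_seconds:
            raise TimeoutError(f"endpoint still in state {state!r}")
        sleep(interval_seconds)
        waited += interval_seconds


if __name__ == "__main__":
    # Simulated state sequence standing in for real status checks.
    states = iter(["CREATING", "CREATING", "ACTIVE"])
    print(wait_until_active(lambda: next(states), sleep=lambda _: None))
```

Passing the sleep function as a parameter keeps the loop easy to test without real delays.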
