Creating an Endpoint

Create an endpoint for a custom or pretrained model on a hosting dedicated AI cluster in OCI Generative AI.

Note

For endpoint rules, see the documentation for the base model of the model that you want to add an endpoint to.
  • On the Endpoints list page, select Create endpoint. If you need help finding the list page, see Listing Endpoints.

    Endpoint Information

    1. Select a compartment to create the endpoint in. The default compartment is the same as the one for the list page, but you can select any compartment that you have permission to work in.
      Tip

      We recommend that you create the endpoint in the same compartment as the model.
    2. (Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you don't enter a name, the system generates a name that you can change later.
      The generated name has the format generativeaiendpoint<timestamp>. Example: generativeaiendpoint20250531235319
    3. (Optional) Enter a description for the endpoint.
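
The naming rule and the generated default name described above can be sketched as a small check. This is an illustration only, not the service's own validation: the function names are hypothetical, "letters" is assumed to mean ASCII letters, and the timestamp layout (YYYYMMDDhhmmss) is inferred from the single example shown above.

```python
import re
from datetime import datetime

# Rule as stated in the naming step: the first character is a letter or an
# underscore, followed by letters, numbers, hyphens, or underscores,
# for a total length of 1 to 255 characters.
# Assumption: "letters" means ASCII letters.
_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if the name satisfies the documented naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


def default_endpoint_name(now: datetime) -> str:
    """Build a name in the documented format generativeaiendpoint<timestamp>.

    Assumption: the timestamp is YYYYMMDDhhmmss, inferred from the example
    generativeaiendpoint20250531235319.
    """
    return "generativeaiendpoint" + now.strftime("%Y%m%d%H%M%S")
```

For example, `is_valid_endpoint_name("1endpoint")` is rejected because the name starts with a number, while `_my-endpoint1` passes.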

    Hosting configuration

    1. Select the compartment that hosts the model that you want to add an endpoint to.
    2. Select the model that you want to add an endpoint to. This model can be a custom model or a ready-to-use pretrained foundational model available in the region that you're working in.
    3. If the model that you selected has several versions, select a model version.
      For the ready-to-use pretrained foundational models, this field populates when you select the model.
    4. Select a hosting dedicated AI cluster by performing one of the following actions:
      • Select a Dedicated AI cluster from the list. If you created a cluster a few minutes ago, wait for that cluster to become active. Ensure that the base model that's associated with this cluster matches the base model for the model that you want to add an endpoint to.
      • Select Create new dedicated AI cluster and perform the following steps:
        1. (Optional) Enter a name and description.
        2. Select a Base model that matches the base model of the model that you want to host.
        3. Add one model replica to the endpoint. When you create a cluster, you need at least one unit for an endpoint. For an existing cluster, you can use that same unit to host new endpoints. Each instance hosts all the active endpoints. Increasing the instance count on a cluster increases the number of requests per minute (RPM) supported for all active endpoints hosted on that cluster.
        4. Read the commitment unit hours for the hosting dedicated AI cluster and select the checkbox to agree to the commitment.
        5. (Optional) Select Add tag and assign tags to this dedicated AI cluster. See Resource Tags.
        6. Select Create and wait for the cluster to become active.
        7. From the Dedicated AI cluster list, select the dedicated AI cluster that you created.
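
The two conditions for choosing an existing cluster (the cluster is active, and its base model matches the base model of the model to host) can be expressed as a simple filter. This is a minimal sketch over plain local data, not a call to any OCI API: the `Cluster` type, its field names, the state labels, and the base model names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical local record of a hosting dedicated AI cluster."""
    name: str
    base_model: str
    state: str  # illustrative labels such as "CREATING" or "ACTIVE"


def eligible_clusters(clusters, model_base_model):
    """Keep clusters that are active and whose base model matches the model's."""
    return [
        c for c in clusters
        if c.state == "ACTIVE" and c.base_model == model_base_model
    ]


clusters = [
    Cluster("cluster-a", "base-model-x", "ACTIVE"),
    Cluster("cluster-b", "base-model-x", "CREATING"),  # not active yet: wait
    Cluster("cluster-c", "base-model-y", "ACTIVE"),    # base model mismatch
]
candidates = eligible_clusters(clusters, "base-model-x")
```

Here only `cluster-a` remains a candidate; `cluster-b` becomes one after it reaches the active state.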

    Guardrails

    1. Select whether to enable the following guardrails.
      • Content moderation
        • Off: Don't apply content moderation. The output can include explicit content.
        • Block: Help identify content that needs moderation and apply content moderation.
        • Inform: Don't apply content moderation, but aim to inform the user if the model detects content that needs moderation.
      • Prompt injection (PI) protection
        • Off: Don't apply PI protection and allow unrestricted input.
        • Block: Help identify and protect against prompt injection.
        • Inform: Don't apply PI protection, but aim to inform the user if the model detects content that needs PI protection.
      • Personally identifiable information (PII) protection
        • Off: Don't apply PII protection. Instead, output content without data exposure restrictions.
        • Block: Help identify and protect PII, for example, by helping to remove personal data from responses.
        • Inform: Don't apply PII protection, but aim to inform the user if the model detects content that needs PII protection.
    2. (Optional) Select Add tag and assign tags to this endpoint. See Resource Tags.
    3. Select Create.
      You're directed to the endpoint details page where you can track the state of the endpoint.
    4. After the endpoint is active, select View in playground and start using the model from this endpoint.
  • Use the endpoint create command and required parameters to create an endpoint:

    oci generative-ai endpoint create 
    --model-id <model-OCID>
    --compartment-id <compartment-OCID> 
    --dedicated-ai-cluster-id <hosting-dedicated-AI-cluster-OCID> 
    [OPTIONS]

    For a complete list of parameters and values for CLI commands, see the CLI Command Reference.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Pretrained Foundational Models in Generative AI.
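
When the command is issued from a script, it can help to assemble the argument list in one place. The following sketch only builds the list using the three required parameters shown above; the function name and the OCID values are hypothetical placeholders, and any extra options are passed through unchecked, so confirm them against the CLI Command Reference.

```python
import shlex


def build_endpoint_create_command(model_id, compartment_id, cluster_id,
                                  extra_options=()):
    """Assemble the argument list for `oci generative-ai endpoint create`.

    For pretrained models, model_id can hold the model name instead of an
    OCID, as described in the note above.
    """
    required = {
        "--model-id": model_id,
        "--compartment-id": compartment_id,
        "--dedicated-ai-cluster-id": cluster_id,
    }
    command = ["oci", "generative-ai", "endpoint", "create"]
    for flag, value in required.items():
        if not value:
            raise ValueError(f"missing value for {flag}")
        command += [flag, value]
    # Extra options are appended as given; they are not validated here.
    command += list(extra_options)
    return command


# Hypothetical placeholder identifiers, for illustration only.
command = build_endpoint_create_command(
    "ocid1.example.model",
    "ocid1.example.compartment",
    "ocid1.example.cluster",
)
print(shlex.join(command))  # shell-quoted form, handy for logging
```

On a machine where the OCI CLI is installed and configured, the returned list can be handed to `subprocess.run(command, check=True)`.
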
  • Run the CreateEndpoint operation to create an endpoint.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Pretrained Foundational Models in Generative AI.
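
As a rough illustration of what a CreateEndpoint request carries, the sketch below assembles a JSON body from the same three required values used by the CLI command. The camelCase field names (`modelId`, `compartmentId`, `dedicatedAiClusterId`, `displayName`, `description`) are an assumption to verify against the API reference, the function name and identifier values are hypothetical, and request signing and sending (normally handled by an OCI SDK) are not shown.

```python
import json


def create_endpoint_body(model_id, compartment_id, cluster_id,
                         display_name=None, description=None):
    """Build a request body for CreateEndpoint as a plain dictionary.

    Assumption: field names follow the camelCase convention of the OCI REST
    API; check them against the API reference before use.
    """
    body = {
        "modelId": model_id,  # a model name is also accepted for pretrained models
        "compartmentId": compartment_id,
        "dedicatedAiClusterId": cluster_id,
    }
    # Optional fields are included only when a value is provided.
    if display_name is not None:
        body["displayName"] = display_name
    if description is not None:
        body["description"] = description
    return body


# Hypothetical placeholder identifiers, for illustration only.
payload = json.dumps(
    create_endpoint_body(
        "ocid1.example.model",
        "ocid1.example.compartment",
        "ocid1.example.cluster",
        display_name="my_endpoint",
    ),
    indent=2,
)
```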