Creating an Endpoint

Create an endpoint for a custom, pretrained, or imported model on a hosting dedicated AI cluster in OCI Generative AI.

Important

To add a model to a private endpoint, first create a private endpoint and then return to this page for steps to attach the model.

Private endpoints support pretrained and custom models only. Imported models aren't supported.

  • On the Endpoints list page, select Create endpoint. If you need help finding the list page, see Listing Endpoints.

    Endpoint Information

    1. Select a compartment to create the endpoint in. The default is the compartment that's selected on the list page, but you can select any compartment that you have permission to work in.
      Tip

      We recommend that you create the endpoint in the same compartment as the model.
    2. (Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you don't enter a name, the system generates a name that you can change later.
      The generated name has the format generativeaiendpoint<timestamp>. Example: generativeaiendpoint20250531235319
    3. (Optional) Enter a description for the endpoint.
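
The naming rule in step 2 can be checked before you submit a name. The following is a minimal sketch, not part of the service: the helper names are hypothetical, it assumes that "letters" means ASCII letters, and it assumes that the generated timestamp has the form YYYYMMDDHHMMSS, which is inferred only from the example name above.

```python
import re
from datetime import datetime

# Rule from step 2: start with a letter or underscore, then letters, numbers,
# hyphens, or underscores; total length 1 to 255 characters.
# Assumption: "letters" is read as ASCII letters only.
ENDPOINT_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if the name satisfies the rule described in step 2."""
    return ENDPOINT_NAME_RE.fullmatch(name) is not None


def generated_style_name(now: datetime) -> str:
    """Build a name shaped like the system-generated example.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from
    generativeaiendpoint20250531235319.
    """
    return "generativeaiendpoint" + now.strftime("%Y%m%d%H%M%S")
```

For example, is_valid_endpoint_name("my_endpoint-01") returns True, and a name that starts with a digit, such as "1endpoint", is rejected.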

    Hosting configuration

    1. Select the compartment that hosts the model that you want to add an endpoint to.
    2. Select the model that you want to add an endpoint to. This model can be a custom model, imported model, or a ready-to-use pretrained foundational model available in the region that you're working in.
    3. If the model that you selected has several versions, select a model version.
      For the ready-to-use pretrained foundational models, this field populates when you select the model.
    4. Select a hosting dedicated AI cluster by performing one of the following actions:
      • Select a Dedicated AI cluster from the list. If you created a cluster a few minutes ago, wait for that cluster to become active.
      • Select Create new dedicated AI cluster and perform the following steps:
        1. (Optional) Enter a name and description.
        2. For Base model, select one of the following:
          • The pretrained foundational model that you're hosting.
          • If you're hosting a custom model that was fine-tuned from a foundational model, select the foundational (base) model that it was fine-tuned from.
          • If using an imported model, select that imported model.
        3. If you selected an imported model, select a recommended Unit size based on this guide.
        4. For Model replica, specify at least one unit, which is the minimum that an endpoint requires.
        5. Read the commitment unit hours for the hosting dedicated AI cluster and select the checkbox to agree to the commitment.
        6. (Optional) Select Add tag and assign tags to this dedicated AI cluster. See Resource Tags.
        7. Select Create and wait for the cluster to become active.
        8. From the Dedicated AI cluster list, select the dedicated AI cluster that you created.

    Networking resources (for pretrained and custom models)

    Select one of the following options:
    • Public endpoint
    • Private endpoint: If you select this option, select the compartment for the private endpoint, and then select the private endpoint that you want to use. (Not available for imported models.)
    By default, imported models have public endpoints.

    Guardrails (for pretrained and custom models)

    Note

    Guardrails aren't available for imported models.
    1. Select whether to enable the following guardrails.
      • Content moderation
        • Off: Don't apply content moderation. The model can output explicit content.
        • Block: Help identify and block content that needs moderation.
        • Inform: Don't apply content moderation, but aim to inform the user if the model detects content that needs moderation.
      • Prompt injection (PI) protection
        • Off: Don't apply PI protection and allow unrestricted input.
        • Block: Help identify and protect against prompt injection.
        • Inform: Don't apply PI protection, but aim to inform the user if the model detects content that needs PI protection.
      • Personally identifiable information (PII) protection
        • Off: Don't apply PII protection. Instead, output content without data exposure restrictions.
        • Block: Help identify and protect PII, for example by helping to remove personal data from responses.
        • Inform: Don't apply PII protection, but aim to inform the user if the model detects content that needs PII protection.
    2. (Optional) Select Add tag and assign tags to this endpoint. See Resource Tags.
    3. Select Create.
      You're directed to the endpoint details page where you can track the state of the endpoint.
    4. After the endpoint is active, select View in playground and start using the model from this endpoint.
  • Use the endpoint create command and required parameters to create an endpoint:

    oci generative-ai endpoint create 
    --model-id <model-OCID>
    --compartment-id <compartment-OCID> 
    --dedicated-ai-cluster-id <hosting-dedicated-AI-cluster-OCID> 
    [OPTIONS]

    For a complete list of parameters and values for CLI commands, see the CLI Command Reference.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Pretrained Foundational Models in Generative AI.
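
If you script the CLI call, the command above can be assembled as an argument list so that each value is passed as its own argument. The following is a minimal sketch: the function name is hypothetical, only the three required parameters shown above are used, and the placeholder values must be replaced with real OCIDs (or, for a pretrained model, the model name).

```python
import shlex


def build_create_endpoint_argv(model_id: str, compartment_id: str, cluster_id: str) -> list[str]:
    """Return the argv for `oci generative-ai endpoint create`.

    Only the three required parameters from the command above are included.
    Optional parameters are left to the CLI Command Reference.
    """
    return [
        "oci", "generative-ai", "endpoint", "create",
        "--model-id", model_id,
        "--compartment-id", compartment_id,
        "--dedicated-ai-cluster-id", cluster_id,
    ]


if __name__ == "__main__":
    # Placeholders copied from the command template; replace them before running.
    argv = build_create_endpoint_argv(
        "<model-OCID>",
        "<compartment-OCID>",
        "<hosting-dedicated-AI-cluster-OCID>",
    )
    # Print a shell-quoted form for review instead of executing it.
    print(shlex.join(argv))
```

To run the command rather than print it, the list can be passed to subprocess.run(argv, check=True), assuming that the OCI CLI is installed and configured.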
  • Run the CreateEndpoint operation to create an endpoint.

    Note

    For pretrained models, instead of an OCID, you can use the model name exactly as listed in the Console's playground. You can also find this OCI model name on the model's detail page in Pretrained Foundational Models in Generative AI.
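
As an illustration of what a CreateEndpoint request carries, the following sketch builds a JSON body from the same values that the Console and CLI ask for. The function name is hypothetical, and the field names (compartmentId, modelId, dedicatedAiClusterId, displayName, description) are an assumption based on the usual camelCase form of the CLI parameters; confirm them against the API reference before use. Request signing and the endpoint URL are out of scope here.

```python
import json
from typing import Optional


def create_endpoint_body(
    compartment_id: str,
    model_id: str,
    cluster_id: str,
    display_name: Optional[str] = None,
    description: Optional[str] = None,
) -> dict:
    """Build a request body for CreateEndpoint.

    Assumption: field names mirror the CLI parameters in camelCase.
    Optional fields are omitted when not provided, so the service can
    apply its defaults (for example, a generated name).
    """
    body = {
        "compartmentId": compartment_id,
        "modelId": model_id,  # OCID, or the model name for a pretrained model
        "dedicatedAiClusterId": cluster_id,
    }
    if display_name is not None:
        body["displayName"] = display_name
    if description is not None:
        body["description"] = description
    return body


if __name__ == "__main__":
    # Placeholder values; replace them with real identifiers.
    body = create_endpoint_body(
        "<compartment-OCID>",
        "<model-OCID>",
        "<hosting-dedicated-AI-cluster-OCID>",
        display_name="my_endpoint",
    )
    print(json.dumps(body, indent=2))
```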