Creating an Endpoint in Generative AI

Create an endpoint for a custom or pretrained model on a hosting dedicated AI cluster in OCI Generative AI.

  1. In the navigation bar of the Console, select a region with Generative AI, for example, US Midwest (Chicago) or UK South (London). See which models are offered in your region.
  2. Open the navigation menu and click Analytics & AI. Under AI Services, click Generative AI.
  3. Select the compartment that contains the custom model that you want to add an endpoint to.
  4. Perform one of the following actions:
    • To create an endpoint for a custom model with the model name and version pre-populated:
      1. Click Custom models.
      2. Click the name of the custom model that you want to add an endpoint for.
      3. Note the base model of the custom model, for example, cohere.command-r-plus, so that you can match it to a cluster in a later step.
      4. Under Resources, click Endpoints.
      5. Click Create endpoint.
    • To create an endpoint for a ready-to-use pretrained foundational model or a custom model:
      1. Click Endpoints.
      2. Click Create endpoint.
  5. (Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you don't enter a name, the system generates a name that you can change later.

    The generated name has the format generativeaiendpoint<timestamp>, for example, generativeaiendpoint20240531235319.

  6. (Optional) To moderate the model's generated responses, turn on the Content moderation toggle. This option is off by default. Learn about Content Moderation. You can also turn on this feature later by editing the endpoint.
  7. If the model name and version aren't already selected, select the model name and version that you want to add an endpoint for.
    Tip

    • If the model is in a different compartment than the current compartment, click Change compartment and choose the compartment that hosts the model. We recommend that you create the endpoint in the same compartment as the model.
    • If the custom model that you're looking for isn't listed, click Cancel. Then under Generative AI, click Custom models and ensure that the custom model is in an active state.
  8. Select a hosting dedicated AI cluster by performing one of the following actions:
    • If you already have a cluster, select a Dedicated AI cluster from the drop-down list. If you just created a cluster, wait for that cluster to become active. Ensure that the base model associated with this cluster matches the base model of the custom model.
    • To create a cluster, in the Dedicated AI cluster drop-down list, click Create new dedicated AI cluster and perform the following steps:
      1. (Optional) Enter a name and description.
      2. Select a Base model that matches the base model of the model that you want to host.
      3. Add 1 model replica to the endpoint. When you create a cluster, you need at least one unit for an endpoint. For an existing cluster, you can use that same unit to host new endpoints. Each instance hosts all the active endpoints. Increasing the instance count on a cluster increases the number of requests per minute (RPM) supported for all active endpoints hosted on that cluster.
      4. Read the commitment unit hours for the hosting dedicated AI cluster and select the checkbox to agree to the commitment.
      5. Click Create and wait for the cluster to become active.
      6. From the Dedicated AI cluster drop-down list, click the dedicated AI cluster that you created.
  9. (Optional) Click Show advanced options and assign tags to the endpoint.
  10. Click Create endpoint.
    You're directed to the endpoint details page where you can track the state of the endpoint.
  11. After the endpoint is active, click View in playground and start using the model from this endpoint.
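
The naming rule in step 5 can be checked before you type a name into the Console. The following is a minimal Python sketch, not part of any OCI SDK: the function names are hypothetical, "letters" is assumed to mean ASCII letters, and the timestamp layout (year, month, day, hour, minute, second) is inferred only from the example generativeaiendpoint20240531235319.

```python
import re
from datetime import datetime

# Rule from step 5: the first character is a letter or underscore, the rest
# are letters, numbers, hyphens, or underscores; total length is 1 to 255.
# Assumption: "letters" means ASCII letters only.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if the name satisfies the rule described in step 5."""
    return NAME_PATTERN.fullmatch(name) is not None


def generated_style_name(now: datetime) -> str:
    """Build a name shaped like the system-generated default.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from the
    documented example; the service's actual logic is not shown here.
    """
    return "generativeaiendpoint" + now.strftime("%Y%m%d%H%M%S")
```

For example, is_valid_endpoint_name("_my-endpoint1") passes the check, while a name that starts with a digit, an empty string, or a name longer than 255 characters does not.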