Managing Imported Models (New)

In addition to using the hosted pretrained models in OCI Generative AI, you can import supported open source and third-party models (for example, from Hugging Face) into OCI Generative AI, host them, create endpoints, and use them like any other model.

Hugging Face Prerequisites

Before you import a model directly from Hugging Face:

Decide which supported model from Hugging Face to import and note its recommended dedicated AI cluster unit size.
To access and use some models, you need a Hugging Face token, especially the more recent and gated versions such as Llama 3 and Llama 3.1 For these models, generate an access token from your Hugging Face account settings under Access Tokens. Ensure it has the necessary permissions (at least "read" access).

Object Storage Prerequisites

Before you import a model from an Object Storage bucket:

If you're not an OCI admin, ask one to give you IAM permission to manage Object Storage in your compartment:

allow group <your-group-name> to manage object-family in compartment <compartment-with-bucket>

Decide on a model that works with the /v1/chat/completions endpoint—only these models are supported.
Ensure the model supports only one of these capabilities:
- TEXT_TO_TEXT: text in, text out
- IMAGE_TEXT_TO_TEXT: image or text or both in, text out
- EMBEDDING: text in, vector embeddings out
- RERANK: query and candidate documents in, relevance scores and a reordered list out
Save model artifacts in an Object Storage bucket.
Important: The configuration file must be called config.json for a successful import, similar to most Hugging Face models.

Resource Request & Pricing

To reach an imported model, you create an endpoint for that model on a dedicated AI cluster. Use the following table to request for dedicated AI cluster resources before you import a model.

Dedicated AI Cluster Unit Sizes for Imported Models
Dedicated AI Cluster Unit Size	Limit Name	Request Required Units	AI Unit Count
A10_X1	`dedicated-unit-a10-count`	1	1.77
A10_X2	`dedicated-unit-a10-count`	2	3.54
A10_X4	`dedicated-unit-a10-count`	4	7.08
A100_40G_X1	`dedicated-unit-a100-40g-count`	2	2.70
A100_40G_X2	`dedicated-unit-a100-40g-count`	2	5.40
A100_40G_X4	`dedicated-unit-a100-40g-count`	4	10.8
A100_40G_X8	`dedicated-unit-a100-40g-count`	8	21.60
A100_80G_X1	`dedicated-unit-a100-80g-count`	1	3.24
A100_80G_X2	`dedicated-unit-a100-80g-count`	2	6.48
A100_80G_X4	`dedicated-unit-a100-80g-count`	4	12.96
A100_80G_X8	`dedicated-unit-a100-80g-count`	8	25.92
H100_X1	`dedicated-unit-h100-count`	1	6.01
H100_X2	`dedicated-unit-h100-count`	2	12.02
H100_X4	`dedicated-unit-h100-count`	4	24.04
H100_X8	`dedicated-unit-h100-count`	8	48.08
H200_X1	`dedicated-unit-h200-coun`t	1	6.22
H200_X2	`dedicated-unit-h200-coun`t	2	12.44
H200_X4	`dedicated-unit-h200-coun`t	4	24.88
H200_X8	`dedicated-unit-h200-coun`t	8	49.76

Tip

To request the resources for the recommended dedicated AI cluster unit size, see requesting a resource limit.

To calculate the price, multiply the price for AI Unit Per Hour for Oracle Cloud Infrastructure Generative AI - Model Import on the Pricing Page Information page to the AI unit count on this page.

Tasks for Importing a Model

Import the model using one of these options:
- From Hugging Face
- From an Object Storage bucket
Create a hosting dedicated AI cluster for the imported model with a recommended unit shape.
Create an endpoint.
Call the model through OCI Generative AI API, SDK, or use the model in the playground.

Managing the Imported Models

After performing the prerequisites and importing a model, you can perform the following tasks on the imported models: