Great for experimenting and evaluating the models.
Pay as you go for transactions. See the following note for details.
Note
With on-demand inferencing you pay as you go for the following character
lengths:
Chat: prompt length (in characters) + response length (in characters)
Text generation: prompt length (in characters) + response length (in characters)
Summarization: prompt length (in characters) + response length (in characters)
Text Embeddings: input length (in characters)
On the Pricing page, 1 character is counted as 1 transaction.
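The counting rules in this note can be sketched in a few lines of Python. This is only an illustration: the function name and the mode labels are hypothetical and aren't part of any OCI SDK, and it assumes that Python's `len()` counts characters the same way the service does.

```python
# Illustrative sketch of the on-demand counting rules: 1 character = 1 transaction.
# The function name and mode labels are hypothetical, not part of any OCI SDK.

def billable_transactions(mode: str, input_text: str, response_text: str = "") -> int:
    """Return the number of on-demand transactions for a single call."""
    if mode in ("chat", "text_generation", "summarization"):
        # Both the prompt and the response characters are counted.
        return len(input_text) + len(response_text)
    if mode == "text_embeddings":
        # Only the input characters are counted.
        return len(input_text)
    raise ValueError(f"unknown mode: {mode}")
```

For example, a chat call with a 220-character prompt and a 2,205-character response counts as 2,425 transactions, while embedding a 1,000-character document counts as 1,000 transactions.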
If you're hosting foundational models or fine-tuning them on dedicated AI clusters, you're charged by the unit hour rather than by transaction. In this case, see Paying for Dedicated AI Clusters to learn how to calculate the dedicated AI cluster costs.
Matching Models to On-Demand Prices
See the following tables to match a foundational model to its product name on the pricing page. The pricing page lists the price for 10,000 on-demand transactions when using the playground, API, or CLI for inferencing. Then, review the examples in this section to learn how to calculate the cost based on the number of input and output characters.
Paul calls the meta.llama-3.3-70b-instruct model with the following prompt, which is 220 characters long:
Generate a product pitch for a USB connected compact microphone that can record
surround sound. The microphone is most useful in recording music or
conversations. The microphone can also be useful for recording podcasts.
The response from the model is 2,205 characters long. Paul wants to know the cost for this call. Here are the steps to calculate the cost.
Calculate the prompt + response length (in characters).
Let's add up the prompt length (220 characters) and the model response length
(2,205 characters).
220 + 2,205 = 2,425 characters
10,000 transactions = 10,000 characters, so 1 transaction = 1 character
2,425 characters = 2,425 transactions
Go to AI Pricing and under OCI Generative AI, for Oracle Cloud Infrastructure Generative AI - Large Meta, find the <Large-Meta-unit-price>.
Paul uses the meta.llama-3.3-70b-instruct model, which matches the product Oracle Cloud Infrastructure
Generative AI - Large Meta on the AI Pricing page for Generative AI.
Calculate the price for 2,425 characters.
price = (2,425 transactions) / (10,000 transactions) x $<Large-Meta-unit-price>
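The same calculation can be written as a small Python sketch. The function name is hypothetical, and the unit price used here is a made-up placeholder, not an actual OCI price. Substitute the <Large-Meta-unit-price> value listed on the AI Pricing page.

```python
# Sketch of the on-demand price formula; names are illustrative only.

def on_demand_price(prompt_chars: int, response_chars: int,
                    unit_price_per_10k: float) -> float:
    """Price of one call, given the listed price for 10,000 transactions."""
    transactions = prompt_chars + response_chars  # 1 character = 1 transaction
    return transactions / 10_000 * unit_price_per_10k

# Placeholder value for illustration only; NOT a real OCI price.
HYPOTHETICAL_UNIT_PRICE = 0.02

# Paul's call: 220-character prompt, 2,205-character response.
print(on_demand_price(220, 2_205, HYPOTHETICAL_UNIT_PRICE))
```

Floating-point arithmetic is enough for an estimate. For exact billing-style arithmetic, Python's `decimal.Decimal` could be used instead.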
Tip
In addition to calculating the price, you can estimate the cost by selecting the AI and Machine Learning category and loading the cost estimator for OCI
Generative AI.
Text Embeddings Example
Gina is converting customer contracts into embeddings for a new semantic search application. On average, Gina ingests 16 documents every hour. Each document is about 1,000 characters long. Gina wants to get an estimate of the monthly bill for generating those embeddings. Here are the steps to calculate the cost.
Calculate the input length (in characters).
Let's add up the input character length for each hour.
input character length for 16 documents = 16 x 1,000 = 16,000 characters per hour
Go to AI Pricing and under OCI Generative AI, for Oracle Cloud Infrastructure Generative AI - Embed Cohere, find the <Embed-Cohere-unit-price>.
Gina uses the cohere.embed model, which matches the product Oracle Cloud Infrastructure
Generative AI - Embed Cohere on the AI Pricing page for Generative AI.
Calculate the number of transactions per hour.
Gina ingests 16,000 characters per hour. Prices are listed for 10,000 transactions.
10,000 transactions = 10,000 characters, so 1 transaction = 1 character
16,000 characters = 16,000 transactions
Find the hourly price for the 16,000 characters that Gina ingests hourly.