Cohere Rerank 3.5
The cohere.rerank.v3-5 model takes a query and a list of texts and returns an ordered array in which each text is assigned a relevance score. The relevance score indicates how well each text matches the query, and the model uses it to rank the texts.
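To make the output shape concrete, the following is a minimal Python sketch of what "an ordered array with relevance scores" means. It does not call the model: the scores are made-up placeholder values, and the `RankedText` type and `order_by_relevance` helper are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RankedText:
    index: int              # position of the text in the original input list
    text: str
    relevance_score: float  # higher means a better match to the query


def order_by_relevance(texts, scores):
    """Pair each text with its score and sort from most to least relevant."""
    if len(texts) != len(scores):
        raise ValueError("texts and scores must have the same length")
    ranked = [RankedText(i, t, s) for i, (t, s) in enumerate(zip(texts, scores))]
    return sorted(ranked, key=lambda r: r.relevance_score, reverse=True)


texts = [
    "Dedicated AI clusters host models on reserved compute.",
    "Bananas are rich in potassium.",
    "An endpoint exposes a hosted model to API callers.",
]
# Placeholder scores, assumed for illustration only (not real model output).
placeholder_scores = [0.82, 0.03, 0.57]

ranked = order_by_relevance(texts, placeholder_scores)
for r in ranked:
    print(f"{r.relevance_score:.2f}  [{r.index}]  {r.text}")
```

Keeping the original index next to each score lets a caller map ranked results back to the input list.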
Available in These Commercial Regions
- Brazil East (Sao Paulo) (dedicated AI cluster only)
 - Germany Central (Frankfurt) (dedicated AI cluster only)
 - Japan Central (Osaka) (dedicated AI cluster only)
 - Saudi Arabia Central (Riyadh) (dedicated AI cluster only)
 - UK South (London) (dedicated AI cluster only)
 - US East (Ashburn) (dedicated AI cluster only)
 - US Midwest (Chicago) (dedicated AI cluster only)
 
Available in This Sovereign Region
This model is available in EU Sovereign Central (Frankfurt) (dedicated AI cluster only) through the following API endpoints:
- Inference API: https://inference.generativeai.eu-frankfurt-2.oci.oraclecloud.eu
- Management API: https://generativeai.eu-frankfurt-2.oci.oraclecloud.eu
In the API, use cohere.rerank.v3-5 for both the model name and the model OCID.
Learn about Oracle EU Sovereign Cloud.
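As a small illustration, the sovereign-region endpoints and model identifier listed in this section could be kept in one place in client code. This is a sketch only: the dictionary layout and the `endpoint_for` helper are hypothetical, and the URLs and model ID are copied from this page.

```python
from urllib.parse import urlsplit

# Endpoints for EU Sovereign Central (Frankfurt), copied from this page.
EU_SOVEREIGN_FRANKFURT = {
    "inference": "https://inference.generativeai.eu-frankfurt-2.oci.oraclecloud.eu",
    "management": "https://generativeai.eu-frankfurt-2.oci.oraclecloud.eu",
}

# Per this page, the same value serves as model name and model OCID here.
MODEL_ID = "cohere.rerank.v3-5"


def endpoint_for(api):
    """Return the base URL for 'inference' or 'management' (hypothetical helper)."""
    try:
        url = EU_SOVEREIGN_FRANKFURT[api]
    except KeyError:
        raise ValueError(f"unknown API kind: {api!r}") from None
    if urlsplit(url).scheme != "https":
        raise ValueError("endpoint must use HTTPS")
    return url


print(endpoint_for("inference"))
print(endpoint_for("management"))
```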
Access this Model
Key Features
- Dedicated mode only.
 - Not available on-demand or in the playground.
 - Access the model hosted on a cluster through the API or an SDK.
 - For dedicated mode, create an endpoint on a hosting dedicated AI cluster, host the model on the cluster, and then run the RerankText API or its relevant SDK.
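The dedicated-mode flow above ends with a RerankText call against the endpoint you created. The sketch below only assembles a request body as a plain dictionary and sends nothing. The field names (`input`, `documents`, `compartmentId`, `servingMode`, `topN`) are assumptions based on the usual shape of OCI Generative AI inference requests and should be checked against the RerankText API documentation; the OCIDs are placeholders and `build_rerank_request` is a hypothetical helper.

```python
import json


def build_rerank_request(query, documents, compartment_id, endpoint_id, top_n=None):
    """Assemble a RerankText-style request body for a dedicated endpoint.

    Field names are assumed, not confirmed by this page.
    """
    if not documents:
        raise ValueError("at least one document is required")
    body = {
        "input": query,
        "documents": list(documents),
        "compartmentId": compartment_id,
        # Dedicated mode: the request targets an endpoint on a hosting cluster.
        "servingMode": {"servingType": "DEDICATED", "endpointId": endpoint_id},
    }
    if top_n is not None:
        body["topN"] = top_n
    return body


request_body = build_rerank_request(
    query="How do I host a model on a dedicated AI cluster?",
    documents=[
        "Create an endpoint on a hosting dedicated AI cluster.",
        "On-demand mode isn't available for this model.",
    ],
    compartment_id="ocid1.compartment.oc1..example",          # placeholder OCID
    endpoint_id="ocid1.generativeaiendpoint.oc1..example",    # placeholder OCID
    top_n=1,
)
print(json.dumps(request_body, indent=2))
```

In practice, an SDK client would handle request signing and transport. Only the payload is sketched here.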
 
Dedicated AI Cluster for the Model
To reach a model through a dedicated AI cluster in any listed region, you must create an endpoint for that model on a dedicated AI cluster. For the cluster unit size that matches this model, see the following table.
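| Base Model | Fine-Tuning Cluster | Hosting Cluster | Pricing Page Information | Request Cluster Limit Increase |
|---|---|---|---|---|
| Cohere Rerank 3.5 | Not available for fine-tuning | Unit size: RERANK_COHERE | | Limit name: dedicated-unit-rerank-cohere-count; request increase by: 1 |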
If your tenancy doesn't have enough cluster limits to host the Cohere Rerank 3.5 model on a dedicated AI cluster, request an increase of 1 for the dedicated-unit-rerank-cohere-count limit.
Endpoint Rules for Clusters
- A dedicated AI cluster can hold up to 50 endpoints.
 - Use these endpoints to create aliases that all point either to the same base model or to the same version of a custom model, but not both types.
 - Several endpoints for the same model make it easy to assign them to different users or purposes.
 
| Hosting Cluster Unit Size | Endpoint Rules | 
|---|---|
| RERANK_COHERE | |

- To increase the call volume supported by a hosting cluster, increase its instance count by editing the dedicated AI cluster. See Updating a Dedicated AI Cluster.
- For more than 50 endpoints per cluster, request an increase for the limit, endpoint-per-dedicated-unit-count. See Requesting a Service Limit Increase and Service Limits for Generative AI.
Cluster Performance Benchmarks
Review the Cohere Rerank 3.5 cluster performance benchmarks for different scenarios.
Release and Retirement Dates
| Model | Release Date | On-Demand Retirement Date | Dedicated Mode Retirement Date | 
|---|---|---|---|
| cohere.rerank.v3-5 | 2025-05-14 | On-demand mode isn't available for this model. | At least 6 months after the release of the 1st replacement model. |
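The dedicated-mode retirement rule is relative: at least 6 months after the first replacement model is released. As a worked example, the Python sketch below computes the earliest possible retirement date from a replacement release date. No replacement date is given on this page, so the date used is purely hypothetical, and `add_months` and `earliest_dedicated_retirement` are illustrative helper names.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Add calendar months, clamping the day to the end of the target month."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def earliest_dedicated_retirement(replacement_release):
    """Earliest retirement: 6 months after the first replacement model's release."""
    return add_months(replacement_release, 6)


RELEASE_DATE = date(2025, 5, 14)  # release date of cohere.rerank.v3-5, from the table

# Hypothetical replacement release date, assumed for illustration only.
hypothetical_replacement_release = date(2026, 8, 31)

earliest = earliest_dedicated_retirement(hypothetical_replacement_release)
print(f"Earliest dedicated-mode retirement: {earliest.isoformat()}")
```

Because the rule says "at least", the computed date is a lower bound, not a scheduled retirement date.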
Rerank Model Parameter
For the Rerank model parameters, see the RerankText API documentation.