OCI Search with OpenSearch supports semantic search
starting with OpenSearch version 2.11.
With semantic search, search engines use the context and content of a query to
better understand its meaning, in contrast to keyword search, where search results are
based on content matching the keywords in a query. OpenSearch implements semantic search using
neural search, a technique that uses large language models
to understand the relationships between terms. For more information about neural search in
OpenSearch, see Neural search tutorial.
Using Neural Search in OCI Search with OpenSearch
To use neural search for semantic search in OCI Search with OpenSearch, you need to:
Register and deploy your choice of model to the cluster.
Create an index and set up an ingestion pipeline using the deployed model. Use the ingestion
pipeline to ingest documents into the index.
Run semantic search queries on the index using either hybrid search or neural search.
Prerequisites 🔗
To use semantic search, the OpenSearch version for the cluster must be 2.11 or newer. By
default, new clusters use version 2.11. See Creating an OpenSearch Cluster.
For existing clusters configured for version 2.3, you can perform an inline upgrade to version 2.11. For more information, see OpenSearch Cluster Software Upgrades.
To upgrade existing clusters configured for version 1.2.3 to 2.11, you need to use the upgrade process described in OpenSearch Cluster Software Upgrades.
Before you start setting up the model for semantic search, you need to complete the prerequisites, which include specifying the applicable IAM policy if required, and configuring the recommended cluster settings.
IAM Policy for Custom Models 🔗
If you're using a custom model, you need to grant
OCI Search with OpenSearch access to the Object Storage
bucket that contains the model.
The following policy example includes the required permission:
ALLOW ANY-USER to manage object-family in tenancy WHERE ALL {request.principal.type='opensearchcluster', request.resource.compartment.id='<cluster_compartment_id>'}
IAM Policy for Generative AI Connectors 🔗
If you're using a Generative AI connector, you need
to grant OCI Search with OpenSearch access to
Generative AI resources.
The following policy example includes the required permission:
ALLOW ANY-USER to manage generative-ai-family in tenancy WHERE ALL {request.principal.type='opensearchcluster', request.resource.compartment.id='<cluster_compartment_id>'}
Regions for Generative AI Connectors 🔗
To use OCI Generative AI, your tenancy must be subscribed to the US Midwest (Chicago) region or the Germany Central (Frankfurt) region. You don't need to create the cluster
in either of those regions; just ensure that your tenancy is subscribed to one of
them.
Cluster Settings for Semantic Search 🔗
Use the settings operation of the Cluster APIs to configure the recommended cluster settings for
semantic search.
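A minimal sketch of the settings request, based on the ML Commons cluster settings commonly recommended for neural search (the exact values to use are an assumption here; confirm them for your cluster version and model type):
PUT _cluster/settings
{
  "persistent": {
    "plugins.ml_commons.only_run_on_ml_node": "false",
    "plugins.ml_commons.model_access_control_enabled": "true",
    "plugins.ml_commons.native_memory_threshold": "99",
    "plugins.ml_commons.allow_registering_model_via_url": "true"
  }
}
The last setting applies when you register a custom model from a URL, as described in the custom models option later in this topic.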
Set Up a Model 🔗
The first step when configuring neural search is setting up the large language model you want
to use. The model is used to generate vector embeddings from text fields.
Register a Model Group 🔗
Model groups enable you to manage access to specific
models. Registering a model group is optional; however, if you don't register a model group,
ML Commons registers a new model group for you, so we recommend that you register
the model group.
Register a model group using the register operation in the Model Group APIs, as shown in the following
example:
POST /_plugins/_ml/model_groups/_register
{
"name": "new_model_group",
"description": "A model group for local models"
}
Make note of the model_group_id returned in the response.
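The response has the following shape (trimmed; the ID is a placeholder):
{
  "model_group_id": "<model_group_ID>",
  "status": "CREATED"
}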
Register the Model 🔗
Register the model using the register operation from the Model APIs. The content of the POST
request to the register operation depends on the type of model you're using.
Option 1: Built-in OpenSearch pretrained models
Several pretrained sentence transformer
models are available for you to register and deploy directly to a cluster,
without needing to download them and then manually upload them to a private storage bucket,
as the custom models option requires. With this option, when you
register a pretrained model, you only need the model's model_group_id,
name, version, and model_format.
See Using an OpenSearch Pretrained Model for how to use a
pretrained model.
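For example, a register request for one of the pretrained sentence transformer models has the following shape (the model name and version shown here are taken from the OpenSearch pretrained models list and are illustrative):
POST /_plugins/_ml/models/_register
{
  "name": "huggingface/sentence-transformers/msmarco-distilbert-base-tas-b",
  "version": "1.0.2",
  "model_group_id": "<model_group_ID>",
  "model_format": "TORCH_SCRIPT"
}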
Option 2: Custom models
You need to pass the Object Storage URL in the actions section of the
register operation.
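A trimmed sketch of the request shape, assuming a sentence transformer model and resource principal authentication (the model_config values are illustrative and depend on your model, and the url_connector field layout is an assumption; confirm it in the custom models documentation):
POST /_plugins/_ml/models/_register
{
  "model_group_id": "<model_group_ID>",
  "name": "sentence-transformers/all-MiniLM-L12-v2",
  "version": "1.0.1",
  "description": "An example custom sentence transformer model",
  "model_format": "TORCH_SCRIPT",
  "model_config": {
    "model_type": "bert",
    "embedding_dimension": 384,
    "framework_type": "sentence_transformers"
  },
  "url_connector": {
    "protocol": "oci_sigv1",
    "parameters": {
      "auth_type": "resource_principal"
    },
    "actions": [
      {
        "method": "GET",
        "action_type": "DOWNLOAD",
        "url": "<object_storage_URL>"
      }
    ]
  }
}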
Option 3: Generative AI connector models
To use a Generative AI connector to register a remote embedding model such as the
cohere.embed-english-v3.0 model, you need to create a connector first
and then register the model, using the following steps:
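A trimmed sketch of the two steps, assuming the on-demand cohere.embed-english-v3.0 model and resource principal authentication (the endpoint, request_body template, and any pre- and post-processing functions are assumptions here; take the full payload from the Generative AI connectors documentation):
POST /_plugins/_ml/connectors/_create
{
  "name": "OCI GenAI embedding connector",
  "description": "Connector for the cohere.embed-english-v3.0 embedding model",
  "version": "1.0",
  "protocol": "oci_sigv1",
  "parameters": {
    "endpoint": "inference.generativeai.us-chicago-1.oci.oraclecloud.com",
    "auth_type": "resource_principal",
    "model": "cohere.embed-english-v3.0"
  },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://${parameters.endpoint}/20231130/actions/embedText",
      "request_body": "{ \"inputs\": ${parameters.input}, \"compartmentId\": \"<compartment_OCID>\", \"servingMode\": { \"modelId\": \"${parameters.model}\", \"servingType\": \"ON_DEMAND\" } }"
    }
  ]
}
Then register the remote model using the connector_id returned in the response:
POST /_plugins/_ml/models/_register
{
  "name": "oci-genai-embedding-model",
  "function_name": "remote",
  "model_group_id": "<model_group_ID>",
  "description": "Remote embedding model registered through a Generative AI connector",
  "connector_id": "<connector_ID>"
}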
To use a dedicated Generative AI model endpoint, reconfigure the connector payload with
the following changes:
Use endpointId instead of modelId, and then specify the
dedicated model endpoint's OCID instead of the model name. For example,
change:
\"modelId\": \"${parameters.model}\"
to:
\"endpointId\":\"<dedicated_model_enpoint_OCID>\"
Change servingType from ON_DEMAND to
DEDICATED. For example,
change:
\"servingType\":\"ON_DEMAND\"
to:
\"servingType\":\"DEDICATED\"
After you make the register request, you can check the status of the operation by using the
task_id with the Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/<task_ID>
When the register operation is complete, the state value in the response to the
Get operation is COMPLETED.
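A trimmed response has the following shape (the full response includes additional task metadata):
{
  "model_id": "<model_ID>",
  "task_type": "REGISTER_MODEL",
  "function_name": "TEXT_EMBEDDING",
  "state": "COMPLETED"
}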
Make note of the model_id value returned in the response to
use when you deploy the model.
Deploy the Model 🔗
After the register operation is completed for the model, you can deploy the model to the
cluster using the deploy operation of the Model APIs, passing the
model_id from the Get operation response in the previous step, as shown
in the following example:
POST /_plugins/_ml/models/<embedding_model_ID>/_deploy
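The deploy response has the following shape (illustrative):
{
  "task_id": "<task_ID>",
  "task_type": "DEPLOY_MODEL",
  "status": "CREATED"
}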
Make note of the task_id returned in the response; you can use the
task_id to check the status of the operation.
To check the status of the deploy operation, use the task_id with the
Get operation of the Tasks APIs, as shown in the following example:
GET /_plugins/_ml/tasks/<task_ID>
When the deploy operation is complete, the state value in the response to
the Get operation is COMPLETED.
Ingest Data 🔗
With the model deployed to the cluster, the next step is to create an ingestion pipeline that
uses the model to generate vector embeddings from text fields during ingestion.
Create Ingestion Pipeline 🔗
Using the model ID of the deployed model, create an ingestion pipeline.
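A minimal sketch using the standard text_embedding processor (the pipeline and field names match the rest of this walkthrough; substitute your own model ID):
PUT _ingest/pipeline/test-nlp-pipeline
{
  "description": "An example neural search pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "<embedding_model_ID>",
        "field_map": {
          "passage_text": "passage_embedding"
        }
      }
    }
  ]
}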
The ingestion pipeline defines a processor and the field mappings (in this case
"passage_text" → "passage_embedding"). This means that if you use this pipeline on a
specific index to ingest data, the pipeline automatically finds the "passage_text"
field, uses the model to generate the corresponding embeddings in
"passage_embedding", and maps them before indexing.
Remember that "passage_text" and "passage_embedding" are user-defined names and can be anything
you want. Ensure that you're consistent with the naming when creating the index where you
plan to use the pipeline, so that the pipeline processor can map the fields as
described.
Create an Index 🔗
During the index creation, you can specify the pipeline you want to use to ingest documents
into the index.
The following example shows how to create an index using the test-nlp-pipeline
pipeline created in the previous step.
When creating the index, you also need to specify which library implementation of
approximate nearest neighbor (ANN) you want to use. OCI Search with OpenSearch supports the NMSLIB, Faiss, and Lucene
libraries. For more information, see Search Engines.
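A sketch of the index request, assuming the Lucene engine (swap the engine, space type, and embedding dimension to match your model and ANN library choice):
PUT /test-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "test-nlp-pipeline"
  },
  "mappings": {
    "properties": {
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": <embedding_model_dimension>,
        "method": {
          "name": "hnsw",
          "engine": "lucene",
          "space_type": "l2"
        }
      },
      "passage_text": {
        "type": "text"
      }
    }
  }
}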
After successfully creating
the index, you can ingest data into it as shown in the following example:
POST /test-index/_doc/1
{
"passage_text": "there are many sharks in the ocean"
}
POST /test-index/_doc/2
{
"passage_text": "fishes must love swimming"
}
POST /test-index/_doc/3
{
"passage_text": "summers are usually very hot"
}
POST /test-index/_doc/4
{
"passage_text": "florida has a nice weather all year round"
}
Use a GET request to verify that documents were ingested correctly and that
embeddings are automatically generated during
ingestion:
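For example, to retrieve the first document ingested above and confirm that it includes a generated passage_embedding field:
GET /test-index/_doc/1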