Configuring agents for semantic search

If your vector indexes contain embeddings and you want agentic search to run semantic searches based on user intent, configure your agent with embedding model information. The agent can then generate neural queries that match on semantic similarity rather than exact text, which returns more relevant results for conceptual questions.

Providing an embedding model ID does not force every query to be semantic. At query time, the agent decides autonomously whether to use neural (semantic) search or lexical (keyword) search based on the intent and context of the query. For example, the agent uses lexical search for date filters and exact-match queries and neural search for conceptual queries.

PREREQUISITE
Before using semantic search, you must set up a text embedding model. For more information, see Choosing a model.

Step 1: Configure a vector index

First, configure a vector index.

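Step 1(a): Create an embedding model

Register a text embedding model. The following example registers the Amazon Bedrock Titan Text Embeddings V2 model through a connector. Replace the region and credential placeholders with your own values: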

POST /_plugins/_ml/models/_register
{
  "name": "Bedrock embedding model",
  "function_name": "remote",
  "description": "Bedrock text embedding model v2",
  "connector": {
    "name": "Amazon Bedrock Connector: embedding",
    "description": "The connector to bedrock Titan embedding model",
    "version": 1,
    "protocol": "aws_sigv4",
    "parameters": {
      "region": "your-aws-region",
      "service_name": "bedrock",
      "model": "amazon.titan-embed-text-v2:0",
      "dimensions": 1024,
      "normalize": true,
      "embeddingTypes": [
        "float"
      ]
    },
    "credential": {
      "access_key": "your-access-key",
      "secret_key": "your-secret-key",
      "session_token": "your-session-token"
    },
    "actions": [
      {
        "action_type": "predict",
        "method": "POST",
        "url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/${parameters.model}/invoke",
        "headers": {
          "content-type": "application/json",
          "x-amz-content-sha256": "required"
        },
        "request_body": "{ \"inputText\": \"${parameters.inputText}\", \"dimensions\": ${parameters.dimensions}, \"normalize\": ${parameters.normalize}, \"embeddingTypes\": ${parameters.embeddingTypes} }",
        "pre_process_function": "connector.pre_process.bedrock.embedding",
        "post_process_function": "connector.post_process.bedrock.embedding"
      }
    ]
  }
}
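The request_body in the connector is a template: each ${parameters.<name>} placeholder is replaced with a value before the request is sent to Amazon Bedrock. The following Python sketch imitates that substitution to show the JSON payload the template is meant to produce. The render_request_body helper and the sample inputText are hypothetical and exist only for illustration; this is not how ML Commons implements substitution.

```python
import json
import re

# Template copied from the connector's request_body above.
TEMPLATE = (
    '{ "inputText": "${parameters.inputText}", '
    '"dimensions": ${parameters.dimensions}, '
    '"normalize": ${parameters.normalize}, '
    '"embeddingTypes": ${parameters.embeddingTypes} }'
)


def render_request_body(template, parameters):
    """Hypothetical helper: fill ${parameters.<name>} placeholders in a template."""
    def substitute(match):
        value = parameters[match.group(1)]
        # Strings go in as-is because the template already supplies the quotes;
        # other values are serialized as JSON (1024, true, ["float"]).
        return value if isinstance(value, str) else json.dumps(value)

    return re.sub(r"\$\{parameters\.(\w+)\}", substitute, template)


rendered = render_request_body(
    TEMPLATE,
    {
        "inputText": "warehouse robots",  # made-up sample text
        "dimensions": 1024,
        "normalize": True,
        "embeddingTypes": ["float"],
    },
)
payload = json.loads(rendered)  # raises ValueError if the rendered body is not valid JSON
print(payload)
```

Parsing the rendered string with json.loads is a quick way to confirm that a template yields valid JSON once every placeholder has a value. The sketch does not escape quotation marks inside string values.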

Step 1(b): Create an ingest pipeline

Create an ingest pipeline that automatically generates embeddings for text fields during document ingestion. In model_id, provide the ID of the embedding model you registered in Step 1(a):

PUT /_ingest/pipeline/my_bedrock_embedding_pipeline
{
  "description": "text embedding pipeline",
  "processors": [
    {
      "text_embedding": {
        "model_id": "fxzel5kB-5P992SCH-qM",
        "field_map": {
          "content_text": "content_embedding"
        }
      }
    }
  ]
}

Step 1(c): Create a vector index with an ingest pipeline

Create a vector index with mappings for both text content and vector embeddings, using the ingest pipeline to automatically process documents:

PUT /research_papers
{
  "settings": {
    "index": {
      "default_pipeline": "my_bedrock_embedding_pipeline",
      "knn": "true"
    }
  },
  "mappings": {
    "properties": {
      "content_embedding": {
        "type": "knn_vector",
        "dimension": 1024,
        "method": {
          "name": "hnsw",
          "engine": "lucene"
        }
      },
      "published_date": {
        "type": "date"
      },
      "rating": {
        "type": "integer"
      }
    }
  }
}
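The dimension of the knn_vector field has to equal the length of the vectors that the embedding model returns, which is 1024 in both the connector in Step 1(a) and the mapping above. A small check such as the following Python sketch can catch a mismatch before you ingest data. The function name and the trimmed dictionaries are illustrative assumptions; only the field names and values are taken from the requests in this tutorial.

```python
def embedding_dimension_matches(connector_parameters, index_body, field):
    """Return True if the connector's dimensions equal the knn_vector field's dimension."""
    mapping = index_body["mappings"]["properties"][field]
    if mapping.get("type") != "knn_vector":
        raise ValueError(f"{field} is not a knn_vector field")
    return connector_parameters["dimensions"] == mapping["dimension"]


# Trimmed copies of the connector parameters and index mapping used in this tutorial.
connector_parameters = {"model": "amazon.titan-embed-text-v2:0", "dimensions": 1024}
index_body = {
    "mappings": {
        "properties": {
            "content_embedding": {"type": "knn_vector", "dimension": 1024},
            "published_date": {"type": "date"},
        }
    }
}

print(embedding_dimension_matches(connector_parameters, index_body, "content_embedding"))
```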

Step 1(d): Ingest data into the vector index

Add research paper documents to the index. The ingest pipeline will automatically generate embeddings for the content_text field:

POST /_bulk
{ "index": { "_index": "research_papers", "_id": "1" } }
{ "content_text": "Autonomous robotic systems for warehouse automation and industrial manufacturing", "published_date": "2024-05-15", "rating": 5 }
{ "index": { "_index": "research_papers", "_id": "2" } }
{ "content_text": "Gene expression analysis and CRISPR-Cas9 genome editing applications in cancer research", "published_date": "2024-06-02", "rating": 4 }
{ "index": { "_index": "research_papers", "_id": "3" } }
{ "content_text": "Reinforcement learning algorithms for sequential decision making and optimization problems", "published_date": "2024-03-20", "rating": 5 }
{ "index": { "_index": "research_papers", "_id": "4" } }
{ "content_text": "Climate change impact on coral reef ecosystems and marine biodiversity conservation", "published_date": "2024-04-10", "rating": 4 }
{ "index": { "_index": "research_papers", "_id": "5" } }
{ "content_text": "Tectonic plate movements and earthquake prediction using geological fault analysis", "published_date": "2024-01-22", "rating": 4 }
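If you generate the bulk request from application code, the body has to be newline-delimited JSON: one action line followed by one source line per document, with a trailing newline at the end. The following Python sketch builds such a body for two of the documents above. The build_bulk_body helper is a hypothetical name used for illustration, and sending the body to the cluster is left out.

```python
import json


def build_bulk_body(index_name, documents):
    """Build an NDJSON _bulk body: an action line and a source line per document."""
    lines = []
    for doc_id, source in documents.items():
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The _bulk API expects the body to end with a newline character.
    return "\n".join(lines) + "\n"


documents = {
    "1": {
        "content_text": "Autonomous robotic systems for warehouse automation and industrial manufacturing",
        "published_date": "2024-05-15",
        "rating": 5,
    },
    "2": {
        "content_text": "Gene expression analysis and CRISPR-Cas9 genome editing applications in cancer research",
        "published_date": "2024-06-02",
        "rating": 4,
    },
}

bulk_body = build_bulk_body("research_papers", documents)
print(bulk_body)
```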

Step 2: Configure agentic search

Next, configure agentic search.

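Step 2(a): Create a model for agentic search

Register a large language model (LLM) for the agent. The following example registers an OpenAI chat model through a connector: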

POST /_plugins/_ml/models/_register
{
  "name": "My OpenAI model: gpt-5",
  "function_name": "remote",
  "description": "Model for agentic search with neural queries",
  "connector": {
    "name": "My openai connector: gpt-5",
    "description": "The connector to openai chat model",
    "version": 1,
    "protocol": "http",
    "parameters": {
      "model": "gpt-5"
    },
    "credential": {
      "openAI_key": "<OPEN AI KEY>"
    },
    "actions": [
      {
        "action_type": "predict",
        "method": "POST",
        "url": "https://api.openai.com/v1/chat/completions",
        "headers": {
          "Authorization": "Bearer ${credential.openAI_key}"
        },
        "request_body": "{ \"model\": \"${parameters.model}\", \"messages\": [{\"role\":\"developer\",\"content\":\"${parameters.system_prompt}\"},${parameters._chat_history:-}{\"role\":\"user\",\"content\":\"${parameters.user_prompt}\"}${parameters._interactions:-}], \"reasoning_effort\":\"low\"${parameters.tool_configs:-}}"
      }
    ]
  }
}

Step 2(b): Create an agent

Create an agent for agentic search. To enable the agent to perform semantic searches using neural queries, you need to configure an embedding model using one of the following methods:

Option 1: Configure an embedding model in the search pipeline (recommended for easier updates).
Option 2: Configure an embedding model in the agent configuration.

Option 1: Create an agent without an embedding model ID (recommended)

Use this option if you plan to specify the embedding_model_id in the search pipeline:

POST /_plugins/_ml/agents/_register
{
  "name": "GPT 5 Agent for Agentic Search",
  "type": "conversational",
  "description": "Use this for Agentic Search",
  "llm": {
    "model_id": "your-agent-model-id",
    "parameters": {
      "max_iteration": 15
    }
  },
  "memory": {
    "type": "conversation_index"
  },
  "parameters": {
    "_llm_interface": "openai/v1/chat/completions"
  },
  "tools": [
    {
      "type": "QueryPlanningTool",
      "parameters": {
        "model_id": "your-qpt-model-id"
      }
    }
  ],
  "app_type": "os_chat"
}

Option 2: Create an agent with an embedding model ID

Alternatively, include the embedding_model_id in the agent’s llm.parameters:

POST /_plugins/_ml/agents/_register
{
  "name": "GPT 5 Agent for Agentic Search",
  "type": "conversational",
  "description": "Use this for Agentic Search",
  "llm": {
    "model_id": "your-agent-model-id",
    "parameters": {
      "max_iteration": 15,
      "embedding_model_id": "your-embedding-model-id-from-step1"
    }
  },
  "memory": {
    "type": "conversation_index"
  },
  "parameters": {
    "_llm_interface": "openai/v1/chat/completions"
  },
  "tools": [
    {
      "type": "QueryPlanningTool",
      "parameters": {
        "model_id": "your-qpt-model-id"
      }
    }
  ],
  "app_type": "os_chat"
}
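The two options differ only in whether llm.parameters contains embedding_model_id. The following Python sketch builds either request body from one function, which can help keep the two variants in sync if you script agent registration. The build_agent_body helper is hypothetical; the field names and placeholder IDs are copied from the requests above.

```python
from typing import Optional


def build_agent_body(agent_model_id: str, qpt_model_id: str,
                     embedding_model_id: Optional[str] = None) -> dict:
    """Build the agent registration body for Option 1 (no ID) or Option 2 (with ID)."""
    llm_parameters = {"max_iteration": 15}
    if embedding_model_id is not None:
        # Option 2: the agent itself carries the embedding model ID.
        llm_parameters["embedding_model_id"] = embedding_model_id
    return {
        "name": "GPT 5 Agent for Agentic Search",
        "type": "conversational",
        "description": "Use this for Agentic Search",
        "llm": {"model_id": agent_model_id, "parameters": llm_parameters},
        "memory": {"type": "conversation_index"},
        "parameters": {"_llm_interface": "openai/v1/chat/completions"},
        "tools": [
            {"type": "QueryPlanningTool", "parameters": {"model_id": qpt_model_id}}
        ],
        "app_type": "os_chat",
    }


option_1 = build_agent_body("your-agent-model-id", "your-qpt-model-id")
option_2 = build_agent_body(
    "your-agent-model-id", "your-qpt-model-id", "your-embedding-model-id-from-step1"
)
print(option_1["llm"]["parameters"])
print(option_2["llm"]["parameters"])
```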

Step 2(c): Create a search pipeline

Create a search pipeline with the agentic_query_translator processor. For more information, see Agentic query translator processor.

If you used Option 1 in Step 2(b) (recommended), include the embedding_model_id in the search pipeline:

PUT _search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "agentic_query_translator": {
        "agent_id": "your-agent-id-from-step-2b",
        "embedding_model_id": "your-embedding-model-id-from-step1"
      }
    }
  ]
}

If you used Option 2 in Step 2(b), create the search pipeline without the embedding_model_id:

PUT _search/pipeline/my_pipeline
{
  "request_processors": [
    {
      "agentic_query_translator": {
        "agent_id": "your-agent-id-from-step-2b"
      }
    }
  ]
}

If you specify the embedding_model_id in both the agent and the search pipeline, the search pipeline configuration takes precedence.
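The precedence rules on this page can be summarized in a few lines of Python. The sketch below only models the documented order: a model ID given in the query text (described in "Specify embedding models in query text") comes first, then the search pipeline, then the agent. The function and argument names are hypothetical, and the real resolution happens inside OpenSearch and the agent, not in client code.

```python
from typing import Optional


def resolve_embedding_model_id(agent_value: Optional[str] = None,
                               pipeline_value: Optional[str] = None,
                               query_text_value: Optional[str] = None) -> Optional[str]:
    """Return the embedding model ID that wins under the documented precedence."""
    # Highest precedence first: query text, then search pipeline, then agent.
    for candidate in (query_text_value, pipeline_value, agent_value):
        if candidate:
            return candidate
    return None  # No embedding model ID configured anywhere.


# The IDs below are made-up placeholders.
print(resolve_embedding_model_id(agent_value="agent-id", pipeline_value="pipeline-id"))
print(resolve_embedding_model_id(agent_value="agent-id"))
```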

Step 3: Run an agentic search

Run agentic searches to see how the agent handles a conceptual question, a filter-based question, and a question that names an embedding model.

Run a semantic search

Perform agentic search with a question that requires semantic understanding:

POST /research_papers/_search?search_pipeline=my_pipeline
{
  "query": {
    "agentic": {
      "query_text": "Show me 3 robots training related research papers "
    }
  }
}

The agent determines that the question calls for semantic search. In the response, the dsl_query field of the ext object shows that the QueryPlanningTool generated a neural query using the embedding model ID. The response contains the three research papers ranked highest by semantic similarity:

{
  "took": 10509,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 0.40031588,
    "hits": [
      {
        "_index": "research_papers",
        "_id": "1",
        "_score": 0.40031588,
        "_source": {
          "content_text": "Autonomous robotic systems for warehouse automation and industrial manufacturing",
          "rating": 5,
          "content_embedding": ["<redacted>"],
          "published_date": "2024-05-15"
        }
      },
      {
        "_index": "research_papers",
        "_id": "3",
        "_score": 0.36390686,
        "_source": {
          "content_text": "Reinforcement learning algorithms for sequential decision making and optimization problems",
          "rating": 5,
          "content_embedding": ["<redacted>"],
          "published_date": "2024-03-20"
        }
      },
      {
        "_index": "research_papers",
        "_id": "5",
        "_score": 0.34401828,
        "_source": {
          "content_text": "Tectonic plate movements and earthquake prediction using geological fault analysis",
          "rating": 4,
          "content_embedding": ["<redacted>"],
          "published_date": "2024-01-22"
        }
      }
    ]
  },
  "ext": {
    "agent_steps_summary": "I have these tools available: [ListIndexTool, IndexMappingTool, query_planner_tool]\nFirst I used: ListIndexTool — input: \"[]\"; context gained: \"Found indices; 'research_papers' appears relevant\"\nSecond I used: IndexMappingTool — input: \"[\"research_papers\"]\"; context gained: \"Index has text content and an embedding field suitable for neural search\"\nThird I used: query_planner_tool — qpt.question: \"Show me 3 research papers related to robots training.\"; index_name_provided: \"research_papers\"\nValidation: qpt output is valid and limits results to 3 using neural search with the provided model.",
    "memory_id": "jhzpl5kB-5P992SCwOqe",
    "dsl_query": "{\"size\":3.0,\"query\":{\"neural\":{\"content_embedding\":{\"model_id\":\"fxzel5kB-5P992SCH-qM\",\"k\":100.0,\"query_text\":\"robots training\"}}}}"
  }
}

Run a traditional search with filters

Next, perform agentic search with a question that requires filtering rather than semantic understanding:

POST /research_papers/_search?search_pipeline=my_pipeline
{
  "query": {
    "agentic": {
      "query_text": "Show me papers published after 2024 May"
    }
  }
}

The agent recognizes the query as a date-based filter query and generates a traditional range query instead of a neural query:

{
  "took": 8522,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "research_papers",
        "_id": "2",
        "_score": null,
        "_source": {
          "content_text": "Gene expression analysis and CRISPR-Cas9 genome editing applications in cancer research",
          "rating": 4,
          "content_embedding": ["<redacted>"],
          "published_date": "2024-06-02"
        },
        "sort": [
          1717286400000
        ]
      }
    ]
  },
  "ext": {
    "agent_steps_summary": "I have these tools available: [ListIndexTool, IndexMappingTool, query_planner_tool]\nFirst I used: query_planner_tool — qpt.question: \"Show me papers published after May 2024.\"; index_name_provided: \"research_papers\"\nValidation: qpt output is valid JSON and matches the user request with the specified date filter and sorting.",
    "memory_id": "vBzyl5kB-5P992SCI-o1",
    "dsl_query": "{\"size\":10.0,\"query\":{\"bool\":{\"filter\":[{\"range\":{\"published_date\":{\"gt\":\"2024-05-31T23:59:59Z\"}}}]}},\"sort\":[{\"published_date\":{\"order\":\"desc\"}}]}"
  }
}
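Because ext.dsl_query is returned as a JSON string, you can parse it to check which kind of query the agent generated. The following Python sketch does this for trimmed copies of the two responses above. The uses_neural_query helper is a hypothetical name; it treats any object key named neural inside the query as a sign of semantic search, which is an assumption that holds for the queries shown here.

```python
import json


def uses_neural_query(search_response):
    """Return True if the generated DSL in ext.dsl_query contains a neural clause."""
    dsl = json.loads(search_response["ext"]["dsl_query"])

    def contains_neural(node):
        if isinstance(node, dict):
            return "neural" in node or any(contains_neural(v) for v in node.values())
        if isinstance(node, list):
            return any(contains_neural(item) for item in node)
        return False

    return contains_neural(dsl.get("query", {}))


# Trimmed copies of the two responses above: only ext.dsl_query is kept.
semantic_response = {
    "ext": {
        "dsl_query": '{"size":3.0,"query":{"neural":{"content_embedding":{"model_id":"fxzel5kB-5P992SCH-qM","k":100.0,"query_text":"robots training"}}}}'
    }
}
filter_response = {
    "ext": {
        "dsl_query": '{"size":10.0,"query":{"bool":{"filter":[{"range":{"published_date":{"gt":"2024-05-31T23:59:59Z"}}}]}},"sort":[{"published_date":{"order":"desc"}}]}'
    }
}

print(uses_neural_query(semantic_response))
print(uses_neural_query(filter_response))
```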

Specify embedding models in query text

To override the embedding model ID, you can include it directly in the natural language query_text when sending a query. This takes precedence over any embedding_model_id configured in the search pipeline or agent:

POST /research_papers/_search?search_pipeline=my_pipeline
{
  "query": {
    "agentic": {
      "query_text": "Show me 3 robots training related research papers use this model id for neural search:fxzel5kB-5P992SCH-qM "
    }
  }
}

The agent successfully extracts the embedding model ID directly from the query text and generates the appropriate neural DSL query:

{
  "took": 14989,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "max_score": 0.38957736,
    "hits": [
      {
        "_index": "research_papers",
        "_id": "1",
        "_score": 0.38957736,
        "_source": {
          "content_text": "Autonomous robotic systems for warehouse automation and industrial manufacturing",
          "rating": 5,
          "content_embedding": [],
          "published_date": "2024-05-15"
        }
      },
      {
        "_index": "research_papers",
        "_id": "3",
        "_score": 0.36386627,
        "_source": {
          "content_text": "Reinforcement learning algorithms for sequential decision making and optimization problems",
          "rating": 5,
          "content_embedding": [],
          "published_date": "2024-03-20"
        }
      },
      {
        "_index": "research_papers",
        "_id": "2",
        "_score": 0.35789147,
        "_source": {
          "content_text": "Gene expression analysis and CRISPR-Cas9 genome editing applications in cancer research",
          "rating": 4,
          "content_embedding": [],
          "published_date": "2024-06-02"
        }
      }
    ]
  },
  "ext": {
    "agent_steps_summary": "I have these tools available: [ListIndexTool, IndexMappingTool, query_planner_tool]\nFirst I used: ListIndexTool — input: \"\"; context gained: \"Found indices, including research_papers with 5 documents\"\nSecond I used: IndexMappingTool — input: \"research_papers\"; context gained: \"Index exists and contains text and embedding fields suitable for neural search\"\nThird I used: query_planner_tool — qpt.question: \"Show me 3 research papers related to robot training.\"; index_name_provided: \"research_papers\"\nValidation: qpt output is valid neural search DSL using the provided model ID and limits results to 3.",
    "memory_id": "whz1l5kB-5P992SCPOqn",
    "dsl_query": "{\"size\":3.0,\"query\":{\"neural\":{\"content_embedding\":{\"model_id\":\"fxzel5kB-5P992SCH-qM\",\"k\":100.0,\"query_text\":\"research papers related to robot training\"}}},\"sort\":[{\"_score\":{\"order\":\"desc\"}}],\"track_total_hits\":false}"
  }
}
