Configuring agents for semantic search
If your vector indexes contain embeddings and you want agentic search to run semantic searches based on user intent, you need to provide the agent with embedding model information. This allows the agent to generate neural queries that search for semantic similarity rather than exact text matches, providing more relevant results for conceptual questions.
Providing an embedding model ID does not force every query to be semantic. At query time, the agent autonomously decides whether to use neural (semantic) search or lexical (keyword) search based on the query intent and context. For example, date filters or exact-match queries will use lexical search, while conceptual queries will use neural search.
PREREQUISITE
Before using semantic search, you must set up a text embedding model. For more information, see Choosing a model.
Step 1: Configure a vector index
First, configure a vector index: register an embedding model, create an ingest pipeline, create the index, and ingest data.
Step 1(a): Create an embedding model
Register an embedding model that will convert text into vector representations for semantic search:
POST /_plugins/_ml/models/_register
{
"name": "Bedrock embedding model",
"function_name": "remote",
"description": "Bedrock text embedding model v2",
"connector": {
"name": "Amazon Bedrock Connector: embedding",
"description": "The connector to bedrock Titan embedding model",
"version": 1,
"protocol": "aws_sigv4",
"parameters": {
"region": "your-aws-region",
"service_name": "bedrock",
"model": "amazon.titan-embed-text-v2:0",
"dimensions": 1024,
"normalize": true,
"embeddingTypes": [
"float"
]
},
"credential": {
"access_key": "your-access-key",
"secret_key": "your-secret-key",
"session_token": "your-session-token"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://bedrock-runtime.${parameters.region}.amazonaws.com/model/${parameters.model}/invoke",
"headers": {
"content-type": "application/json",
"x-amz-content-sha256": "required"
},
"request_body": "{ \"inputText\": \"${parameters.inputText}\", \"dimensions\": ${parameters.dimensions}, \"normalize\": ${parameters.normalize}, \"embeddingTypes\": ${parameters.embeddingTypes} }",
"pre_process_function": "connector.pre_process.bedrock.embedding",
"post_process_function": "connector.post_process.bedrock.embedding"
}
]
}
}
Step 1(b): Create an ingest pipeline
Create an ingest pipeline that automatically generates embeddings for text fields during document ingestion. Set model_id to the ID of the embedding model that you registered in Step 1(a):
PUT /_ingest/pipeline/my_bedrock_embedding_pipeline
{
"description": "text embedding pipeline",
"processors": [
{
"text_embedding": {
"model_id": "fxzel5kB-5P992SCH-qM",
"field_map": {
"content_text": "content_embedding"
}
}
}
]
}
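If you script this step, the pipeline body can be generated from the model ID instead of being edited by hand. The following is a minimal Python sketch and not part of OpenSearch: build_embedding_pipeline is a hypothetical helper, and "your-embedding-model-id" is a placeholder for the ID of the model registered in Step 1(a).

```python
import json


def build_embedding_pipeline(model_id, field_map, description="text embedding pipeline"):
    """Return a request body for PUT /_ingest/pipeline/<name> (hypothetical helper).

    field_map maps each source text field to the field that stores its embedding.
    """
    if not model_id:
        raise ValueError("model_id is required")
    if not field_map:
        raise ValueError("field_map must contain at least one entry")
    return {
        "description": description,
        "processors": [
            {"text_embedding": {"model_id": model_id, "field_map": dict(field_map)}}
        ],
    }


# Placeholder model ID; replace it with the ID of your registered embedding model.
pipeline_body = build_embedding_pipeline(
    "your-embedding-model-id", {"content_text": "content_embedding"}
)
print(json.dumps(pipeline_body, indent=2))
```

The printed JSON has the same shape as the request body shown above and can be sent with any HTTP client.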
Step 1(c): Create a vector index with an ingest pipeline
Create a vector index with mappings for both text content and vector embeddings, using the ingest pipeline to automatically process documents:
PUT /research_papers
{
"settings": {
"index": {
"default_pipeline": "my_bedrock_embedding_pipeline",
"knn": "true"
}
},
"mappings": {
"properties": {
"content_embedding": {
"type": "knn_vector",
"dimension": 1024,
"method": {
"name": "hnsw",
"engine": "lucene"
}
},
"published_date": {
"type": "date"
},
"rating": {
"type": "integer"
}
}
}
}
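The dimension of the knn_vector field must match the size of the vectors that the embedding model produces (1024 in this tutorial, set by the dimensions connector parameter in Step 1(a)). As a sanity check before creating the index, a short Python sketch such as the following can compare the two values. check_vector_dimension is a hypothetical helper, and the index body is abbreviated to the one field being checked.

```python
def check_vector_dimension(index_body, vector_field, model_dimension):
    """Raise if the mapped knn_vector field cannot hold the model's vectors."""
    properties = index_body["mappings"]["properties"]
    if vector_field not in properties:
        raise KeyError(f"field '{vector_field}' is not mapped")
    mapping = properties[vector_field]
    if mapping.get("type") != "knn_vector":
        raise ValueError(f"field '{vector_field}' is not a knn_vector field")
    if mapping.get("dimension") != model_dimension:
        raise ValueError(
            f"index dimension {mapping.get('dimension')} does not match "
            f"model dimension {model_dimension}"
        )
    return True


# Abbreviated copy of the index body used in this step.
index_body = {
    "mappings": {
        "properties": {
            "content_embedding": {"type": "knn_vector", "dimension": 1024}
        }
    }
}

# 1024 is the dimensions value from the connector in Step 1(a).
print(check_vector_dimension(index_body, "content_embedding", 1024))
```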
Step 1(d): Ingest data into the vector index
Add research paper documents to the index. The ingest pipeline will automatically generate embeddings for the content_text field:
POST /_bulk
{ "index": { "_index": "research_papers", "_id": "1" } }
{ "content_text": "Autonomous robotic systems for warehouse automation and industrial manufacturing", "published_date": "2024-05-15", "rating": 5 }
{ "index": { "_index": "research_papers", "_id": "2" } }
{ "content_text": "Gene expression analysis and CRISPR-Cas9 genome editing applications in cancer research", "published_date": "2024-06-02", "rating": 4 }
{ "index": { "_index": "research_papers", "_id": "3" } }
{ "content_text": "Reinforcement learning algorithms for sequential decision making and optimization problems", "published_date": "2024-03-20", "rating": 5 }
{ "index": { "_index": "research_papers", "_id": "4" } }
{ "content_text": "Climate change impact on coral reef ecosystems and marine biodiversity conservation", "published_date": "2024-04-10", "rating": 4 }
{ "index": { "_index": "research_papers", "_id": "5" } }
{ "content_text": "Tectonic plate movements and earthquake prediction using geological fault analysis", "published_date": "2024-01-22", "rating": 4 }
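The bulk body is newline-delimited JSON: an action line followed by a source line for each document, ending with a newline. As a sketch of how you might generate it from application data, the hypothetical to_bulk_ndjson helper below builds the payload for the first two documents. Sending the payload to the cluster is left to whichever HTTP client you use.

```python
import json


def to_bulk_ndjson(index, docs):
    """Build a _bulk request body from (doc_id, source) pairs (hypothetical helper)."""
    lines = []
    for doc_id, source in docs:
        # Action line: tells the bulk API where to index the next source line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    ("1", {
        "content_text": "Autonomous robotic systems for warehouse automation and industrial manufacturing",
        "published_date": "2024-05-15",
        "rating": 5,
    }),
    ("2", {
        "content_text": "Gene expression analysis and CRISPR-Cas9 genome editing applications in cancer research",
        "published_date": "2024-06-02",
        "rating": 4,
    }),
]

payload = to_bulk_ndjson("research_papers", docs)
print(payload, end="")
```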
Step 2: Configure agentic search
Next, configure agentic search: register a model for the agent, create the agent, and create a search pipeline.
Step 2(a): Create a model for agentic search
Register a model that will be used by both the conversational agent and the QueryPlanningTool. In Step 2(b), use the ID of this model for both the agent's llm.model_id and the tool's model_id:
POST /_plugins/_ml/models/_register
{
"name": "My OpenAI model: gpt-5",
"function_name": "remote",
"description": "Model for agentic search with neural queries",
"connector": {
"name": "My openai connector: gpt-5",
"description": "The connector to openai chat model",
"version": 1,
"protocol": "http",
"parameters": {
"model": "gpt-5"
},
"credential": {
"openAI_key": "<OPEN AI KEY>"
},
"actions": [
{
"action_type": "predict",
"method": "POST",
"url": "https://api.openai.com/v1/chat/completions",
"headers": {
"Authorization": "Bearer ${credential.openAI_key}"
},
"request_body": "{ \"model\": \"${parameters.model}\", \"messages\": [{\"role\":\"developer\",\"content\":\"${parameters.system_prompt}\"},${parameters._chat_history:-}{\"role\":\"user\",\"content\":\"${parameters.user_prompt}\"}${parameters._interactions:-}], \"reasoning_effort\":\"low\"${parameters.tool_configs:-}}"
}
]
}
}
Step 2(b): Create an agent
Create an agent for agentic search. To enable the agent to perform semantic searches using neural queries, you need to configure an embedding model using one of the following methods:
- Option 1: Configure an embedding model in the search pipeline (recommended for easier updates).
- Option 2: Configure an embedding model in the agent configuration.
Option 1: Create an agent without an embedding model ID (recommended)
Use this option if you plan to specify the embedding_model_id in the search pipeline:
POST /_plugins/_ml/agents/_register
{
"name": "GPT 5 Agent for Agentic Search",
"type": "conversational",
"description": "Use this for Agentic Search",
"llm": {
"model_id": "your-agent-model-id",
"parameters": {
"max_iteration": 15
}
},
"memory": {
"type": "conversation_index"
},
"parameters": {
"_llm_interface": "openai/v1/chat/completions"
},
"tools": [
{
"type": "QueryPlanningTool",
"parameters": {
"model_id": "your-qpt-model-id"
}
}
],
"app_type": "os_chat"
}
Option 2: Create an agent with an embedding model ID
Alternatively, include the embedding_model_id in the agent’s llm.parameters:
POST /_plugins/_ml/agents/_register
{
"name": "GPT 5 Agent for Agentic Search",
"type": "conversational",
"description": "Use this for Agentic Search",
"llm": {
"model_id": "your-agent-model-id",
"parameters": {
"max_iteration": 15,
"embedding_model_id": "your-embedding-model-id-from-step1"
}
},
"memory": {
"type": "conversation_index"
},
"parameters": {
"_llm_interface": "openai/v1/chat/completions"
},
"tools": [
{
"type": "QueryPlanningTool",
"parameters": {
"model_id": "your-qpt-model-id"
}
}
],
"app_type": "os_chat"
}
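The two options differ only in whether llm.parameters contains embedding_model_id. The following Python sketch makes that difference explicit. build_agent_body is a hypothetical helper, and all model IDs are placeholders.

```python
def build_agent_body(agent_model_id, qpt_model_id, embedding_model_id=None):
    """Return a body for POST /_plugins/_ml/agents/_register (hypothetical helper).

    Omit embedding_model_id for Option 1; pass it for Option 2.
    """
    llm_parameters = {"max_iteration": 15}
    if embedding_model_id is not None:
        # Option 2: the agent itself carries the embedding model ID.
        llm_parameters["embedding_model_id"] = embedding_model_id
    return {
        "name": "GPT 5 Agent for Agentic Search",
        "type": "conversational",
        "description": "Use this for Agentic Search",
        "llm": {"model_id": agent_model_id, "parameters": llm_parameters},
        "memory": {"type": "conversation_index"},
        "parameters": {"_llm_interface": "openai/v1/chat/completions"},
        "tools": [
            {"type": "QueryPlanningTool", "parameters": {"model_id": qpt_model_id}}
        ],
        "app_type": "os_chat",
    }


option_1 = build_agent_body("your-agent-model-id", "your-qpt-model-id")
option_2 = build_agent_body(
    "your-agent-model-id", "your-qpt-model-id", "your-embedding-model-id-from-step1"
)
print(sorted(option_1["llm"]["parameters"]))
print(sorted(option_2["llm"]["parameters"]))
```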
Step 2(c): Create a search pipeline
Create a search pipeline with the agentic_query_translator processor. For more information, see Agentic query translator processor.
If you used Option 1 in Step 2(b) (recommended), include the embedding_model_id in the search pipeline:
PUT /_search/pipeline/my_pipeline
{
"request_processors": [
{
"agentic_query_translator": {
"agent_id": "your-agent-id-from-step-2b",
"embedding_model_id": "your-embedding-model-id-from-step1"
}
}
]
}
If you used Option 2 in Step 2(b), create the search pipeline without the embedding_model_id:
PUT /_search/pipeline/my_pipeline
{
"request_processors": [
{
"agentic_query_translator": {
"agent_id": "your-agent-id-from-step-2b"
}
}
]
}
If you specify the embedding_model_id in both the agent and the search pipeline, the search pipeline configuration takes precedence.
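The precedence rules can be summarized in a few lines of Python. This sketch only mirrors the behavior documented on this page and is not the plugin's implementation. resolve_embedding_model_id is a hypothetical function. It also covers the query text override described in "Specify embedding models in query text", which ranks above both configuration locations.

```python
def resolve_embedding_model_id(query_text_id=None, pipeline_id=None, agent_id=None):
    """Return the embedding model ID that wins under the documented precedence.

    Order: ID given in the query text, then the search pipeline, then the agent.
    Returns None when no embedding model ID is configured anywhere.
    """
    for candidate in (query_text_id, pipeline_id, agent_id):
        if candidate:
            return candidate
    return None


# The pipeline setting overrides the agent setting.
print(resolve_embedding_model_id(pipeline_id="pipeline-model", agent_id="agent-model"))

# An ID supplied in the query text overrides both.
print(resolve_embedding_model_id("query-model", "pipeline-model", "agent-model"))
```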
Step 3: Run an agentic search
Run agentic searches that exercise semantic search, filter-based search, and an embedding model override in the query text.
Run a semantic search
Perform an agentic search with a question that requires semantic understanding:
POST /research_papers/_search?search_pipeline=my_pipeline
{
"query": {
"agentic": {
"query_text": "Show me 3 robots training related research papers "
}
}
}
The agent determines that semantic search is needed. The dsl_query in the ext object shows that the QueryPlanningTool generated a neural query using the configured embedding model ID and limited the results to 3. The response contains the research papers ranked by semantic similarity:
{
"took": 10509,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 5,
"relation": "eq"
},
"max_score": 0.40031588,
"hits": [
{
"_index": "research_papers",
"_id": "1",
"_score": 0.40031588,
"_source": {
"content_text": "Autonomous robotic systems for warehouse automation and industrial manufacturing",
"rating": 5,
"content_embedding": ["<redacted>"],
"published_date": "2024-05-15"
}
},
{
"_index": "research_papers",
"_id": "3",
"_score": 0.36390686,
"_source": {
"content_text": "Reinforcement learning algorithms for sequential decision making and optimization problems",
"rating": 5,
"content_embedding": ["<redacted>"],
"published_date": "2024-03-20"
}
},
{
"_index": "research_papers",
"_id": "5",
"_score": 0.34401828,
"_source": {
"content_text": "Tectonic plate movements and earthquake prediction using geological fault analysis",
"rating": 4,
"content_embedding": ["<redacted>"],
"published_date": "2024-01-22"
}
}
]
},
"ext": {
"agent_steps_summary": "I have these tools available: [ListIndexTool, IndexMappingTool, query_planner_tool]\nFirst I used: ListIndexTool — input: \"[]\"; context gained: \"Found indices; 'research_papers' appears relevant\"\nSecond I used: IndexMappingTool — input: \"[\"research_papers\"]\"; context gained: \"Index has text content and an embedding field suitable for neural search\"\nThird I used: query_planner_tool — qpt.question: \"Show me 3 research papers related to robots training.\"; index_name_provided: \"research_papers\"\nValidation: qpt output is valid and limits results to 3 using neural search with the provided model.",
"memory_id": "jhzpl5kB-5P992SCwOqe",
"dsl_query": "{\"size\":3.0,\"query\":{\"neural\":{\"content_embedding\":{\"model_id\":\"fxzel5kB-5P992SCH-qM\",\"k\":100.0,\"query_text\":\"robots training\"}}}}"
}
}
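The dsl_query value in ext is a JSON string, so a client has to parse it to see which kind of query the agent generated. The following Python sketch shows one way to do that. contains_key and summarize_agentic_response are hypothetical helpers, the sample is trimmed to the ext object from the preceding response, and the check is a simple heuristic that looks for a neural clause anywhere in the query.

```python
import json


def contains_key(node, key):
    """Return True if key appears anywhere in a nested dict/list structure."""
    if isinstance(node, dict):
        return key in node or any(contains_key(value, key) for value in node.values())
    if isinstance(node, list):
        return any(contains_key(item, key) for item in node)
    return False


def summarize_agentic_response(response):
    """Parse ext.dsl_query and report whether the generated query is neural."""
    dsl = json.loads(response["ext"]["dsl_query"])
    return {
        "uses_neural": contains_key(dsl.get("query", {}), "neural"),
        "size": dsl.get("size"),
        "dsl": dsl,
    }


sample_response = {
    "ext": {
        "memory_id": "jhzpl5kB-5P992SCwOqe",
        "dsl_query": '{"size":3.0,"query":{"neural":{"content_embedding":{"model_id":"fxzel5kB-5P992SCH-qM","k":100.0,"query_text":"robots training"}}}}',
    }
}

summary = summarize_agentic_response(sample_response)
print(summary["uses_neural"], summary["size"])  # True 3.0
```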
Run a traditional search with filters
Next, perform an agentic search with a question that requires filtering rather than semantic understanding:
POST /research_papers/_search?search_pipeline=my_pipeline
{
"query": {
"agentic": {
"query_text": "Show me papers published after 2024 May"
}
}
}
The agent recognizes the query as a date-based filter query and generates a traditional range query instead of a neural query:
{
"took": 8522,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": null,
"hits": [
{
"_index": "research_papers",
"_id": "2",
"_score": null,
"_source": {
"content_text": "Gene expression analysis and CRISPR-Cas9 genome editing applications in cancer research",
"rating": 4,
"content_embedding": ["<redacted>"],
"published_date": "2024-06-02"
},
"sort": [
1717286400000
]
}
]
},
"ext": {
"agent_steps_summary": "I have these tools available: [ListIndexTool, IndexMappingTool, query_planner_tool]\nFirst I used: query_planner_tool — qpt.question: \"Show me papers published after May 2024.\"; index_name_provided: \"research_papers\"\nValidation: qpt output is valid JSON and matches the user request with the specified date filter and sorting.",
"memory_id": "vBzyl5kB-5P992SCI-o1",
"dsl_query": "{\"size\":10.0,\"query\":{\"bool\":{\"filter\":[{\"range\":{\"published_date\":{\"gt\":\"2024-05-31T23:59:59Z\"}}}]}},\"sort\":[{\"published_date\":{\"order\":\"desc\"}}]}"
}
}
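Two details of this response can be verified with a few lines of Python. The sort value is the published_date of the hit expressed in milliseconds since the Unix epoch, and the generated range filter keeps only documents dated after May 31, 2024. The sketch below uses the published dates of the five documents ingested in Step 1(d). epoch_millis_to_utc is a hypothetical helper.

```python
from datetime import date, datetime, timezone


def epoch_millis_to_utc(millis):
    """Convert an epoch-milliseconds sort value to a UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# Published dates of the documents ingested in Step 1(d), keyed by document ID.
published = {
    "1": "2024-05-15",
    "2": "2024-06-02",
    "3": "2024-03-20",
    "4": "2024-04-10",
    "5": "2024-01-22",
}

# The generated filter is {"gt": "2024-05-31T23:59:59Z"}; for day-granular dates
# this is equivalent to keeping dates strictly after May 31, 2024.
cutoff = date(2024, 5, 31)
matching_ids = [
    doc_id for doc_id, day in published.items() if date.fromisoformat(day) > cutoff
]

print(epoch_millis_to_utc(1717286400000).isoformat())  # 2024-06-02T00:00:00+00:00
print(matching_ids)  # ['2']
```

Only document 2 passes the filter, which matches the single hit in the response.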
Specify embedding models in query text
To override the embedding model ID, you can include it directly in the natural language query_text when sending a query. This takes precedence over any embedding_model_id configured in the search pipeline or agent:
POST /research_papers/_search?search_pipeline=my_pipeline
{
"query": {
"agentic": {
"query_text": "Show me 3 robots training related research papers use this model id for neural search:fxzel5kB-5P992SCH-qM "
}
}
}
The agent extracts the embedding model ID from the query text and uses it in the generated neural query, as shown in the dsl_query in the ext object:
{
"took": 14989,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"max_score": 0.38957736,
"hits": [
{
"_index": "research_papers",
"_id": "1",
"_score": 0.38957736,
"_source": {
"content_text": "Autonomous robotic systems for warehouse automation and industrial manufacturing",
"rating": 5,
"content_embedding": [],
"published_date": "2024-05-15"
}
},
{
"_index": "research_papers",
"_id": "3",
"_score": 0.36386627,
"_source": {
"content_text": "Reinforcement learning algorithms for sequential decision making and optimization problems",
"rating": 5,
"content_embedding": [],
"published_date": "2024-03-20"
}
},
{
"_index": "research_papers",
"_id": "2",
"_score": 0.35789147,
"_source": {
"content_text": "Gene expression analysis and CRISPR-Cas9 genome editing applications in cancer research",
"rating": 4,
"content_embedding": [],
"published_date": "2024-06-02"
}
}
]
},
"ext": {
"agent_steps_summary": "I have these tools available: [ListIndexTool, IndexMappingTool, query_planner_tool]\nFirst I used: ListIndexTool — input: \"\"; context gained: \"Found indices, including research_papers with 5 documents\"\nSecond I used: IndexMappingTool — input: \"research_papers\"; context gained: \"Index exists and contains text and embedding fields suitable for neural search\"\nThird I used: query_planner_tool — qpt.question: \"Show me 3 research papers related to robot training.\"; index_name_provided: \"research_papers\"\nValidation: qpt output is valid neural search DSL using the provided model ID and limits results to 3.",
"memory_id": "whz1l5kB-5P992SCPOqn",
"dsl_query": "{\"size\":3.0,\"query\":{\"neural\":{\"content_embedding\":{\"model_id\":\"fxzel5kB-5P992SCH-qM\",\"k\":100.0,\"query_text\":\"research papers related to robot training\"}}},\"sort\":[{\"_score\":{\"order\":\"desc\"}}],\"track_total_hits\":false}"
}
}