
Neural Search API

The Neural Search plugin provides several APIs for monitoring and managing semantic, hybrid, and neural sparse search features.

Stats

The Neural Search Stats API provides information about the current status of the Neural Search plugin. This includes both cluster-level and node-level statistics. Cluster-level statistics have a single value for the entire cluster. Node-level statistics have a single value for each node in the cluster.

By default, the Neural Search Stats API is disabled through a cluster setting. To enable statistics collection, use the following command:

PUT /_cluster/settings
{
  "persistent": {
    "plugins.neural_search.stats_enabled": true
  }
}

To disable statistics collection, set plugins.neural_search.stats_enabled to false. When statistics collection is disabled, all statistics are reset and no new statistics are collected.
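
For example, the following request disables statistics collection:

PUT /_cluster/settings
{
  "persistent": {
    "plugins.neural_search.stats_enabled": false
  }
}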

Endpoints

GET /_plugins/_neural/stats
GET /_plugins/_neural/stats/<stats>
GET /_plugins/_neural/<nodes>/stats
GET /_plugins/_neural/<nodes>/stats/<stats>

Path parameters

The following table lists the available path parameters. All path parameters are optional.

Parameter Data type Description
nodes String A node or a list of nodes (comma-separated) to filter statistics by. Default is all nodes.
stats String A statistic name or names (comma-separated) to return. Default is all statistics.

Query parameters

The following table lists the available query parameters. All query parameters are optional.

Parameter Data type Description
include_metadata Boolean When true, includes additional metadata fields for each statistic (see Available metadata). Default is false.
flat_stat_paths Boolean When true, flattens the JSON response structure for easier parsing. Default is false.
include_individual_nodes Boolean When true, includes statistics for individual nodes in the nodes category. When false, excludes the nodes category from the response. Default is true.
include_all_nodes Boolean When true, includes aggregated statistics across all nodes in the all_nodes category. When false, excludes the all_nodes category from the response. Default is true.
include_info Boolean When true, includes cluster-wide information in the info category. When false, excludes the info category from the response. Default is true.

Example request

GET /_plugins/_neural/node1,node2/stats/stat1,stat2?include_metadata=true&flat_stat_paths=true
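
To limit the response to specific categories, combine the category parameters. For example, the following request returns only the aggregated all_nodes statistics and omits the info and nodes categories:

GET /_plugins/_neural/stats?include_info=false&include_individual_nodes=false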

Example response

{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": "3.1.0",
		"processors": {
			"search": {
				"hybrid": {
					"comb_geometric_processors": 0,
					"comb_rrf_processors": 0,
					"norm_l2_processors": 0,
					"norm_minmax_processors": 0,
					"comb_harmonic_processors": 0,
					"comb_arithmetic_processors": 0,
					"norm_zscore_processors": 0,
					"rank_based_normalization_processors": 0,
					"normalization_processors": 0
				},
				"rerank_ml_processors": 0,
				"rerank_by_field_processors": 0,
				"neural_sparse_two_phase_processors": 0,
				"neural_query_enricher_processors": 0
			},
			"ingest": {
				"sparse_encoding_processors": 0,
				"skip_existing_processors": 0,
				"text_image_embedding_processors": 0,
				"text_chunking_delimiter_processors": 0,
				"text_embedding_processors_in_pipelines": 0,
				"text_chunking_fixed_token_length_processors": 0,
				"text_chunking_fixed_char_length_processors": 0,
				"text_chunking_processors": 0
			}
		}
	},
	"all_nodes": {
		"query": {
			"hybrid": {
				"hybrid_query_with_pagination_requests": 0,
				"hybrid_query_with_filter_requests": 0,
				"hybrid_query_with_inner_hits_requests": 0,
				"hybrid_query_requests": 0
			},
			"neural": {
				"neural_query_against_semantic_sparse_requests": 0,
				"neural_query_requests": 0,
				"neural_query_against_semantic_dense_requests": 0,
				"neural_query_against_knn_requests": 0
			},
			"neural_sparse": {
				"neural_sparse_query_requests": 0,
                "seismic_query_requests": 0
			}
		},
		"semantic_highlighting": {
			"semantic_highlighting_request_count": 0,
			"semantic_highlighting_batch_request_count": 0
		},
		"processors": {
			"search": {
				"neural_sparse_two_phase_executions": 0,
				"hybrid": {
					"comb_harmonic_executions": 0,
					"norm_zscore_executions": 0,
					"comb_rrf_executions": 0,
					"norm_l2_executions": 0,
					"rank_based_normalization_processor_executions": 0,
					"comb_arithmetic_executions": 0,
					"normalization_processor_executions": 0,
					"comb_geometric_executions": 0,
					"norm_minmax_executions": 0
				},
				"rerank_by_field_executions": 0,
				"neural_query_enricher_executions": 0,
				"rerank_ml_executions": 0
			},
			"ingest": {
				"skip_existing_executions": 0,
				"text_chunking_fixed_token_length_executions": 0,
				"sparse_encoding_executions": 0,
				"text_chunking_fixed_char_length_executions": 0,
				"text_chunking_executions": 0,
				"text_embedding_executions": 0,
				"semantic_field_executions": 0,
				"semantic_field_chunking_executions": 0,
				"text_chunking_delimiter_executions": 0,
				"text_image_embedding_executions": 0
			}
		},
        "memory": {
            "sparse": {
                "sparse_memory_usage": 0.13,
                "clustered_posting_usage": 0.06,
                "forward_index_usage": 0.06
            }
        }
	},
	"nodes": {
		"_cONimhxS6KdedymRZr6xg": {
			"query": {
				"hybrid": {
					"hybrid_query_with_pagination_requests": 0,
					"hybrid_query_with_filter_requests": 0,
					"hybrid_query_with_inner_hits_requests": 0,
					"hybrid_query_requests": 0
				},
				"neural": {
					"neural_query_against_semantic_sparse_requests": 0,
					"neural_query_requests": 0,
					"neural_query_against_semantic_dense_requests": 0,
					"neural_query_against_knn_requests": 0
				},
				"neural_sparse": {
					"neural_sparse_query_requests": 0,
                    "seismic_query_requests": 0
				},
                "memory": {
                    "sparse": {
                        "sparse_memory_usage_percentage": 0,
                        "sparse_memory_usage": 0.13,
                        "clustered_posting_usage": 0.06,
                        "forward_index_usage": 0.06
                    }
                }
			},
			"semantic_highlighting": {
				"semantic_highlighting_request_count": 0,
				"semantic_highlighting_batch_request_count": 0
			},
			"processors": {
				"search": {
					"neural_sparse_two_phase_executions": 0,
					"hybrid": {
						"comb_harmonic_executions": 0,
						"norm_zscore_executions": 0,
						"comb_rrf_executions": 0,
						"norm_l2_executions": 0,
						"rank_based_normalization_processor_executions": 0,
						"comb_arithmetic_executions": 0,
						"normalization_processor_executions": 0,
						"comb_geometric_executions": 0,
						"norm_minmax_executions": 0
					},
					"rerank_by_field_executions": 0,
					"neural_query_enricher_executions": 0,
					"rerank_ml_executions": 0
				},
				"ingest": {
					"skip_existing_executions": 0,
					"text_chunking_fixed_token_length_executions": 0,
					"sparse_encoding_executions": 0,
					"text_chunking_fixed_char_length_executions": 0,
					"text_chunking_executions": 0,
					"text_embedding_executions": 0,
					"semantic_field_executions": 0,
					"semantic_field_chunking_executions": 0,
					"text_chunking_delimiter_executions": 0,
					"text_image_embedding_executions": 0
				}
			}
		}
	}
}

If include_metadata is set to true, each statistic value in the response is replaced by an object that contains additional metadata:

{
    ...,
    "text_embedding_executions": {
      "value": 0,
      "stat_type": "timestamped_event_counter",
      "trailing_interval_value": 0,
      "minutes_since_last_event": 29061801
    },
    ...
}

For more information, see Available metadata.
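
For example, the following request returns only the text_embedding_executions statistic along with its metadata:

GET /_plugins/_neural/stats/text_embedding_executions?include_metadata=true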

Response body fields

The following sections describe response body fields.

Categories of statistics

The following table lists all categories of statistics.

Category Data type Description
info Object Contains cluster-wide information and statistics that are not specific to individual nodes.
all_nodes Object Provides aggregated statistics across all nodes in the cluster.
nodes Object Contains node-specific statistics, with each node identified by its unique node ID.

Available statistics

The following table lists the available statistics. For statistics with paths prefixed with nodes.<node_id>, aggregate cluster-level statistics are also available at the same path prefixed with all_nodes.

Statistic name Category Statistic path within category Description
cluster_version info cluster_version The version of the cluster.

Info statistics: Processors

Statistic name Category Statistic path within category Description
text_embedding_processors_in_pipelines info processors.ingest.text_embedding_processors_in_pipelines The number of text_embedding processors in ingest pipelines.
sparse_encoding_processors info processors.ingest.sparse_encoding_processors The number of sparse_encoding processors in ingest pipelines.
skip_existing_processors info processors.ingest.skip_existing_processors The number of processors with skip_existing set to true in ingest pipelines.
text_image_embedding_processors info processors.ingest.text_image_embedding_processors The number of text_image_embedding processors in ingest pipelines.
text_chunking_delimiter_processors info processors.ingest.text_chunking_delimiter_processors The number of text_chunking processors using the delimiter algorithm in ingest pipelines.
text_chunking_fixed_token_length_processors info processors.ingest.text_chunking_fixed_token_length_processors The number of text_chunking processors using the fixed_token_length algorithm in ingest pipelines.
text_chunking_fixed_char_length_processors info processors.ingest.text_chunking_fixed_char_length_processors The number of text_chunking processors using the fixed_character_length algorithm in ingest pipelines.
text_chunking_processors info processors.ingest.text_chunking_processors The number of text_chunking processors in ingest pipelines.
rerank_ml_processors info processors.search.rerank_ml_processors The number of rerank processors of the ml_opensearch type in search pipelines.
rerank_by_field_processors info processors.search.rerank_by_field_processors The number of rerank processors of the by_field type.
neural_sparse_two_phase_processors info processors.search.neural_sparse_two_phase_processors The number of neural_sparse_two_phase_processor processors in search pipelines.
neural_query_enricher_processors info processors.search.neural_query_enricher_processors The number of neural_query_enricher processors in search pipelines.

Info statistics: Hybrid processors

Statistic name Category Statistic path within category Description
normalization_processors info processors.search.hybrid.normalization_processors The number of normalization-processor processors.
norm_minmax_processors info processors.search.hybrid.norm_minmax_processors The number of normalization-processor processors with normalization.technique set to min_max.
norm_l2_processors info processors.search.hybrid.norm_l2_processors The number of normalization-processor processors with normalization.technique set to l2.
norm_zscore_processors info processors.search.hybrid.norm_zscore_processors The number of normalization-processor processors with normalization.technique set to z_score.
comb_arithmetic_processors info processors.search.hybrid.comb_arithmetic_processors The number of normalization-processor processors with combination.technique set to arithmetic_mean.
comb_geometric_processors info processors.search.hybrid.comb_geometric_processors The number of normalization-processor processors with combination.technique set to geometric_mean.
comb_harmonic_processors info processors.search.hybrid.comb_harmonic_processors The number of normalization-processor processors with combination.technique set to harmonic_mean.
rank_based_normalization_processors info processors.search.hybrid.rank_based_normalization_processors The number of score-ranker-processor processors.
comb_rrf_processors info processors.search.hybrid.comb_rrf_processors The number of score-ranker-processor processors with combination.technique set to rrf.

Node-level statistics: Processors

Statistic name Category Statistic path within category Description
text_embedding_executions nodes, all_nodes processors.ingest.text_embedding_executions The number of text_embedding processor executions.
skip_existing_executions nodes, all_nodes processors.ingest.skip_existing_executions The number of processor executions that have skip_existing set to true.
text_chunking_fixed_token_length_executions nodes, all_nodes processors.ingest.text_chunking_fixed_token_length_executions The number of text_chunking processor executions with the fixed_token_length algorithm.
sparse_encoding_executions nodes, all_nodes processors.ingest.sparse_encoding_executions The number of sparse_encoding processor executions.
text_chunking_fixed_char_length_executions nodes, all_nodes processors.ingest.text_chunking_fixed_char_length_executions The number of text_chunking processor executions with the fixed_character_length algorithm.
text_chunking_executions nodes, all_nodes processors.ingest.text_chunking_executions The number of text_chunking processor executions.
semantic_field_executions nodes, all_nodes processors.ingest.semantic_field_executions The number of semantic field system processor executions.
semantic_field_chunking_executions nodes, all_nodes processors.ingest.semantic_field_chunking_executions The number of semantic field system chunking processor executions.
text_chunking_delimiter_executions nodes, all_nodes processors.ingest.text_chunking_delimiter_executions The number of text_chunking processor executions with the delimiter algorithm.
text_image_embedding_executions nodes, all_nodes processors.ingest.text_image_embedding_executions The number of text_image_embedding processor executions.
neural_sparse_two_phase_executions nodes, all_nodes processors.search.neural_sparse_two_phase_executions The number of neural_sparse_two_phase_processor processor executions.
rerank_by_field_executions nodes, all_nodes processors.search.rerank_by_field_executions The number of rerank processor executions of the by_field type.
neural_query_enricher_executions nodes, all_nodes processors.search.neural_query_enricher_executions The number of neural_query_enricher processor executions.
rerank_ml_executions nodes, all_nodes processors.search.rerank_ml_executions The number of rerank processor executions of the ml_opensearch type.

Node-level statistics: Hybrid processors

Statistic name Category Statistic path within category Description
normalization_processor_executions nodes, all_nodes processors.search.hybrid.normalization_processor_executions The number of normalization-processor processor executions.
rank_based_normalization_processor_executions nodes, all_nodes processors.search.hybrid.rank_based_normalization_processor_executions The number of score-ranker-processor processor executions.
comb_harmonic_executions nodes, all_nodes processors.search.hybrid.comb_harmonic_executions The number of normalization-processor processor executions with combination.technique set to harmonic_mean.
norm_zscore_executions nodes, all_nodes processors.search.hybrid.norm_zscore_executions The number of normalization-processor processor executions with normalization.technique set to z_score.
comb_rrf_executions nodes, all_nodes processors.search.hybrid.comb_rrf_executions The number of score-ranker-processor processor executions with combination.technique set to rrf.
norm_l2_executions nodes, all_nodes processors.search.hybrid.norm_l2_executions The number of normalization-processor processor executions with normalization.technique set to l2.
comb_arithmetic_executions nodes, all_nodes processors.search.hybrid.comb_arithmetic_executions The number of normalization-processor processor executions with combination.technique set to arithmetic_mean.
comb_geometric_executions nodes, all_nodes processors.search.hybrid.comb_geometric_executions The number of normalization-processor processor executions with combination.technique set to geometric_mean.
norm_minmax_executions nodes, all_nodes processors.search.hybrid.norm_minmax_executions The number of normalization-processor processor executions with normalization.technique set to min_max.

Node-level statistics: Query

Statistic name Category Statistic path within category Description
hybrid_query_with_pagination_requests nodes, all_nodes query.hybrid.hybrid_query_with_pagination_requests The number of hybrid query requests with pagination.
hybrid_query_with_filter_requests nodes, all_nodes query.hybrid.hybrid_query_with_filter_requests The number of hybrid query requests with filters.
hybrid_query_with_inner_hits_requests nodes, all_nodes query.hybrid.hybrid_query_with_inner_hits_requests The number of hybrid query requests with inner hits.
hybrid_query_requests nodes, all_nodes query.hybrid.hybrid_query_requests The total number of hybrid query requests.
neural_query_against_semantic_sparse_requests nodes, all_nodes query.neural.neural_query_against_semantic_sparse_requests The number of neural query requests against semantic sparse fields.
neural_query_requests nodes, all_nodes query.neural.neural_query_requests The total number of neural query requests.
neural_query_against_semantic_dense_requests nodes, all_nodes query.neural.neural_query_against_semantic_dense_requests The number of neural query requests against semantic dense fields.
neural_query_against_knn_requests nodes, all_nodes query.neural.neural_query_against_knn_requests The number of neural query requests against k-NN fields.
neural_sparse_query_requests nodes, all_nodes query.neural_sparse.neural_sparse_query_requests The number of neural_sparse query requests against rank_features fields (traditional neural sparse search).
seismic_query_requests nodes, all_nodes query.neural_sparse.seismic_query_requests The number of neural_sparse query requests against sparse_vector fields (neural sparse approximate nearest neighbor (ANN) search using the SEISMIC algorithm).

Node-level statistics: Memory

Statistic name Category Statistic path within category Description
sparse_memory_usage_percentage nodes memory.sparse.sparse_memory_usage_percentage The percentage of JVM heap memory used to store sparse data on the node relative to the maximum JVM memory.
sparse_memory_usage nodes, all_nodes memory.sparse.sparse_memory_usage The amount of JVM heap memory used to store sparse data on the node, in kilobytes.
clustered_posting_usage nodes, all_nodes memory.sparse.clustered_posting_usage The amount of JVM heap memory used to store clustered posting on the node, in kilobytes.
forward_index_usage nodes, all_nodes memory.sparse.forward_index_usage The amount of JVM heap memory used to store the forward index on the node, in kilobytes.

Node-level statistics: Semantic highlighting

Statistic name Category Statistic path within category Description
semantic_highlighting_request_count nodes, all_nodes semantic_highlighting.semantic_highlighting_request_count The number of single inference semantic highlighting requests (one inference call per document). See Single inference mode.
semantic_highlighting_batch_request_count nodes, all_nodes semantic_highlighting.semantic_highlighting_batch_request_count The number of batch inference semantic highlighting requests (multiple documents processed in a single inference call). See Batch inference mode.

Available metadata

When include_metadata is true, the field values in the response are replaced by their respective metadata objects, which include additional information about the statistic types, as described in the following table.

Statistic type Description
info_string A basic string value that provides informational content, such as versions or names. See info_string.
info_counter A numerical counter that represents static or slowly changing values. See info_counter.
timestamped_event_counter A counter that tracks events over time, including information about recent activity. See timestamped_event_counter.

The info_string object contains the following metadata fields.

Metadata field Data type Description
value String The actual string value of the statistic.
stat_type String Always set to info_string.

The info_counter object contains the following metadata fields.

Metadata field Data type Description
value Integer The current count value.
stat_type String Always set to info_counter.

The timestamped_event_counter object contains the following metadata fields.

Metadata field Data type Description
value Integer The total number of events that occurred since the node started.
stat_type String Always set to timestamped_event_counter.
trailing_interval_value Integer The number of events that occurred in the past 5 minutes.
minutes_since_last_event Integer The amount of time (in minutes) since the last recorded event.
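
For example, assuming that cluster_version is reported as an info_string and that processor counts are reported as info_counter values, the corresponding metadata objects would look similar to the following illustrative fragment:

{
    ...,
    "cluster_version": {
      "value": "3.1.0",
      "stat_type": "info_string"
    },
    "text_embedding_processors_in_pipelines": {
      "value": 0,
      "stat_type": "info_counter"
    },
    ...
}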

Warm up

Introduced 3.3

Sparse indexes support neural sparse ANN search. To maximize search efficiency, OpenSearch caches sparse data in JVM memory.

To avoid high latency during initial searches, you can run random queries during a warmup period so that the relevant sparse data is loaded into JVM memory before you start production workloads. However, this approach is indirect and requires additional effort.

As an alternative, you can use the warm up API operation to avoid latency during initial searches. This operation loads all sparse data for the primary and replica shards of the specified indexes into JVM memory. The warm up API operation is idempotent: if a segment’s sparse data is already loaded into memory, this operation has no effect. It only loads files not currently stored in memory.

This API operation only works with sparse indexes (indexes created with index.sparse set to true).
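
For reference, the following is a minimal sketch of creating a sparse index. It assumes that index.sparse is specified as an index setting at creation time and uses a hypothetical index name; a complete neural sparse ANN setup also requires appropriate sparse_vector field mappings, which are omitted here:

PUT /my-sparse-index
{
  "settings": {
    "index.sparse": true
  }
}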

Endpoints

POST /_plugins/_neural/warmup/<index>

Path parameters

The following table lists the available path parameters.

Parameter Data type Description
<index> String An index name or names (comma-separated) to warm up. Supports wildcards (*). Required.

Example request

The following request performs a warm up operation on three indexes:

POST /_plugins/_neural/warmup/index1,index2,index3

You can also use the warm up API operation with index patterns to warm up all indexes that match a specified pattern:

POST /_plugins/_neural/warmup/index*

Example response

The API call returns results only after the warm up operation finishes or the request times out:

{
  "_shards" : {
    "total" : 6,
    "successful" : 6,
    "failed" : 0
  }
}

If the request times out, the operation continues running on the cluster.

To monitor the warm up operation, use the OpenSearch Tasks API:

GET /_tasks

After the operation has finished, use the Neural Search Stats API to check the updated memory usage.
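
For example, the following request returns only the sparse memory statistics:

GET /_plugins/_neural/stats/sparse_memory_usage,clustered_posting_usage,forward_index_usage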

Response body fields

The following table lists all response body fields.

Field Data type Description
_shards.total Integer The total number of shards that OpenSearch attempted to warm up.
_shards.successful Integer The number of shards that were successfully warmed up.
_shards.failed Integer The number of shards that failed to warm up.

Best practices

To ensure that the warm up operation works properly, follow these best practices:

  • Avoid running merge operations on indexes you plan to warm up: During a merge operation, OpenSearch creates new segments and may delete old ones. For example, if the warm up API operation loads the sparse data of segments A and B into JVM memory, but a merge then creates a new segment C from A and B, the data for A and B is removed from memory while C is not yet loaded. In this case, the initial loading delay for segment C still occurs.

  • Verify that all sparse indexes you plan to warm up can fit into JVM memory. For more information about memory limits, see neural_search.circuit_breaker.limit.

Clear cache

Introduced 3.3

During neural sparse ANN search or warm up operations, sparse data is loaded into JVM memory. Deleting an index removes its cached sparse data, but decreasing the neural search circuit breaker limit does not immediately evict data that is already cached.

To manually clear cached data, use the clear cache API operation. This operation removes all in-memory sparse data for the primary and replica shards of the indexes specified in the request.

Similar to the warm up operation, the clear cache operation is idempotent: if you attempt to clear the cache for an index that has already been evicted, the operation has no additional effect.

This API operation only works with sparse indexes (indexes created with index.sparse set to true).

Endpoints

POST /_plugins/_neural/clear_cache/<index>

Path parameters

The following table lists the available path parameters.

Parameter Data type Description
<index> String An index name or names (comma-separated) for which to clear the cache. Supports wildcards (*). Required.

Example request

The following request clears the sparse data of three specified indexes from JVM memory:

POST /_plugins/_neural/clear_cache/index1,index2,index3

You can also use index patterns to clear one or more indexes that match a pattern:

POST /_plugins/_neural/clear_cache/index*

Example response

The API call returns results only after the clear cache operation finishes or the request times out:

{
  "_shards" : {
    "total" : 6,
    "successful" : 6,
    "failed" : 0
  }
}

If the request times out, the operation continues running in the cluster.

To monitor the progress of the clear cache operation, use the OpenSearch Tasks API:

GET /_tasks

After the operation finishes, use the Neural Search Stats API to check the updated memory usage.

Response body fields

The following table lists all response body fields.

Field Data type Description
_shards.total Integer The total number of shards for which OpenSearch attempted to clear the cache.
_shards.successful Integer The number of shards for which the cache was successfully cleared.
_shards.failed Integer The number of shards for which the cache could not be cleared.