Sparse vector
Introduced 3.3
The sparse_vector field supports neural sparse approximate nearest neighbor (ANN) search, which improves search efficiency while preserving relevance. A sparse_vector is stored as a map, in which each key represents a token and each value is a positive float indicating the token's weight.
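For example, a stored sparse vector might look like the following map (the token IDs and weights shown here are illustrative):

{
  "1055": 5.5,
  "2931": 1.2,
  "7749": 0.3
}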
Parameters
The sparse_vector field type supports the following parameters.
| Parameter | Type | Required | Description | Default | Range |
|---|---|---|---|---|---|
| name | String | Yes | The neural sparse ANN search algorithm. Valid value is seismic. | - | - |
| n_postings | Integer | No | The maximum number of documents to retain in each posting list. | 0.0005 * doc_count¹ | (0, ∞) |
| cluster_ratio | Float | No | The fraction of documents in each posting list used to determine the cluster count. | 0.1 | (0, 1) |
| summary_prune_ratio | Float | No | The fraction of the total token weight to retain when pruning cluster summary vectors. For example, if summary_prune_ratio is set to 0.5, the tokens contributing to the top 50% of the total weight are kept. Thus, for a cluster summary {"100": 1, "200": 2, "300": 3, "400": 6}, the pruned summary is {"400": 6}. | 0.4 | (0, 1] |
| approximate_threshold | Integer | No | The minimum number of documents in a segment required to activate neural sparse ANN search. | 1000000 | [0, ∞) |
| quantization_ceiling_search | Float | No | The maximum token weight used for quantization during search. | 16 | (0, ∞) |
| quantization_ceiling_ingest | Float | No | The maximum token weight used for quantization during ingestion. | 3 | (0, ∞) |
¹doc_count represents the number of documents within the segment. For example, a segment containing 1,000,000 documents has a default n_postings of 0.0005 * 1,000,000 = 500.
For parameter configuration, see Neural sparse ANN search.
To increase search efficiency and reduce memory consumption, the sparse_vector field automatically quantizes token weights. You can adjust the quantization_ceiling_search and quantization_ceiling_ingest parameters to match different token weight distributions. For doc-only queries, we recommend keeping quantization_ceiling_search at the default value (16). For bi-encoder queries, we recommend setting quantization_ceiling_search to 3. For more information about the doc-only and bi-encoder query modes, see Generating sparse vector embeddings automatically.
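For example, the following mapping sketch configures a sparse_vector field for bi-encoder queries by lowering quantization_ceiling_search to 3. The index and field names are illustrative, and the ceilings are assumed to be set under method.parameters alongside the other algorithm parameters:

PUT sparse-vector-bi-encoder-index
{
  "settings": {
    "index": {
      "sparse": true
    }
  },
  "mappings": {
    "properties": {
      "sparse_embedding": {
        "type": "sparse_vector",
        "method": {
          "name": "seismic",
          "parameters": {
            "quantization_ceiling_search": 3,
            "quantization_ceiling_ingest": 3
          }
        }
      }
    }
  }
}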
Example
The following example demonstrates how to use the sparse_vector field type.
Step 1: Create an index
Create a sparse index by setting index.sparse to true and defining a sparse_vector field in the index mapping:
PUT sparse-vector-index
{
  "settings": {
    "index": {
      "sparse": true
    }
  },
  "mappings": {
    "properties": {
      "sparse_embedding": {
        "type": "sparse_vector",
        "method": {
          "name": "seismic",
          "parameters": {
            "n_postings": 300,
            "cluster_ratio": 0.1,
            "summary_prune_ratio": 0.4,
            "approximate_threshold": 1000000
          }
        }
      }
    }
  }
}
Step 2: Ingest data into the index
Ingest three documents containing sparse_vector fields into your index:
PUT sparse-vector-index/_doc/1
{
"sparse_embedding" : {
"1000": 0.1
}
}
PUT sparse-vector-index/_doc/2
{
"sparse_embedding" : {
"2000": 0.2
}
}
PUT sparse-vector-index/_doc/3
{
"sparse_embedding" : {
"3000": 0.3
}
}
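You can also ingest multiple documents in a single request using the _bulk API. The following is a minimal sketch; the document IDs, tokens, and weights are illustrative:

POST _bulk
{ "index": { "_index": "sparse-vector-index", "_id": "4" } }
{ "sparse_embedding": { "4000": 0.4 } }
{ "index": { "_index": "sparse-vector-index", "_id": "5" } }
{ "sparse_embedding": { "5000": 0.5 } }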
Step 3: Search the index
You can query the sparse index using a neural_sparse query, providing either a raw sparse vector or natural language text.
Query using a raw vector
To query using a raw vector, provide the query_tokens parameter. The optional method_parameters object configures the neural sparse ANN search; for details about k, top_n, and heap_factor, see Neural sparse ANN search:
GET sparse-vector-index/_search
{
"query": {
"neural_sparse": {
"sparse_embedding": {
"query_tokens": {
"1055": 5.5
},
"method_parameters": {
"heap_factor": 1.0,
"top_n": 10,
"k": 10
}
}
}
}
}
Query using natural language
To query using natural language, provide the query_text and model_id parameters:
GET sparse-vector-index/_search
{
"query": {
"neural_sparse": {
"sparse_embedding": {
"query_text": "<input text>",
"model_id": "<model ID>",
"method_parameters": {
"k": 10,
"top_n": 10,
"heap_factor": 1.0
}
}
}
}
}