Sparse vector
Introduced 3.3
The `sparse_vector` field supports neural sparse approximate nearest neighbor (ANN) search, which improves search efficiency while preserving relevance. A `sparse_vector` is stored as a map, in which each key represents a token and each value is a positive float indicating the token’s weight.
Parameters
The `sparse_vector` field type supports the following parameters.
| Parameter | Type | Required | Description | Default | Range |
| :--- | :--- | :--- | :--- | :--- | :--- |
| `name` | String | Yes | The neural sparse ANN search algorithm. Valid value is `seismic`. | - | - |
| `n_postings` | Integer | No | The maximum number of documents to retain in each posting list. | `0.0005 * doc_count`¹ | (0, ∞) |
| `cluster_ratio` | Float | No | The fraction of documents in each posting list used to determine the cluster count. | 0.1 | (0, 1) |
| `summary_prune_ratio` | Float | No | The fraction of the total token weight to retain when pruning cluster summary vectors. For example, if `summary_prune_ratio` is set to `0.5`, tokens contributing to the top 50% of the total weight are kept. Thus, for a cluster summary `{"100": 1, "200": 2, "300": 3, "400": 6}`, the pruned summary is `{"400": 6}`. | 0.4 | (0, 1] |
| `approximate_threshold` | Integer | No | The minimum number of documents in a segment required to activate neural sparse ANN search. | 1000000 | [0, ∞) |
| `quantization_ceiling_search` | Float | No | The maximum token weight used for quantization during search. | 16 | (0, ∞) |
| `quantization_ceiling_ingest` | Float | No | The maximum token weight used for quantization during ingestion. | 3 | (0, ∞) |
¹`doc_count` represents the number of documents within the segment.
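The pruning rule behind `summary_prune_ratio` can be sketched in Python. This is an illustrative reimplementation of the documented rule (keep the highest-weight tokens until the retained weight reaches the given fraction of the total), not the engine's actual code:

```python
def prune_summary(summary: dict[str, float], prune_ratio: float) -> dict[str, float]:
    """Keep the highest-weight tokens whose cumulative weight reaches
    prune_ratio of the summary's total weight (illustrative sketch)."""
    target = prune_ratio * sum(summary.values())
    kept, cumulative = {}, 0.0
    for token, weight in sorted(summary.items(), key=lambda kv: -kv[1]):
        if cumulative >= target:
            break
        kept[token] = weight
        cumulative += weight
    return kept

# Worked example from the parameter table:
print(prune_summary({"100": 1, "200": 2, "300": 3, "400": 6}, 0.5))
# → {'400': 6}
```

With `prune_ratio` set to `1`, nothing is pruned, which is why the parameter's range is (0, 1].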
For parameter configuration, see Neural sparse ANN search.
To increase search efficiency and reduce memory consumption, the `sparse_vector` field automatically quantizes token weights. You can adjust the `quantization_ceiling_search` and `quantization_ceiling_ingest` parameters according to different token weight distributions. For doc-only queries, we recommend keeping `quantization_ceiling_search` at its default value (`16`). For bi-encoder queries, we recommend setting `quantization_ceiling_search` to `3`. For more information about doc-only and bi-encoder query modes, see Generating sparse vector embeddings automatically.
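To see why the ceiling should track the weight distribution, consider a simple linear quantization scheme. The engine's exact scheme is not specified here; this sketch assumes weights are clamped at the ceiling and mapped linearly onto 8-bit levels:

```python
def quantize(weight: float, ceiling: float, levels: int = 255) -> int:
    """Map a token weight onto an integer level in [0, levels].
    Weights at or above the ceiling saturate at the top level.
    (Assumed linear scheme, for illustration only.)"""
    clamped = min(max(weight, 0.0), ceiling)
    return round(clamped / ceiling * levels)

# Bi-encoder weights tend to be small, so a ceiling of 3 spreads them
# across the full range; under a ceiling of 16 the same weight would
# collapse into the lowest levels, losing precision.
print(quantize(1.5, ceiling=3))    # → 128
print(quantize(1.5, ceiling=16))   # → 24
print(quantize(20.0, ceiling=16))  # → 255 (saturated)
```

A ceiling that is too low saturates large weights; one that is too high wastes resolution on values that never occur.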
Example
The following example demonstrates using a `sparse_vector` field type.
Step 1: Create an index
Create a sparse index by setting `index.sparse` to `true` and defining a `sparse_vector` field in the index mapping:
```json
PUT sparse-vector-index
{
  "settings": {
    "index": {
      "sparse": true
    }
  },
  "mappings": {
    "properties": {
      "sparse_embedding": {
        "type": "sparse_vector",
        "method": {
          "name": "seismic",
          "parameters": {
            "n_postings": 300,
            "cluster_ratio": 0.1,
            "summary_prune_ratio": 0.4,
            "approximate_threshold": 1000000
          }
        }
      }
    }
  }
}
```
Step 2: Ingest data into the index
Ingest three documents containing `sparse_vector` fields into your index:
```json
PUT sparse-vector-index/_doc/1
{
  "sparse_embedding": {
    "1000": 0.1
  }
}
```

```json
PUT sparse-vector-index/_doc/2
{
  "sparse_embedding": {
    "2000": 0.2
  }
}
```

```json
PUT sparse-vector-index/_doc/3
{
  "sparse_embedding": {
    "3000": 0.3
  }
}
```
Step 3: Search the index
You can query the sparse index by providing either raw vectors or natural language using a `neural_sparse` query.
Query using a raw vector
To query using a raw vector, provide the `query_tokens` parameter:
```json
GET sparse-vector-index/_search
{
  "query": {
    "neural_sparse": {
      "sparse_embedding": {
        "query_tokens": {
          "1055": 5.5
        },
        "method_parameters": {
          "heap_factor": 1.0,
          "top_n": 10,
          "k": 10
        }
      }
    }
  }
}
```
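Conceptually, a raw-vector query ranks documents by the overlap between query tokens and document tokens. A common way to picture the underlying score is a dot product over shared token weights; this is a simplification (the ANN algorithm approximates it rather than computing it exhaustively, over quantized weights), and the document vectors below are hypothetical, adapted from the ingestion step:

```python
def sparse_dot(query: dict[str, float], doc: dict[str, float]) -> float:
    """Dot product over the tokens shared by query and document."""
    return sum((w * doc[t] for t, w in query.items() if t in doc), 0.0)

query = {"1055": 5.5}
docs = {
    "1": {"1000": 0.1},              # no shared token → score 0
    "2": {"2000": 0.2},              # no shared token → score 0
    "3": {"1055": 0.3, "3000": 0.3}, # hypothetical doc sharing token "1055"
}
for doc_id, vec in docs.items():
    print(doc_id, sparse_dot(query, vec))
```

Only documents sharing at least one token with the query can score above zero, which is why the query-side token IDs must come from the same tokenizer/model as the indexed vectors.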
Query using natural language
To query using natural language, provide the `query_text` and `model_id` parameters:
```json
GET sparse-vector-index/_search
{
  "query": {
    "neural_sparse": {
      "sparse_embedding": {
        "query_text": "<input text>",
        "model_id": "<model ID>",
        "method_parameters": {
          "k": 10,
          "top_n": 10,
          "heap_factor": 1.0
        }
      }
    }
  }
}
```