You're viewing version 3.1 of the OpenSearch documentation. This version is no longer maintained. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Terms query

Use the terms query to search for multiple terms in the same field. For example, the following query searches for lines with the IDs 61809 and 61810:

GET shakespeare/_search
{
  "query": {
    "terms": {
      "line_id": [
        "61809",
        "61810"
      ]
    }
  }
}

A document is returned if it matches any of the terms in the array.

By default, the maximum number of terms allowed in a terms query is 65,536. To change the maximum number of terms, update the index.max_terms_count setting.

For better query performance, pass long arrays containing terms in sorted order (ordered by UTF-8 byte values, ascending).

The ability to highlight results for terms queries may not be guaranteed, depending on the highlighter type and the number of terms in the query.

Parameters

The query accepts the following parameters. All parameters are optional.

Parameter	Data type	Description
`<field>`	String	The field in which to search. A document is returned in the results only if its field value exactly matches at least one term, with the correct spacing and capitalization.
`boost`	Floating-point	A floating-point value that specifies the weight of this field toward the relevance score. Values above 1.0 increase the field’s relevance. Values between 0.0 and 1.0 decrease the field’s relevance. Default is 1.0.
`_name`	String	The name of the query for query tagging. Optional.
`value_type`	String	Specifies the types of values used for filtering. Valid values are `default` and `bitmap`. If omitted, the value defaults to `default`.

Terms lookup

Terms lookup retrieves the field values of a single document and uses them as search terms. You can use terms lookup to search for a large number of terms.

To use terms lookup, you must enable the _source mapping field because terms lookup fetches values from a document. The _source field is enabled by default.

Terms lookup tries to fetch the document field values from a shard on a local data node. Thus, using an index with a single primary shard that has full replicas on all applicable data nodes reduces network traffic.

Example

As an example, create an index that contains student data, mapping student_id as a keyword:

PUT students
{
  "mappings": {
    "properties": {
      "student_id": { "type": "keyword" }
    }
  }
}

Next, index three documents that correspond to students:

PUT students/_doc/1
{
  "name": "Jane Doe",
  "student_id" : "111"
}

PUT students/_doc/2
{
  "name": "Mary Major",
  "student_id" : "222"
}

PUT students/_doc/3
{
  "name": "John Doe",
  "student_id" : "333"
}

Create a separate index that contains class information, including the class name and an array of student IDs corresponding to the students enrolled in the class:

PUT classes/_doc/101
{
  "name": "CS101",
  "enrolled" : ["111" , "222"]
}

To search for students enrolled in the CS101 class, specify the document ID of the document that corresponds to the class, the index of that document, and the path of the field in which the terms are located:

GET students/_search
{
  "query": {
    "terms": {
      "student_id": {
        "index": "classes",
        "id": "101",
        "path": "enrolled"
      }
    }
  }
}

The response contains the documents in the students index for every student whose ID matches one of the values in the enrolled array:

{
  "took": 13,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "students",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "Jane Doe",
          "student_id": "111"
        }
      },
      {
        "_index": "students",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "Mary Major",
          "student_id": "222"
        }
      }
    ]
  }
}

Example: Nested fields

The second example demonstrates querying nested fields. Consider an index with the following document:

PUT classes/_doc/102
{
  "name": "CS102",
  "enrolled_students" : {
    "id_list" : ["111" , "333"]
  }
}

To search for students enrolled in CS102, use the dot path notation to specify the full path to the field in the path parameter:

GET students/_search
{
  "query": {
    "terms": {
      "student_id": {
        "index": "classes",
        "id": "102",
        "path": "enrolled_students.id_list"
      }
    }
  }
}

The response contains the matching documents:

{
  "took": 18,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "students",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "Jane Doe",
          "student_id": "111"
        }
      },
      {
        "_index": "students",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "John Doe",
          "student_id": "333"
        }
      }
    ]
  }
}

Parameters

The following table lists the terms lookup parameters.

Parameter	Data type	Description
`index`	String	The name of the index in which to fetch field values. Required.
`id`	String	The document ID of the document from which to fetch field values. Required.
`query`	Object	A query object used to select multiple documents from which to fetch field values. Required if `id` is not supplied.
`path`	String	The name of the field from which to fetch field values. Specify nested fields using dot path notation. Required.
`routing`	String	Custom routing value of the document from which to fetch field values. Optional. Required if a custom routing value was provided when the document was indexed.
`store`	Boolean	Whether to perform the lookup on the stored field instead of `_source`. Optional.

Terms lookup by query

Introduced 3.2

You can use a query to dynamically extract values from multiple documents and use them in a terms query. Instead of specifying a document ID, the query parameter lets you match documents and collect all values for a specified field across those matches.

This is useful when you want to search one index based on field values from documents in another index.

For a list of supported parameters, see terms lookup parameters. To use terms lookup by query, you must provide the query parameter instead of id in the terms lookup object.

How values are collected

The behavior of the terms lookup depends on how the target field appears in the matched documents:

If a document matching the query does not contain the specified field, the document is ignored for terms extraction.
If the field is a list, all its items are collected.
If the field is a scalar, its value is collected.
If the same field is a single value or a list for different documents, all values are flattened into a single list and deduplicated.
Multiple lists are flattened into a single list.
If the field is missing, null, or an empty list, it is skipped.
Duplicates from multiple documents are deduplicated.
If no documents match the query, the terms query acts as if no values were specified (typically, matches nothing).
If none of the matched documents contain the field, the query does not match anything.

Example

First, create an index named users, which contains user information:

PUT /users
{
  "mappings": {
    "properties": {
      "username": { "type": "keyword" }
    }
  }
}

Add user data to the index:

PUT users/_doc/u1
{ "username": "alice" }

PUT users/_doc/u2
{ "username": "bob" }

PUT users/_doc/u3
{ "username": "carol" }

PUT users/_doc/u4
{ "username": "dave" }

Next, create an index containing group memberships:

PUT groups
{
  "mappings": {
    "properties": {
      "group": { "type": "keyword" },
      "members": { "type": "keyword" }
    }
  }
}

Add group membership data to the index:

PUT groups/_doc/1
{
  "group": "g1",
  "members": ["alice", "bob"]
}

PUT groups/_doc/2
{
  "group": "g1",
  "members": "carol"
}

PUT groups/_doc/3
{
  "group": "g1"
}

PUT groups/_doc/4
{
  "group": "g1",
  "members": []
}

PUT groups/_doc/5
{
  "group": "g1",
  "members": null
}

PUT groups/_doc/6
{
  "group": "g2",
  "members": "carol"
}

To search the users index for all users who are members of the g1 group, use the following request:

GET /users/_search
{
  "query": {
    "terms": {
      "username": {
        "index": "groups",
        "path": "members",
        "query": {
          "term": { "group": "g1" }
        }
      }
    }
  }
}

This query collects all values from the members field of documents in groups whose group is set to g1 and uses them as terms for the username field in the users index:

{
  "hits": {
    "total": { "value": 3, "relation": "eq" },
    "hits": [
      { "_index": "users", "_id": "u1", "_score": 1.0, "_source": { "username": "alice" } },
      { "_index": "users", "_id": "u2", "_score": 1.0, "_source": { "username": "bob" } },
      { "_index": "users", "_id": "u3", "_score": 1.0, "_source": { "username": "carol" } }
    ]
  }
}

This query processes matching documents as follows:

The lookup query matches documents 1, 2, 3, 4, and 5 (all specify group g1).
Doc 6 (using a different group, g2) is ignored by the query.
The members field for each matching document is processed as follows:
- Doc 1: ["alice", "bob"] (list) → both alice and bob are collected.
- Doc 2: "carol" (scalar) → carol is collected.
- Doc 3: missing members field → ignored.
- Doc 4: empty list → ignored.
- Doc 5: null → ignored.
All collected values are flattened and deduplicated, so the final result is ["alice", "bob", "carol"].

Bitmap filtering

Introduced 2.17

The terms query can filter for multiple terms simultaneously. However, when the number of terms in the input filter increases to a large value (around 10,000), the resulting network and memory overhead can become significant, making the query inefficient. In such cases, consider encoding your large terms filter using a roaring bitmap for more efficient filtering.

The following example assumes that you have two indexes: a products index, which contains all the products sold by a company, and a customers index, which stores filters representing customers who own specific products.

First, create a products index and map product_id as an integer:

PUT /products
{
  "mappings": {
    "properties": {
      "product_id": { "type": "integer" }
    }
  }
}

Next, index three documents that correspond to products:

PUT /products/_doc/1
{
  "name": "Product 1",
  "product_id" : 111
}

PUT /products/_doc/2
{
  "name": "Product 2",
  "product_id" : 222
}

PUT /products/_doc/3
{
  "name": "Product 3",
  "product_id" : 333
}

To store customer bitmap filters, you’ll create a customer_filter binary field in the customers index. Specify store as true to store the field:

PUT /customers
{
  "mappings": {
    "properties": {
      "customer_filter": {
        "type": "binary",
        "store": true
      }
    }
  }
}

For each customer, you need to generate a bitmap that represents the product IDs of the products the customer owns. This bitmap effectively encodes the filter criteria for that customer. In this example, you’ll create a terms filter for a customer whose ID is customer123 and who owns products 111, 222, and 333.

To encode a terms filter for the customer, first create a roaring bitmap for the filter. This example creates a bitmap using the [PyRoaringBitMap] library, so first run pip install pyroaring to install the library. Then serialize the bitmap and encode it using a Base64 encoding scheme:

from pyroaring import BitMap
import base64

# Create a bitmap, serialize it into a byte string, and encode into Base64
bm = BitMap([111, 222, 333]) # product ids owned by a customer
encoded = base64.b64encode(BitMap.serialize(bm))

# Convert the Base64-encoded bytes to a string for storage or transmission
encoded_bm_str = encoded.decode('utf-8')

# Print the encoded bitmap
print(f"Encoded Bitmap: {encoded_bm_str}")

Next, index the customer filter into the customers index. The document ID for the filter is the same as the ID for the corresponding customer (in this example, customer123). The customer_filter field contains the bitmap you generated for this customer:

POST customers/_doc/customer123
{
  "customer_filter": "OjAAAAEAAAAAAAIAEAAAAG8A3gBNAQ=="
}

Now you can run a terms query on the products index to look up a specific customer in the customers index. Because you’re looking up a stored field instead of _source, set store to true. In the value_type field, specify the data type of the terms input as bitmap:

POST /products/_search
{
  "query": {
    "terms": {
      "product_id": {
        "index": "customers",
        "id": "customer123",
        "path": "customer_filter",
        "store": true               
      },
      "value_type": "bitmap"      
    }
  }
}

You can also directly pass the bitmap to the terms query. In this example, the product_id field contains the customer filter bitmap for the customer whose ID is customer123:

POST /products/_search
{
  "query": {
    "terms": {
      "product_id": [
        "OjAAAAEAAAAAAAIAEAAAAG8A3gBNAQ=="
      ],
      "value_type": "bitmap"
    }
  }
}

Parameters
Terms lookup
Terms lookup by query
- How values are collected
- Example
Bitmap filtering

WAS THIS PAGE HELPFUL?

✔ Yes ✖ No

Tell us why

350 characters left

Have a question? Ask us on the OpenSearch forum.

Want to contribute? Edit this page or create an issue.

Terms query

Parameters

Terms lookup

Example

Example: Nested fields

Parameters

Terms lookup by query

How values are collected

Example

Bitmap filtering

OpenSearch Links

Get Involved

Resources

Contact Us