Explain API

Introduced 1.0

The Explain API returns detailed information about why a specific document matches or does not match a query. This API helps you understand how OpenSearch calculates the relevance score (_score) for each search result, making it an essential tool for debugging search relevance issues and optimizing queries.

OpenSearch uses a probabilistic ranking framework called Okapi BM25 to calculate relevance scores. Okapi BM25 is based on the original term frequency/inverse document frequency (TF/IDF) framework used by Apache Lucene.

Using the Explain API is expensive in terms of both resources and time. For production clusters, we recommend using it sparingly for the purpose of troubleshooting.

Limitations

The Explain API does not support the search_pipeline parameter. Sending a request with an inline or named search pipeline results in a parsing_exception error. This applies to all query types.

To use search pipeline processors with explain, provide the explain=true query parameter to the Search API. For more information about using explain with hybrid queries, see Hybrid search explain.
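
As a rough sketch of the workaround described above, the following Python snippet builds the arguments for a Search API call that combines explain=true with a named search pipeline. The helper name build_search_explain_args and the pipeline name my_pipeline are hypothetical, and the commented-out call assumes an initialized opensearch-py client and a running cluster.

```python
def build_search_explain_args(index, query, pipeline):
    """Return keyword arguments for a Search API request with explain enabled.

    `pipeline` is the name of an existing search pipeline (hypothetical here).
    """
    return {
        "index": index,
        "body": {"query": query},
        # Both values are sent as query parameters of the Search API,
        # not of the Explain API.
        "params": {"explain": "true", "search_pipeline": pipeline},
    }


args = build_search_explain_args(
    "products", {"match": {"name": "computer"}}, "my_pipeline"
)
print(args["params"])

# With a running cluster and an opensearch-py client (assumed):
# response = client.search(**args)
# Each hit in response["hits"]["hits"] would then carry an "_explanation" field.
```

Unlike the Explain API, this approach explains every returned hit rather than one specific document, so keep the result size small when troubleshooting.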

Endpoints

GET  /{index}/_explain/{id}
POST /{index}/_explain/{id}

Path parameters

The following table lists the available path parameters.

Parameter	Required	Data type	Description
`id`	Required	String	The document ID.
`index`	Required	String	The index to search. Only a single index name can be specified.

Query parameters

The index and document ID are required and are specified as path parameters. All of the following query parameters are optional.

Parameter	Data type	Description	Required
`analyzer`	String	The analyzer to use for the `q` query string. Only valid when `q` is used.	No
`analyze_wildcard`	Boolean	Whether to analyze wildcard and prefix queries in the `q` string. Only valid when `q` is used. Default is `false`.	No
`default_operator`	String	The default Boolean operator (`AND` or `OR`) for the `q` query string. Only valid when `q` is used. Default is `OR`.	No
`df`	String	The default field to search if no field is specified in the `q` string. Only valid when `q` is used.	No
`lenient`	Boolean	Specifies whether OpenSearch should ignore format-based query failures (for example, querying a text field for an integer). Default is `false`.	No
`preference`	String	Specifies a preference of which shard to retrieve results from. Available options are `_local`, which tells the operation to retrieve results from a locally allocated shard replica, and a custom string value assigned to a specific shard replica. By default, OpenSearch executes the explain operation on random shards.	No
`q`	String	A query string in Lucene syntax. When used, you can configure query behavior using the `analyzer`, `analyze_wildcard`, `default_operator`, and `df` parameters.	No
`stored_fields`	String	A comma-separated list of stored fields to return. If omitted, only `_source` is returned.	No
`routing`	String	A value used to route the operation to a specific shard.	No
`_source`	String	Whether to include the `_source` field in the response body. Valid values are `true` (include full `_source`), `false` (exclude `_source`), or a comma-separated list of source fields to include in the query response. By default, when not specified, the `_source` field is not returned in the Explain API response.	No
`_source_excludes`	String	A comma-separated list of source fields to exclude in the query response.	No
`_source_includes`	String	A comma-separated list of source fields to include in the query response.	No

Request body fields

The request body contains the query to explain against the specified document. The following table lists the available request body fields.

Field	Data type	Description
`query`	Object	The query to execute against the document. Uses the same query syntax as the Search API. For more information about query types, see Query DSL.

Example: Explaining a match query

The following example explains why document 1 in the products index matches a query for the term “computer” in the name field:

POST /products/_explain/1
{
  "query": {
    "match": {
      "name": "computer"
    }
  }
}

# Assumes `client` is an initialized opensearch-py OpenSearch client.
response = client.explain(
  id="1",
  index="products",
  body={
    "query": {
      "match": {
        "name": "computer"
      }
    }
  }
)

Example response

The Explain API returns a response that includes whether the document matched the query and a detailed breakdown of the score calculation. The explanation includes the BM25 scoring formula components: term frequency (tf), inverse document frequency (idf), and field length normalization:


{
  "_index": "products",
  "_id": "1",
  "matched": true,
  "explanation": {
    "value": 0.31506687,
    "description": "weight(name:computer in 0) [PerFieldSimilarity], result of:",
    "details": [
      {
        "value": 0.31506687,
        "description": "score(freq=1.0), computed as boost * idf * tf from:",
        "details": [
          {
            "value": 0.6931472,
            "description": "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
            "details": [
              {
                "value": 2,
                "description": "n, number of documents containing term",
                "details": []
              },
              {
                "value": 4,
                "description": "N, total number of documents with field",
                "details": []
              }
            ]
          },
          {
            "value": 0.45454544,
            "description": "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
            "details": [
              {
                "value": 1.0,
                "description": "freq, occurrences of term within document",
                "details": []
              },
              {
                "value": 1.2,
                "description": "k1, term saturation parameter",
                "details": []
              },
              {
                "value": 0.75,
                "description": "b, length normalization parameter",
                "details": []
              },
              {
                "value": 2.0,
                "description": "dl, length of field",
                "details": []
              },
              {
                "value": 2.0,
                "description": "avgdl, average length of field",
                "details": []
              }
            ]
          }
        ]
      }
    ]
  }
}
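
To make the breakdown concrete, the following Python sketch recomputes the score by hand from the leaf values in the example response. The boost of 1.0 is an assumption, because the response does not list a boost value. Small differences in the last digits are expected because this sketch uses double-precision arithmetic.

```python
import math

# Leaf values copied from the example response.
n, N = 2, 4            # documents containing the term / documents with the field
freq, k1, b = 1.0, 1.2, 0.75
dl, avgdl = 2.0, 2.0
boost = 1.0            # assumed: no boost is listed in the response

idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))
score = boost * idf * tf

print(round(idf, 4), round(tf, 4), round(score, 4))
# 0.6931 0.4545 0.3151
```

The results agree with the `idf`, `tf`, and top-level `value` fields in the response to about seven decimal places.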

Example: Using the query string parameter

You can use the q parameter to specify a query using Lucene query string syntax instead of providing a request body. The following example explains why document 1 matches the query string “name:laptop”:

GET /products/_explain/1?q=name:laptop

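# Assumes `client` is an initialized opensearch-py OpenSearch client.
# No request body is needed because the query is passed in the `q` parameter.
response = client.explain(
  id="1",
  index="products",
  params={"q": "name:laptop"}
)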
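
If you send the request without a client library, the query string must be URL encoded. The following sketch uses only the Python standard library to build the request path. The helper name explain_path is hypothetical, and default_operator is added only to show how a second query parameter is appended.

```python
from urllib.parse import quote, urlencode


def explain_path(index, doc_id, **query_params):
    """Build the request path for GET /{index}/_explain/{id} with query parameters."""
    path = f"/{quote(index, safe='')}/_explain/{quote(str(doc_id), safe='')}"
    if query_params:
        # urlencode percent-encodes reserved characters such as ":".
        path += "?" + urlencode(query_params)
    return path


print(explain_path("products", 1, q="name:laptop", default_operator="AND"))
# /products/_explain/1?q=name%3Alaptop&default_operator=AND
```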

Response body fields

The response contains detailed information about whether the document matched the query and how the relevance score was calculated. The following table lists the response body fields.

Field	Data type	Description
`_index`	String	The name of the index containing the document.
`_id`	String	The document ID.
`matched`	Boolean	Whether the document matches the query. If `true`, the document is a match and has a relevance score. If `false`, the document does not match the query.
`explanation`	Object	The explanation of the score calculation. Contains the following nested fields: `value` (the calculated score or score component), `description` (a human-readable description of the calculation), and `details` (an array of sub-explanations for nested calculations).
`explanation.value`	Float	The numeric result of the score calculation. For the top-level explanation, this is the final relevance score. For nested explanations, this represents intermediate calculation values.
`explanation.description`	String	A description of the calculation being performed. For BM25 scoring, this typically describes the scoring formula components, such as term frequency (`tf`), inverse document frequency (`idf`), and normalization factors.
`explanation.details`	Array of objects	An array of nested explanation objects that break down the score calculation into component parts. Each detail object has the same structure as the explanation object (with `value`, `description`, and `details` fields).
`get`	Object	The document metadata and source data. Only included when the `_source` or `stored_fields` parameters are used. Contains fields such as `_seq_no`, `_primary_term`, `found`, and `_source`.
`get._source`	Object	The document’s original JSON content. Only included when the `_source` parameter is specified.
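
Because every entry in `details` has the same shape as the top-level `explanation` object, the tree can be walked recursively. The following Python sketch prints one indented line per node. The function name flatten_explanation is hypothetical, and the sample object is an abbreviated version of the example response with shortened descriptions.

```python
def flatten_explanation(node, depth=0):
    """Yield (depth, value, description) for a node and all nested details."""
    yield depth, node["value"], node["description"]
    for child in node.get("details", []):
        yield from flatten_explanation(child, depth + 1)


# Abbreviated explanation object in the shape shown in the example response.
explanation = {
    "value": 0.31506687,
    "description": "score(freq=1.0), computed as boost * idf * tf from:",
    "details": [
        {"value": 0.6931472, "description": "idf", "details": []},
        {"value": 0.45454544, "description": "tf", "details": []},
    ],
}

for depth, value, description in flatten_explanation(explanation):
    print("  " * depth + f"{value}  {description}")
```

With a real response, you would pass `response["explanation"]` instead of the sample object.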

BM25 scoring components

The explanation details include the following BM25 scoring components.

Field	Description
`idf`	The inverse document frequency. Measures how rare or common a term is across all documents in the index. Calculated as `log(1 + (N - n + 0.5) / (n + 0.5))`, where `N` is the total number of documents with the field and `n` is the number of documents containing the term. Rarer terms have higher IDF values and contribute more to the relevance score.
`tf`	The term frequency. Measures how often the term appears in the document field. Calculated as `freq / (freq + k1 * (1 - b + b * dl / avgdl))`, where `freq` is the number of times the term appears, `k1` is the term saturation parameter (default 1.2), `b` is the length normalization parameter (default 0.75), `dl` is the field length, and `avgdl` is the average field length across all documents. More frequent terms contribute more to the relevance score, but with diminishing returns.
`k1`	The term saturation parameter. Controls how quickly the score increases as term frequency increases. The default value is 1.2. Lower values cause the score to saturate more quickly, while higher values allow term frequency to have a greater impact on the score.
`b`	The length normalization parameter. Controls how much field length affects the score. The default value is 0.75. A value of 0 disables length normalization, while a value of 1 fully normalizes by field length. Shorter fields with matching terms typically receive higher scores.
`dl`	The document field length. The number of tokens in the field for this specific document.
`avgdl`	The average document field length. The average number of tokens in the field across all documents in the index.
`boost`	The query boost value. A multiplier applied to the score. The default boost is 1.0 when not explicitly specified in the query.

The final relevance score is the product of these components: score = boost * idf * tf. The underlying statistics are calculated and stored at index time, when a document is added or updated, and are collected per shard, so scores may contain small inaccuracies.
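
The effect of `k1` and `b` is easier to see with numbers. The following Python sketch implements the `tf` formula from the table and evaluates it for a few inputs. The function name bm25_tf and the sample field lengths are illustrative assumptions, not values taken from a real index.

```python
def bm25_tf(freq, dl, avgdl, k1=1.2, b=0.75):
    """BM25 term-frequency component as printed in the explanation output."""
    return freq / (freq + k1 * (1 - b + b * dl / avgdl))


# Saturation: each extra occurrence adds less than the previous one.
tf_values = [bm25_tf(freq, dl=2.0, avgdl=2.0) for freq in (1, 2, 3, 4)]
gains = [later - earlier for earlier, later in zip(tf_values, tf_values[1:])]
print([round(v, 3) for v in tf_values])
# [0.455, 0.625, 0.714, 0.769]

# Length normalization: the same term frequency scores lower in a longer field.
short_field = bm25_tf(1, dl=2.0, avgdl=4.0)
long_field = bm25_tf(1, dl=8.0, avgdl=4.0)

# With b=0, field length has no effect on the result.
no_norm_short = bm25_tf(1, dl=2.0, avgdl=4.0, b=0)
no_norm_long = bm25_tf(1, dl=8.0, avgdl=4.0, b=0)
```

In this sketch, `short_field` is larger than `long_field`, while the two `b=0` results are identical.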

Required permissions

If you use the Security plugin, make sure you have the appropriate permissions: indices:data/read/explain.
