Search analyzers

Search analyzers are applied at query time to analyze the query string when you run a full-text query on a text field.

Determining which search analyzer to use

To determine which analyzer to use for a query string at query time, OpenSearch examines the following parameters in order:

  1. The analyzer parameter of the query
  2. The search_analyzer mapping parameter of the field
  3. The analysis.analyzer.default_search index setting
  4. The analyzer mapping parameter of the field
  5. The standard analyzer (default)

In most cases, specifying a search analyzer that differs from the index analyzer is unnecessary and can hurt relevance or produce unexpected matches.
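
To see why mismatched analyzers cause problems, you can compare their output using the _analyze API. For example, the english analyzer stems running to run, while the standard analyzer leaves it unchanged, so a query analyzed with standard would not match a term indexed as run (the token output described here is illustrative):

GET /_analyze
{
  "analyzer": "english",
  "text": "running"
}

Running the same request with "analyzer": "standard" returns the unstemmed token running.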

Specifying a search analyzer at query time

You can override the default analyzer behavior by explicitly setting the analyzer in the query. The following query uses the english analyzer to stem the input terms:

GET /shakespeare/_search
{
  "query": {
    "match": {
      "text_entry": {
        "query": "speak the truth",
        "analyzer": "english"
      }
    }
  }
}
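
To inspect how the english analyzer rewrites this query string, you can pass it to the _analyze API. The stopword the is typically removed and the remaining terms are stemmed (exact output can vary by version):

GET /_analyze
{
  "analyzer": "english",
  "text": "speak the truth"
}

This should return the tokens speak and truth, which are then matched against the indexed terms.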

Specifying a search analyzer in the mappings

When defining mappings, you can provide both the analyzer (used at index time) and search_analyzer (used at query time) for any text field.

The following configuration allows different tokenization strategies for indexing and querying:

PUT /testindex
{
  "mappings": {
    "properties": {
      "text_entry": {
        "type": "text",
        "analyzer": "simple",
        "search_analyzer": "whitespace"
      }
    }
  }
}
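
You can compare the two analyzers with the _analyze API. The simple analyzer splits on any non-letter character and lowercases the result, while the whitespace analyzer splits only on whitespace and preserves case. For example (illustrative output):

GET /_analyze
{
  "analyzer": "simple",
  "text": "Hello-World 42"
}

This should return the tokens hello and world (the digits are dropped). The same request with "analyzer": "whitespace" should return Hello-World and 42.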

The following configuration enables autocomplete-like behavior, where you can type the beginning of a word and still receive relevant matches:

PUT /articles
{
  "settings": {
    "analysis": {
      "analyzer": {
        "edge_ngram_analyzer": {
          "tokenizer": "edge_ngram_tokenizer",
          "filter": ["lowercase"]
        }
      },
      "tokenizer": {
        "edge_ngram_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "title": {
        "type": "text",
        "analyzer": "edge_ngram_analyzer",
        "search_analyzer": "standard"
      }
    }
  }
}

The edge_ngram_analyzer is applied at index time, breaking input strings into edge n-grams (prefixes), so the index stores fragments like “se”, “sea”, “sear”, and so on. Use the following request to index a document:

PUT /articles/_doc/1
{
  "title": "Search Analyzer in Action"
}
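
To verify which prefixes were stored for this title, you can run part of it through the index analyzer (the tokens listed are what the edge n-gram settings above should produce):

GET /articles/_analyze
{
  "analyzer": "edge_ngram_analyzer",
  "text": "Search"
}

This should return the tokens se, sea, sear, searc, and search.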

Use the following request to search for the partial word sear in the title field:

POST /articles/_search
{
  "query": {
    "match": {
      "title": "sear"
    }
  }
}

The response shows that the query term “sear” matches the document “Search Analyzer in Action” because the n-gram tokens generated at index time include that prefix. This mirrors autocomplete behavior, where typing a prefix retrieves full matches:

{
  ...
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "articles",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "title": "Search Analyzer in Action"
        }
      }
    ]
  }
}

Setting a default search analyzer for an index

Specify the analysis.analyzer.default_search setting to define a search analyzer for all fields in an index unless it is overridden at the field or query level. The following example sets simple as the default index analyzer and whitespace as the default search analyzer:

PUT /testindex
{
  "settings": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "simple"
        },
        "default_search": {
          "type": "whitespace"
        }
      }
    }
  }
}
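
Because both analyzers are registered by name in the index settings, you can compare them with the _analyze API (illustrative output):

GET /testindex/_analyze
{
  "analyzer": "default_search",
  "text": "Hello World"
}

This should return the tokens Hello and World with their original case, while the same request with "analyzer": "default" should return the lowercased tokens hello and world.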

Setting index-wide defaults like this keeps analysis consistent across multiple fields, which is especially useful when working with custom analyzers.

For more information about supported analyzers, see Analyzers.