You're viewing version 2.19 of the OpenSearch documentation. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Hybrid search with post-filtering

Introduced 2.13

You can perform post-filtering on hybrid search results by providing the post_filter parameter in your query.

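The post_filter clause is applied after the search results have been retrieved. Post-filtering is useful for narrowing the search results without changing the raw scores that each subquery computes.

Post-filtering does not impact the raw relevance scores computed by the subqueries or aggregation results. In a hybrid query, however, scores are normalized after the post-filter is applied, so the final normalized and combined scores can differ from those of the unfiltered query. For details, see the section on how post-filtering affects search results and scoring.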

Example

The following example request combines two query clauses (a match query and a term query) and contains a post_filter:

GET /my-nlp-index/_search?search_pipeline=nlp-search-pipeline
{
  "query": {
    "hybrid":{
      "queries":[
        {
          "match":{
            "passage_text": "hello"
          }
        },
        {
          "term":{
            "passage_text":{
              "value":"planet"
            }
          }
        }
      ]
    }
  },
  "post_filter":{
    "match": { "passage_text": "world" }
  }
}
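
The request body above can also be assembled programmatically. The following is a minimal sketch, not an official client API: the helper name build_hybrid_request is hypothetical, and the code only builds and prints the JSON body. It illustrates that post_filter is a top-level key next to query rather than part of the hybrid clause.

```python
import json


def build_hybrid_request(subqueries, post_filter=None):
    """Build a hybrid search body (hypothetical helper, for illustration only)."""
    body = {"query": {"hybrid": {"queries": list(subqueries)}}}
    if post_filter is not None:
        # post_filter is a sibling of "query", not a child of "hybrid".
        body["post_filter"] = post_filter
    return body


body = build_hybrid_request(
    subqueries=[
        {"match": {"passage_text": "hello"}},
        {"term": {"passage_text": {"value": "planet"}}},
    ],
    post_filter={"match": {"passage_text": "world"}},
)
print(json.dumps(body, indent=2))
```

The printed body matches the example request and can be sent to the _search endpoint together with the search_pipeline query parameter shown above.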

Compare the results to the results in the example without post-filtering. In the example without post-filtering, the response contains two documents. In this example, the response contains one document because the second document is filtered out:

{
  "took": 18,
  "timed_out": false,
  "_shards": {
    "total": 2,
    "successful": 2,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 0.3,
    "hits": [
      {
        "_index": "my-nlp-index",
        "_id": "1",
        "_score": 0.3,
        "_source": {
          "id": "s1",
          "passage_text": "Hello world"
        }
      }
    ]
  }
}

How post-filtering affects search results and scoring

Post-filtering can significantly change the final search results and document scores. Consider the following scenarios.

Single-query scenario

Consider a query that returns the following results:

  • Query results before normalization: [d2: 5.0, d4: 3.0, d1: 2.0]
  • Normalized scores: [d2: 1.0, d4: 0.33, d1: 0.0]

After applying a post-filter to the initial query results, the results are as follows:

  • Post-filter matches [d2, d4]
  • Resulting scores: [d2: 1.0, d4: 0.0]

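Note how document d4’s score changes from 0.33 to 0.0 after applying the post-filter. The scores shown are consistent with min-max normalization: once d1 is filtered out, d4 becomes the lowest-scoring remaining document and is normalized to 0.0.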
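
The single-query scenario can be reproduced with a short simulation. This is a sketch under one assumption: scores are min-max normalized, which is what the numbers above are consistent with. The function names are hypothetical, and the code does not call OpenSearch.

```python
def min_max_normalize(scores):
    """Min-max normalize a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # Assumption: identical scores (or a single document) normalize to 1.0.
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def apply_post_filter(scores, allowed):
    """Keep only the documents that the post-filter matches."""
    return {doc: s for doc, s in scores.items() if doc in allowed}


raw = {"d2": 5.0, "d4": 3.0, "d1": 2.0}
before = min_max_normalize(raw)
after = min_max_normalize(apply_post_filter(raw, {"d2", "d4"}))

print({d: round(s, 2) for d, s in before.items()})  # {'d2': 1.0, 'd4': 0.33, 'd1': 0.0}
print({d: round(s, 2) for d, s in after.items()})   # {'d2': 1.0, 'd4': 0.0}
```

Because the minimum of the remaining raw scores rises from 2.0 to 3.0 once d1 is removed, d4 moves to the bottom of the normalized range.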

Multiple-query scenario

Consider a query with two subqueries:

  • Query 1 results: [d2: 5.0, d4: 3.0, d1: 2.0]
  • Query 2 results: [d1: 1.0, d5: 0.5, d4: 0.25]
  • Normalized scores:
    • Query 1: [d2: 1.0, d4: 0.33, d1: 0.0]
    • Query 2: [d1: 1.0, d5: 0.33, d4: 0.0]
  • Combined initial scores: [d2: 1.0, d1: 0.5, d5: 0.33, d4: 0.165]

After applying a post-filter to the initial query results, the results are as follows:

  • Post-filter matches [d2, d4]
  • Resulting scores:
    • Query 1: [d2: 5.0, d4: 3.0]
    • Query 2: [d4: 0.25]
  • Normalized scores:
    • Query 1: [d2: 1.0, d4: 0.0]
    • Query 2: [d4: 1.0]
  • Combined final scores: [d2: 1.0, d4: 0.5]

Observe that:

  • Document d2’s score remains unchanged.
  • Document d4’s score has changed from 0.165 to 0.5.
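
The multiple-query scenario can be checked the same way. The sketch below rests on two assumptions inferred from the numbers above, not on the OpenSearch implementation: scores are min-max normalized per subquery, and a document's combined score is the mean of its normalized scores over the subqueries that returned it (which is why d2 keeps a score of 1.0 even though only Query 1 returns it). All function names are hypothetical.

```python
def min_max(scores):
    """Min-max normalize {doc_id: score}; equal scores or a single document map to 1.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    return {d: 1.0 if hi == lo else (s - lo) / (hi - lo) for d, s in scores.items()}


def combine(normalized_lists):
    """Average each document's normalized scores over the subqueries that returned it."""
    per_doc = {}
    for scores in normalized_lists:
        for doc, s in scores.items():
            per_doc.setdefault(doc, []).append(s)
    return {doc: sum(v) / len(v) for doc, v in per_doc.items()}


def hybrid_scores(subquery_results, allowed=None):
    """Optionally post-filter each subquery's results, then normalize and combine."""
    if allowed is not None:
        subquery_results = [
            {d: s for d, s in r.items() if d in allowed} for r in subquery_results
        ]
    combined = combine([min_max(r) for r in subquery_results])
    # Sort by combined score, highest first.
    return dict(sorted(combined.items(), key=lambda kv: kv[1], reverse=True))


q1 = {"d2": 5.0, "d4": 3.0, "d1": 2.0}
q2 = {"d1": 1.0, "d5": 0.5, "d4": 0.25}

initial = hybrid_scores([q1, q2])
final = hybrid_scores([q1, q2], allowed={"d2", "d4"})

print({d: round(s, 3) for d, s in initial.items()})
print({d: round(s, 3) for d, s in final.items()})  # {'d2': 1.0, 'd4': 0.5}
```

Without the filter, the simulation gives d4 a combined score of about 0.167 rather than 0.165, because the text rounds 0.33 before averaging. With the filter, d4 rises to 0.5 and d2 stays at 1.0, matching the combined final scores above.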