Link Search Menu Expand Document Documentation Menu

Optimizing hybrid search

This is an experimental feature and is not recommended for use in a production environment. For updates on the progress of the feature or if you want to leave feedback, see the associated GitHub issue.

A key challenge of using hybrid search in OpenSearch is combining results from lexical and vector-based search effectively. OpenSearch provides different techniques and various parameters you can experiment with to find the best setup for your application. What works best, however, depends heavily on your data, user behavior, and application domain—there is no one-size-fits-all solution.

Search Relevance Workbench helps you systematically find the ideal set of parameters for your needs.

Requirements

Internally, optimizing hybrid search involves running multiple search quality evaluation experiments. For these experiments, you need a query set, judgments, and a search configuration. Search Relevance Workbench currently supports hybrid search optimization with exactly two query clauses. While hybrid search typically combines vector and lexical queries, you can run hybrid search optimization with two lexical query clauses:

PUT _plugins/_search_relevance/search_configurations
{
  "name": "hybrid_query_lexical",
  "query": "{\"query\":{\"hybrid\":{\"queries\":[{\"match\":{\"title\":\"%SearchText%\"}},{\"match\":{\"category\":\"%SearchText%\"}}]}}}",
  "index": "ecommerce"
}

Hybrid search optimization is most valuable when combining lexical and vector-based search results. For optimal results, configure your hybrid search query with two clauses: one textual query clause and one neural query clause. You don’t need to configure the search pipeline to combine results because the hybrid search optimization process handles this automatically. The following is an example of a search configuration suitable for hybrid search optimization:

PUT _plugins/_search_relevance/search_configurations
{
  "name": "hybrid_query_text",
  "query": "{\"query\":{\"hybrid\":{\"queries\":[{\"multi_match\":{\"query\":\"%SearchText%\",\"fields\":[\"id\",\"title\",\"category\",\"bullets\",\"description\",\"attrs.Brand\\\",\"attrs.Color\"]}},{\"neural\":{\"title_embedding\":{\"query_text\":\"%SearchText%\",\"k\":100,\"model_id\":\"lRFFb5cBHkapxdNcFFkP\"}}}]}},\"size\":10}",
  "index": "ecommerce"
}

The model ID specified in the query must be a valid model ID for a model deployed in OpenSearch. The target index must contain the field used for neural search embeddings (in this example, title_embedding).

For an end-to-end example, see the search-relevance repository.

Running a hybrid search optimization experiment

You can create a hybrid search optimization experiment by calling the Search Relevance Workbench experiments endpoint.

Endpoint

PUT _plugins/_search_relevance/experiments

Example request

PUT _plugins/_search_relevance/experiments
{
  "querySetId": "b16a6a2b-ed6e-49af-bb2b-fc739dcf24e6",
  "searchConfigurationList": ["508a8812-27c9-45fc-999a-05f859f9b210"],
  "judgmentList": ["1b944d40-e95a-43f6-9e92-9ce00f70de79"],
  "size": 10,
  "type": "HYBRID_OPTIMIZER"
}

Example response

{
  "experiment_id": "0f4eff05-fd14-4e85-ab5e-e8e484cdac73",
  "experiment_result": "CREATED"
}

Experimentation process

The hybrid search optimization experiment runs different evaluations based on the search configuration. The following parameters and parameter values are taken into account:

  • Two normalization techniques: l2 and min_max.
  • Three combination techniques: arithmetic_mean, harmonic_mean, geometric_mean.
  • The lexical and neural search weights, which are values ranging from 0.0 to 1.0 in 0.1 increments.

Every query in the query set is executed for all different parameter combinations, and the results are evaluated by using the judgment list.

Evaluating the results

The results for each evaluation are stored. You can view the results in OpenSearch Dashboards by selecting the corresponding experiment in the overview of past experiments, as shown in the following image.

Compare search results

All executed queries and their calculated search metrics are displayed, as shown in the following image.

Compare search results

To view query variants, select one of the queries, as shown in the following image.

Compare search results

You can also retrieve this information by using the following SQL search statement and providing your experimentId:

POST _plugins/_sql
{
  "query": "SELECT ev.parameters.normalization, ev.parameters.combination, ev.parameters.weights, ev.results.evaluationResultId, ev.experimentId, er.id, er.metrics, er.searchText FROM search-relevance-experiment-variant ev JOIN search-relevance-evaluation-result er ON ev.results.evaluationResultId = er.id WHERE ev.experimentId = '814e2378-901c-4273-9873-9b758a33089d'"
}

350 characters left

Have a question? .

Want to contribute? or .