Link Search Menu Expand Document Documentation Menu

Weighted average

The weighted_avg aggregation calculates the weighted average of numeric values across documents. This is useful when you want to calculate an average but weight some data points more heavily than others.

The weighted average is calculated using the formula \(\frac{\sum_{i=1}^n \text{value}_i \cdot \text{weight}_i}{\sum_{i=1}^n \text{weight}_i}\).

Parameters

The weighted_avg aggregation takes the following parameters.

Parameter Required/Optional Description
value Required Defines how to obtain the numeric values to average. Requires a field or script.
weight Required Defines how to obtain the weight for each value. Requires a field or script.
format Optional A DecimalFormat formatting string. Returns the formatted output in the aggregation’s value_as_string property.
value_type Optional A type hint for the values when using scripts or unmapped fields.

You can specify the following parameters within value or weight.

Parameter Required/Optional Description
field Optional The document field to use for the value or weight.
missing Optional A default value or weight to use when the field is missing. See Missing values.

Example

First, create an index and index some data. Notice that Product C is missing the rating and num_reviews fields:

POST _bulk
{ "index": { "_index": "products" } }
{ "name": "Product A", "rating": 4.5, "num_reviews": 100 }
{ "index": { "_index": "products" } }
{ "name": "Product B", "rating": 3.8, "num_reviews": 50 }
{ "index": { "_index": "products" } }
{ "name": "Product C"}

The following request uses the weighted_avg aggregation to calculate a weighted average product rating. In this context, each product’s rating is weighted by its num_reviews. This means that products with more reviews will have a greater influence on the final average than those with fewer reviews:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "weighted_rating": {
      "weighted_avg": {
        "value": {
          "field": "rating"
        },
        "weight": {
          "field": "num_reviews"
        },
        "format": "#.##"
      }
    }
  }
}

Example response

The response contains the weighted_rating, calculated as weighted_avg = (4.5 * 100 + 3.8 * 50) / (100 + 50) = 4.27. Only documents 1 and 2, which contain values for both rating and num_reviews, are considered:

{
  "took": 18,
  "timed_out": false,
  "terminated_early": true,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "weighted_rating": {
      "value": 4.266666650772095,
      "value_as_string": "4.27"
    }
  }
}

Missing values

The missing parameter allows you to specify default values for documents missing the value field or the weight field instead of excluding them from the calculation.

For example, you can assign products without ratings an “average” rating of 3.0 and set the num_reviews to 1 to give them a small non-zero weight:

GET /products/_search
{
  "size": 0,
  "aggs": {
    "weighted_rating": {
      "weighted_avg": {
        "value": {
          "field": "rating",
          "missing": 3.0
        },
        "weight": {
          "field": "num_reviews",
          "missing": 1
        },
        "format": "#.##"
      }
    }
  }
}

The new weighted average is calculated as weighted_avg = (4.5 * 100 + 3.8 * 50 + 3.0 * 1) / (100 + 50 + 1) = 4.26:

{
  "took": 27,
  "timed_out": false,
  "terminated_early": true,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "weighted_rating": {
      "value": 4.258278129906055,
      "value_as_string": "4.26"
    }
  }
}
350 characters left

Have a question? .

Want to contribute? or .