You're viewing version 2.19 of the OpenSearch documentation. For the latest version, see the current documentation. For information about OpenSearch version maintenance, see Release Schedule and Maintenance Policy.

Median absolute deviation aggregations

The median_absolute_deviation aggregation is a single-value metric aggregation. Median absolute deviation is a variability metric that measures dispersion from the median.

Median absolute deviation is less affected by outliers than standard deviation, which relies on squared error terms. This makes it useful for describing data that is not normally distributed.

Median absolute deviation is computed as follows:

median_absolute_deviation = median( | x_i - median(x_i) | )
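
To make the formula concrete, the following minimal Python sketch computes the exact median absolute deviation of a small in-memory list and compares it with the standard deviation. The function name and sample values are illustrative assumptions and are not part of OpenSearch. OpenSearch itself returns an estimate rather than this exact computation.

```python
import statistics


def median_absolute_deviation(values):
    """Exact MAD: the median of absolute deviations from the median."""
    center = statistics.median(values)
    return statistics.median(abs(x - center) for x in values)


# Illustrative sample with a single large outlier (100).
sample = [1, 2, 3, 4, 100]

mad = median_absolute_deviation(sample)  # median is 3, deviations are [2, 1, 0, 1, 97]
spread = statistics.stdev(sample)        # squared terms let the outlier dominate

print("median absolute deviation:", mad)
print("standard deviation:", spread)
```

In this sample, the outlier barely moves the median absolute deviation, while the standard deviation grows to many times the spread of the other four values.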

OpenSearch estimates median_absolute_deviation, rather than calculating it directly, because of memory limitations. This estimation is computationally expensive. You can adjust the trade-off between estimation accuracy and performance. For more information, see Adjusting estimation accuracy.

Parameters

The median_absolute_deviation aggregation takes the following parameters.

Parameter	Required/Optional	Data type	Description
`field`	Required	String	The name of the numeric field for which the median absolute deviation is computed.
`missing`	Optional	Numeric	The value to assign to missing instances of the field. If not provided, documents with missing values are omitted from the estimation.
`compression`	Optional	Numeric	A parameter that adjusts the balance between estimate accuracy and performance. The value of `compression` must be greater than `0`. The default value is `1000`.

Example

The following example calculates the median absolute deviation of the DistanceMiles field in the opensearch_dashboards_sample_data_flights dataset:

GET opensearch_dashboards_sample_data_flights/_search
{
  "size": 0,
  "aggs": {
    "median_absolute_deviation_DistanceMiles": {
      "median_absolute_deviation": {
        "field": "DistanceMiles"
      }
    }
  }
}

Example response

As shown in the following example response, the aggregation returns an estimate of the median absolute deviation in the median_absolute_deviation_DistanceMiles variable:

{
  "took": 490,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "median_absolute_deviation_DistanceMiles": {
      "value": 1830.917892238693
    }
  }
}

Missing values

OpenSearch ignores missing and null values when computing median_absolute_deviation.

To include these documents in the estimation, use the `missing` parameter to assign a value to missing instances of the aggregated field. See Missing aggregations for more information.
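
The effect of the two behaviors can be sketched in Python on a small list in which `None` stands for a missing value. This is an exact, in-memory illustration under assumed sample data, not the estimation that OpenSearch performs. The helper name `mad_with_missing` and the substitute value `100` are hypothetical. The request body at the end only shows where the documented `missing` parameter would be placed.

```python
import json
import statistics


def mad_with_missing(values, missing=None):
    """Exact MAD where None marks a missing value.

    If `missing` is None, missing values are dropped (the default behavior
    described above). Otherwise each missing value is replaced by `missing`.
    """
    if missing is None:
        present = [v for v in values if v is not None]
    else:
        present = [missing if v is None else v for v in values]
    center = statistics.median(present)
    return statistics.median(abs(v - center) for v in present)


# Illustrative data: two of the five documents lack the field.
distances = [10, None, 20, 30, None]

ignored = mad_with_missing(distances)                   # uses [10, 20, 30]
substituted = mad_with_missing(distances, missing=100)  # uses [10, 100, 20, 30, 100]
print("missing values ignored:", ignored)
print("missing values set to 100:", substituted)

# Hypothetical request body showing the `missing` parameter from the table above.
body = {
    "size": 0,
    "aggs": {
        "median_absolute_deviation_DistanceMiles": {
            "median_absolute_deviation": {"field": "DistanceMiles", "missing": 100}
        }
    },
}
print(json.dumps(body, indent=2))
```

Because the substituted values take part in both medians, the choice of `missing` can change the result noticeably, so it is worth picking a value that is meaningful for the field.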

Adjusting estimation accuracy

The median absolute deviation is estimated using the t-digest data structure, which takes a compression parameter to balance performance and estimation accuracy. Lower values of compression improve performance but may reduce estimation accuracy, as shown in the following request:

GET opensearch_dashboards_sample_data_flights/_search
{
  "size": 0,
  "aggs": {
    "median_absolute_deviation_DistanceMiles": {
      "median_absolute_deviation": {
        "field": "DistanceMiles",
        "compression": 10
      }
    }
  }
}

The estimation error depends on the dataset but is usually below 5%, even for compression values as low as 100. (The low example value of 10 is used here to illustrate the trade-off effect and is not recommended.)

Note the decreased computation time (took time) and the slightly less accurate value of the estimated parameter in the following response.

For reference, OpenSearch’s best estimate (with compression set arbitrarily high) for the median absolute deviation of DistanceMiles is 1831.076904296875:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 10000,
      "relation": "gte"
    },
    "max_score": null,
    "hits": []
  },
  "aggregations": {
    "median_absolute_deviation_DistanceMiles": {
      "value": 1836.265614211182
    }
  }
}
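
As a quick check of the trade-off, the following Python sketch computes the relative error of the two estimates on this page against the best estimate quoted above. The constant and function names are illustrative. The numbers are copied from the example responses, and the first response is assumed to use the default compression because the request does not set one.

```python
# Values copied from the example responses on this page.
BEST_ESTIMATE = 1831.076904296875             # compression set arbitrarily high
DEFAULT_ESTIMATE = 1830.917892238693          # request without a compression value
LOW_COMPRESSION_ESTIMATE = 1836.265614211182  # compression = 10


def relative_error(estimate, reference):
    """Absolute difference expressed as a fraction of the reference value."""
    return abs(estimate - reference) / abs(reference)


default_error = relative_error(DEFAULT_ESTIMATE, BEST_ESTIMATE)
low_error = relative_error(LOW_COMPRESSION_ESTIMATE, BEST_ESTIMATE)

print(f"default compression: {default_error:.4%}")
print(f"compression = 10:    {low_error:.4%}")
```

For this dataset, the error with a compression of 10 is roughly 0.28%, and the error with the default compression is under 0.01%. Other datasets may behave differently.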
