elasticsearch/docs/reference/esql/functions/examples/median_absolute_deviation.asciidoc
Iván Cea Fontenla fc2760cfd4
ESQL: mv_median_absolute_deviation function (#112055)
- Added mv_median_absolute_deviation function
- Added possibility of having a fixed param in Multivalue "ascending" functions
- Add surrogate to MedianAbsoluteDeviation

### Calculations used to avoid overflows
First, a quick recap of how the MAD is calculated:
1. Sort values, and get the median
2. Calculate the absolute difference between each value and the median (`abs(median - value)`)
3. Sort the differences, and get their median
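
The three steps above can be sketched for `double` values as follows. This is only an illustration, not the code from this PR: the class and method names are hypothetical, and averaging the two middle elements for even-sized inputs is an assumption about how the median is defined.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the Elasticsearch codebase.
public class MadSketch {
    // Median of an already sorted, non-empty array.
    // Assumption: even-sized inputs use the mean of the two middle elements.
    static double medianOfSorted(double[] sorted) {
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1
            ? sorted[mid]
            : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    static double mad(double[] values) {
        // Step 1: sort a copy of the values and take the median.
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double median = medianOfSorted(sorted);

        // Step 2: absolute difference between the median and each value.
        double[] diffs = new double[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            diffs[i] = Math.abs(median - sorted[i]);
        }

        // Step 3: sort the differences and take their median.
        Arrays.sort(diffs);
        return medianOfSorted(diffs);
    }
}
```

For example, `MadSketch.mad(new double[] {1, 2, 3, 4, 5})` has a median of 3 and differences 2, 1, 0, 1, 2, so the result is 1.0.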

Computing the differences (Step 2) may overflow for signed types: each difference is non-negative and can be as large as `POSITIVE_MAX - NEGATIVE_MIN`, which does not fit in the original signed type.
To solve this, some types are upcast as follows:
- Int: Stored as longs, simple approach
- Long: Stored as longs, but switched to unsigned long representation when calculating the differences
- Unsigned long: No effect; the resulting range is the same
- Double: No change. If the differences overflow to +/-infinity, they are left that way, as those outliers are only used for sorting
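
The `Long` case can be illustrated with a small sketch. It assumes the differences are kept in a Java `long` whose bits are interpreted as an unsigned 64-bit value; the class and method names are hypothetical, and the PR's actual implementation may differ.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the Elasticsearch codebase.
public class UnsignedDiffSketch {
    // Returns |a - b| as the bit pattern of an unsigned 64-bit value.
    static long absDiffUnsigned(long a, long b) {
        // When a >= b, the true difference lies in [0, 2^64 - 1], so the
        // wrapping subtraction produces exactly the right unsigned bits.
        return a >= b ? a - b : b - a;
    }

    // Differences between each value and the given median, in unsigned order.
    static long[] sortedDiffs(long[] values, long median) {
        Long[] diffs = new Long[values.length];
        for (int i = 0; i < values.length; i++) {
            diffs[i] = absDiffUnsigned(values[i], median);
        }
        // A signed sort would put large unsigned values (negative bit
        // patterns) first, so compare the bits as unsigned numbers instead.
        Arrays.sort(diffs, Long::compareUnsigned);
        long[] out = new long[diffs.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = diffs[i];
        }
        return out;
    }
}
```

With signed arithmetic, `Math.abs(Long.MAX_VALUE - Long.MIN_VALUE)` wraps around and evaluates to 1, whereas `Long.toUnsignedString(UnsignedDiffSketch.absDiffUnsigned(Long.MAX_VALUE, Long.MIN_VALUE))` gives `18446744073709551615`.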

Closes https://github.com/elastic/elasticsearch/issues/111590
2024-09-09 10:04:25 +02:00


// This is generated by ESQL's AbstractFunctionTestCase. Do not edit it. See ../README.md for how to regenerate it.
*Examples*
[source.merge.styled,esql]
----
include::{esql-specs}/median_absolute_deviation.csv-spec[tag=median-absolute-deviation]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/median_absolute_deviation.csv-spec[tag=median-absolute-deviation-result]
|===
The expression can use inline functions. For example, to calculate the median absolute deviation of the maximum values of a multivalued column, first use `MV_MAX` to get the maximum value per row, and then use the result with the `MEDIAN_ABSOLUTE_DEVIATION` function:
[source.merge.styled,esql]
----
include::{esql-specs}/median_absolute_deviation.csv-spec[tag=docsStatsMADNestedExpression]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/median_absolute_deviation.csv-spec[tag=docsStatsMADNestedExpression-result]
|===