elasticsearch/docs/reference/aggregations/pipeline
Benjamin Trent 30cf4dc8be
[ML] adding new KS test pipeline aggregation (#73334)
This adds a new pipeline aggregation that calculates the Kolmogorov–Smirnov (K-S) test for a given sample and buckets path.

For now, the buckets path must resolve to `_count`, but this restriction may be relaxed in the future.

It accepts a `fractions` parameter that describes the distribution of documents in some other, pre-calculated sample.

This version of the K-S test is two-sample: it tests whether `fractions` and the `_count` values of the buckets in `buckets_path` are drawn from the same distribution.
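
As a rough illustration of the two-sample comparison, the sketch below computes K-S distance statistics between bucket counts and a list of fractions by comparing their cumulative distributions. It is not the aggregation's implementation: the function name `ks_statistics` and the result keys are hypothetical, it returns only the distances (no p-values), and it ignores the sampling logic.

```python
from itertools import accumulate


def ks_statistics(counts, fractions):
    """Compare two binned distributions via their empirical CDFs.

    counts    -- document counts per bucket (the `_count` values)
    fractions -- relative weight of each bucket in the other sample
    Hypothetical helper for illustration only.
    """
    if len(counts) != len(fractions):
        raise ValueError("counts and fractions must have the same length")
    count_total = sum(counts)
    fraction_total = sum(fractions)
    if count_total <= 0 or fraction_total <= 0:
        raise ValueError("both samples must contain some mass")

    # Normalize each sample and accumulate it into a CDF over the buckets.
    observed_cdf = list(accumulate(c / count_total for c in counts))
    reference_cdf = list(accumulate(f / fraction_total for f in fractions))

    # Signed gap between the two CDFs at every bucket boundary.
    diffs = [o - r for o, r in zip(observed_cdf, reference_cdf)]
    observed_above = max(max(diffs), 0.0)
    observed_below = max(max(-d for d in diffs), 0.0)
    return {
        "observed_above": observed_above,  # largest gap where counts' CDF leads
        "observed_below": observed_below,  # largest gap where counts' CDF trails
        "two_sided": max(observed_above, observed_below),
    }


if __name__ == "__main__":
    # All documents fall in the first bucket; the reference is uniform.
    print(ks_statistics([4, 0, 0, 0], [0.25, 0.25, 0.25, 0.25]))
```

The one-sided distances correspond loosely to the one-sided alternatives, but which distance the aggregation reports under `less` and which under `greater` is not stated here, so the sketch deliberately uses neutral key names.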

Combined with the hypothesis alternatives (`less`, `greater`, `two_sided`) and the sampling logic (`upper_tail`, `lower_tail`, `uniform`), this gives flexibility when comparing two samples and estimating how likely they are to come from the same overall distribution.

Usage:

```
POST correlate_latency/_search?size=0&filter_path=aggregations
{
  "aggs": {
    "buckets": {
      "terms": {
        "field": "version",
        "size": 2
      },
      "aggs": {
        "latency_ranges": {
          "range": {
            "field": "latency",
            "ranges": [
              { "to": 0.0 },
              { "from": 0, "to": 105 },
              { "from": 105, "to": 225 },
              { "from": 225, "to": 445 },
              { "from": 445, "to": 665 },
              { "from": 665, "to": 885 },
              { "from": 885, "to": 1115 },
              { "from": 1115, "to": 1335 },
              { "from": 1335, "to": 1555 },
              { "from": 1555, "to": 1775 },
              { "from": 1775 }
            ]
          }
        },
        "ks_test": {
          "bucket_count_ks_test": {
            "buckets_path": "latency_ranges>_count",
            "alternative": ["less", "greater", "two_sided"]
          }
        }
      }
    }
  }
}
```
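
The usage example omits `fractions`. A minimal sketch of assembling the same `ks_test` body with an explicit `fractions` list is shown below. The helper `build_ks_test_agg` is hypothetical, the fraction values are made up for illustration, and the assumption that `fractions` carries one entry per range bucket follows from the description above rather than from a documented rule.

```python
import json


def build_ks_test_agg(buckets_path, fractions=None, alternative=None):
    """Build a `bucket_count_ks_test` aggregation body as a plain dict.

    Hypothetical helper: it only assembles JSON and does not contact a cluster.
    """
    body = {"buckets_path": buckets_path}
    if fractions is not None:
        if any(f < 0 for f in fractions):
            raise ValueError("fractions must be non-negative")
        body["fractions"] = list(fractions)
    if alternative is not None:
        body["alternative"] = list(alternative)
    return {"bucket_count_ks_test": body}


# Illustrative values only: one entry per range bucket in the usage example (11 ranges).
example_fractions = [0.0, 0.1, 0.2, 0.2, 0.15, 0.1, 0.1, 0.05, 0.05, 0.03, 0.02]

ks_test = build_ks_test_agg(
    "latency_ranges>_count",
    fractions=example_fractions,
    alternative=["less", "greater", "two_sided"],
)

if __name__ == "__main__":
    print(json.dumps({"ks_test": ks_test}, indent=2))
```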
2021-06-04 10:04:41 -04:00
avg-bucket-aggregation.asciidoc [DOCS] Fix gap policy xref 2021-03-03 09:31:02 -05:00
bucket-correlation-aggregation.asciidoc [ML] adding new KS test pipeline aggregation (#73334) 2021-06-04 10:04:41 -04:00
bucket-count-ks-test-aggregation.asciidoc [ML] adding new KS test pipeline aggregation (#73334) 2021-06-04 10:04:41 -04:00
bucket-script-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
bucket-selector-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
bucket-sort-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
cumulative-cardinality-aggregation.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
cumulative-sum-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
derivative-aggregation.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
extended-stats-bucket-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
inference-bucket-aggregation.asciidoc [DOCS] Removes beta labels from DFA related docs. (#70808) 2021-03-26 09:46:41 +01:00
max-bucket-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
min-bucket-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
movfn-aggregation.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
moving-percentiles-aggregation.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
normalize-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
percentiles-bucket-aggregation.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
serial-diff-aggregation.asciidoc [DOCS] Fix double spaces (#71082) 2021-03-31 09:57:47 -04:00
stats-bucket-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00
sum-bucket-aggregation.asciidoc [DOCS] Change agg titles to sentence case (#64425) 2020-10-30 13:25:21 -04:00