3 commits

Author SHA1 Message Date
James Rodewig
07ac8818b6
[DOCS] Remove testenv annotations from doc snippet tests (#80023) (#80458)
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible.

Relates to #79309, #31619
# Conflicts:
#	docs/reference/ml/df-analytics/apis/get-trained-model-deployment-stats.asciidoc
#	docs/reference/ml/df-analytics/apis/infer-trained-model-deployment.asciidoc
#	docs/reference/ml/df-analytics/apis/put-trained-model-definition-part.asciidoc
#	docs/reference/ml/df-analytics/apis/put-trained-model-vocabulary.asciidoc
#	docs/reference/ml/df-analytics/apis/start-trained-model-deployment.asciidoc
#	docs/reference/ml/df-analytics/apis/stop-trained-model-deployment.asciidoc
#	docs/reference/slm/apis/slm-delete.asciidoc
#	docs/reference/slm/apis/slm-execute-retention.asciidoc
#	docs/reference/slm/apis/slm-execute.asciidoc
#	docs/reference/slm/apis/slm-get-status.asciidoc
#	docs/reference/slm/apis/slm-get.asciidoc
#	docs/reference/slm/apis/slm-start.asciidoc
#	docs/reference/slm/apis/slm-stats.asciidoc
#	docs/reference/slm/apis/slm-stop.asciidoc
#	docs/reference/sql/endpoints/client-apps/tableau-desktop.asciidoc
#	docs/reference/sql/endpoints/client-apps/tableau-server.asciidoc
2021-11-05 19:41:54 -04:00
István Zoltán Szabó
26fdca39c5
[DOCS] Modifies aggregations title abbreviation to follow convention. (#78252) (#78254)
2021-09-23 17:00:41 +02:00
Benjamin Trent
3392358f71
[ML] adding new KS test pipeline aggregation (#73334) (#73782)
This adds a new pipeline aggregation that calculates the Kolmogorov–Smirnov (K-S) test for a given sample and buckets path.

For now, the buckets path must resolve to `_count`, but this may be relaxed in the future.

It accepts a parameter `fractions` that indicates the distribution of documents from some other pre-calculated sample. 

This version of the K-S test is two-sample: it tests whether the `fractions` and the distribution of `_count` values in the buckets_path are drawn from the same distribution.

Combined with the hypothesis alternatives (`less`, `greater`, `two_sided`) and the sampling logic (`upper_tail`, `lower_tail`, `uniform`), this gives flexibility when comparing two samples and determining how likely they are to come from the same overall distribution.
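To make the comparison concrete, here is a minimal Python sketch of the distance a two-sample K-S test is built on: the largest gap between two cumulative distributions. It is not the aggregation's implementation. The function names are hypothetical, it skips the sampling logic and the p-value entirely, and the mapping of `greater` and `less` to the sign of the gap follows the usual statistical convention, which is an assumption about how the aggregation defines them.

```python
from itertools import accumulate


def cdf(weights):
    """Normalize non-negative weights and return their cumulative sums."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [c / total for c in accumulate(weights)]


def ks_statistics(bucket_counts, fractions):
    """Return K-S distances between bucket counts and reference fractions.

    Hypothetical helper: `bucket_counts` stands in for the `_count` values
    of the buckets, `fractions` for the pre-calculated reference sample.
    """
    if len(bucket_counts) != len(fractions):
        raise ValueError("bucket_counts and fractions must have equal length")
    observed = cdf(bucket_counts)
    reference = cdf(fractions)
    diffs = [o - r for o, r in zip(observed, reference)]
    d_plus = max(0.0, max(diffs))    # observed CDF above the reference CDF
    d_minus = max(0.0, -min(diffs))  # observed CDF below the reference CDF
    return {
        "greater": d_plus,    # assumed sign convention, see note above
        "less": d_minus,      # assumed sign convention, see note above
        "two_sided": max(d_plus, d_minus),
    }


if __name__ == "__main__":
    # Counts piled into the first and last bucket vs. a uniform reference.
    print(ks_statistics([10, 0, 0, 10], [0.25, 0.25, 0.25, 0.25]))
```

A larger distance means the two distributions differ more. Turning the distance into a likelihood additionally requires the sample sizes, which this sketch leaves out.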

Usage:

```
POST correlate_latency/_search?size=0&filter_path=aggregations
{
  "aggs": {
    "buckets": {
      "terms": {
        "field": "version",
        "size": 2
      },
      "aggs": {
        "latency_ranges": {
          "range": {
            "field": "latency",
            "ranges": [
              { "to": 0.0 },
              { "from": 0, "to": 105 },
              { "from": 105, "to": 225 },
              { "from": 225, "to": 445 },
              { "from": 445, "to": 665 },
              { "from": 665, "to": 885 },
              { "from": 885, "to": 1115 },
              { "from": 1115, "to": 1335 },
              { "from": 1335, "to": 1555 },
              { "from": 1555, "to": 1775 },
              { "from": 1775 }
            ]
          }
        },
        "ks_test": {
          "bucket_count_ks_test": {
            "buckets_path": "latency_ranges>_count",
            "alternative": ["less", "greater", "two_sided"]
          }
        }
      }
    }
  }
}
```
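The request above does not set `fractions`. As a rough sketch of how a client might assemble the same body and optionally supply `fractions`, the Python below builds the JSON from a list of range boundaries. The helper name `build_ks_test_body` is hypothetical, and the check that `fractions` has one entry per range bucket is an assumption, not something this commit message states.

```python
import json


def build_ks_test_body(bounds, fractions=None):
    """Build the search body for the usage example (hypothetical helper)."""
    # One open-ended range below the first bound, one above the last,
    # and a closed range between each pair of neighbouring bounds.
    ranges = [{"to": bounds[0]}]
    ranges += [{"from": lo, "to": hi} for lo, hi in zip(bounds, bounds[1:])]
    ranges.append({"from": bounds[-1]})

    ks_test = {
        "buckets_path": "latency_ranges>_count",
        "alternative": ["less", "greater", "two_sided"],
    }
    if fractions is not None:
        # Assumption: one fraction per range bucket.
        if len(fractions) != len(ranges):
            raise ValueError("expected one fraction per range bucket")
        ks_test["fractions"] = list(fractions)

    return {
        "aggs": {
            "buckets": {
                "terms": {"field": "version", "size": 2},
                "aggs": {
                    "latency_ranges": {
                        "range": {"field": "latency", "ranges": ranges}
                    },
                    "ks_test": {"bucket_count_ks_test": ks_test},
                },
            }
        }
    }


if __name__ == "__main__":
    bounds = [0, 105, 225, 445, 665, 885, 1115, 1335, 1555, 1775]
    print(json.dumps(build_ks_test_body(bounds), indent=2))
```

The printed JSON can be sent as the body of the `POST correlate_latency/_search?size=0&filter_path=aggregations` request shown above.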

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-06-07 11:42:20 -04:00