Commit graph

507 commits

Simon Cooper
9a80a37d78
[7.17] Use CLDR locale provider on JDK 23 (#110222) (#112544)
Backports #110222 to 7.17.

JDK 23 removes the COMPAT locale provider, leaving CLDR as the only option. This commit configures Elasticsearch
to use the CLDR provider when on JDK 23, but still use the existing COMPAT provider when on JDK 22 and below.

This causes some differences in locale behaviour; this commit also adapts various tests so they still work whether run with COMPAT or CLDR.
2024-09-06 16:24:37 +01:00
Craig Taverner
3bdd3fc6ed
Incorrect name for sort field (#97328) (#97330) 2023-07-03 11:22:37 -04:00
Craig Taverner
edb34ad60b
Correct rare-terms default precision in docs (#96887) (#96907) 2023-06-18 16:08:27 -04:00
Umut Uz
282129dba0 Remove duplicate text from cardinality aggs docs (#86615)
The same explanation is repeated twice within a section.
2022-05-19 12:00:03 -07:00
Nik Everett
17f5cc87bd
Backport doc fixes to 7.17 (#84722)
* Update painless-reindex-context.asciidoc (#84444) (#84712)

ctx['op'] should be set to 'noop', not 'none' when specifying no
operation.

Elasticsearch error when using 'none':

```json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "Operation type [none] not allowed, only [noop, index, delete] are allowed"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "Operation type [none] not allowed, only [noop, index, delete] are allowed"
  },
  "status" : 400
}
```
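For reference, a minimal reindex request using the documented no-op value (index names and the skip condition are hypothetical) might look like:

```json
POST _reindex
{
  "source": {
    "index": "my-source-index"
  },
  "dest": {
    "index": "my-dest-index"
  },
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.skip == true) { ctx.op = 'noop' }"
  }
}
```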

Co-authored-by: jalvar08 <jeovanny.alvarez@gmail.com>

* [DOCS] Update install instructions for Debian/Ubuntu (#84645) (#84714)

The use of `apt-key` is deprecated and will no longer be available after
Debian 11 and Ubuntu 22.04. This updates the installation instructions
for Debian-based distributions.

Closes #84644

Co-authored-by: er0k <er0k@users.noreply.github.com>

* Fix some typos in plugins & reference docs (#84667) (#84717)

This pull request removes a few instances of duplicate words or
punctuation and erroneous spelling from the docs.

Co-authored-by: Abele Mălan <6689720+AbeleMM@users.noreply.github.com>

Co-authored-by: jalvar08 <jeovanny.alvarez@gmail.com>
Co-authored-by: er0k <er0k@users.noreply.github.com>
Co-authored-by: Abele Mălan <6689720+AbeleMM@users.noreply.github.com>
2022-03-07 14:04:12 -05:00
James Rodewig
1a7e827597
[DOCS] Update sum aggregation for histograms (#84493) (#84498)
Fixes an error in the sum aggregation example for histograms and updates the related test snippets.

Closes #84491

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
(cherry picked from commit fb45ac9dea)

Co-authored-by: Maja Grubic <maja.grubic@elastic.co>
2022-03-01 08:42:57 -05:00
Lisa Cawley
a7fc055158
[DOCS] Fix nesting in bucket correlation aggregation (#83816) (#83849) 2022-02-11 11:33:37 -08:00
James Rodewig
86ea2fe093
[DOCS] Remove unneeded callouts from snippets (#83798) (#83806)
These callouts aren't referenced anywhere. Leaving them in can be confusing.

(cherry picked from commit d31bdd6bf4)
2022-02-10 15:17:45 -05:00
James Rodewig
034de18997
[DOCS] Fix min/max agg snippets for histograms (#83695) (#83700)
* Updates the `min` and `max` snippets for histograms. These should now run as docs integration tests.
* Fixes a copy/paste error in the `max` aggregation snippet for histograms.

Relates to https://github.com/elastic/elasticsearch/pull/83384

(cherry picked from commit 280fd2fff7)
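
As a rough illustration of the snippets in question (index and field names are hypothetical), a `min`/`max` request over a histogram-mapped field looks like:

```json
POST metrics-index/_search?size=0
{
  "aggs": {
    "min_latency": { "min": { "field": "latency_histo" } },
    "max_latency": { "max": { "field": "latency_histo" } }
  }
}
```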
2022-02-08 20:06:36 -05:00
James Rodewig
82ee85bac7
[DOCS] Re-add paragraph noting doc_count is approximate (#83154) (#83157)
This paragraph was accidentally removed as part of #79205. Also fixes a minor heading capitalization error.

(cherry picked from commit 63f228e24e)
2022-01-26 11:19:47 -05:00
James Rodewig
1adc42453b
[DOCS] Fix typo (#82344) (#82381)
(cherry picked from commit 129d0fc91d)

Co-authored-by: Oleks <oleks@users.noreply.github.com>
2022-01-10 13:48:09 -05:00
James Rodewig
6e99c86793
[DOCS] Remove experimental language from HDR Histo percentiles/ranks (#81773) (#81785)
Per issue #60780, the team decided to remove the experimental language from HDR Histogram percentiles and ranks. The feature has been in production for quite some time.
closes #60780

Co-authored-by: William Chaparro <william.chaparro@elastic.co>
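
For context, a percentiles request using the HDR Histogram implementation looks roughly like the following (index and field names are hypothetical):

```json
GET latency-index/_search?size=0
{
  "aggs": {
    "load_time_percentiles": {
      "percentiles": {
        "field": "load_time",
        "percents": [ 95, 99, 99.9 ],
        "hdr": {
          "number_of_significant_value_digits": 3
        }
      }
    }
  }
}
```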
2021-12-15 15:33:25 -05:00
James Rodewig
398a198cf2
[DOCS] Clarify supported parameters for terms value source (#81775) (#81778)
The composite aggregation's `terms` value source doesn't support the same set of
parameters as the `terms` aggregation.

Closes #81431.
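
For reference, a composite aggregation using a `terms` value source looks roughly like this (index and field names are hypothetical); only the source-level options apply here, not the full `terms` aggregation parameter set:

```json
GET my-index/_search?size=0
{
  "aggs": {
    "my_buckets": {
      "composite": {
        "sources": [
          { "product": { "terms": { "field": "product.keyword" } } }
        ]
      }
    }
  }
}
```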
2021-12-15 14:44:56 -05:00
Salvatore Campagna
d64e8674b9
[DOCS] Fix the weighted average documentation (#81307) (#81417)
The documentation states that if the `weight` field is missing, and no
explicit missing configuration is provided, a default value of 1 is used.
This is incorrect and does not match the implementation of the weighted
average aggregator. In that case the document is skipped instead.
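
As a sketch (index and field names are hypothetical), making the default explicit with `missing` avoids the skipping behaviour described above:

```json
POST exams/_search?size=0
{
  "aggs": {
    "weighted_grade": {
      "weighted_avg": {
        "value": { "field": "grade" },
        "weight": { "field": "weight", "missing": 1 }
      }
    }
  }
}
```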
2021-12-07 05:21:16 -05:00
James Rodewig
eaec0d7447
[DOCS] Fix typo in gap_policy's default value for serial differencing aggregation (#80893) (#80914)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: Simon Stücher <stchr@users.noreply.github.com>
2021-11-22 13:45:34 -05:00
James Rodewig
07ac8818b6
[DOCS] Remove testenv annotations from doc snippet tests (#80023) (#80458)
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible.

Relates to #79309, #31619
# Conflicts:
#	docs/reference/ml/df-analytics/apis/get-trained-model-deployment-stats.asciidoc
#	docs/reference/ml/df-analytics/apis/infer-trained-model-deployment.asciidoc
#	docs/reference/ml/df-analytics/apis/put-trained-model-definition-part.asciidoc
#	docs/reference/ml/df-analytics/apis/put-trained-model-vocabulary.asciidoc
#	docs/reference/ml/df-analytics/apis/start-trained-model-deployment.asciidoc
#	docs/reference/ml/df-analytics/apis/stop-trained-model-deployment.asciidoc
#	docs/reference/slm/apis/slm-delete.asciidoc
#	docs/reference/slm/apis/slm-execute-retention.asciidoc
#	docs/reference/slm/apis/slm-execute.asciidoc
#	docs/reference/slm/apis/slm-get-status.asciidoc
#	docs/reference/slm/apis/slm-get.asciidoc
#	docs/reference/slm/apis/slm-start.asciidoc
#	docs/reference/slm/apis/slm-stats.asciidoc
#	docs/reference/slm/apis/slm-stop.asciidoc
#	docs/reference/sql/endpoints/client-apps/tableau-desktop.asciidoc
#	docs/reference/sql/endpoints/client-apps/tableau-server.asciidoc
2021-11-05 19:41:54 -04:00
Nik Everett
70f286bc72
Rework docs for the size of terms agg (backport of #79205) (#80223) (#80226)
The `terms` agg picks the top `size` terms in a single scatter/gather
pass across all the shards. For the default `order` and if you `order`
by `_key` this works quite well. Some errors creep in, but it's fairly
easy to point to them and understand them. But ordering by doc count
ascending is like inviting the error vampire into your agg. It's super
easy to get inaccurate results. This updates the docs to be more stark
about it. Closes #72684
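
The problematic pattern the docs now warn about looks roughly like this (index and field names are hypothetical):

```js
GET my-index/_search?size=0
{
  "aggs": {
    "rare_products": {
      "terms": {
        "field": "product.keyword",
        "size": 10,
        "order": { "_count": "asc" } // ascending doc count: prone to large errors
      }
    }
  }
}
```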
2021-11-02 17:04:05 -04:00
Benjamin Trent
fdfa016b17
[ML] fail on poor configuration for categorize_text (#79586) (#79642)
This commit fixes a handful of bugs in the categorize_text agg

 - The agg now fails on fields that are not text fields
 - Limits the number of tokens categorized
 - Validates the configuration inputs to disallow settings above static maximums
2021-10-21 13:13:04 -04:00
Christos Soulios
ffc61a2f06
[7.x] Fix rate agg with custom _doc_count (#79449)
Backports #79346 to 7.x

When running a rate aggregation without setting the field parameter, the result is computed based on the bucket doc_count.

This PR adds support for a custom _doc_count field.

Closes #77734
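
A minimal sketch of the doc_count-based form (index and field names are hypothetical): with no `field` set, the rate is derived from each bucket's doc count, which this PR extends to honour a custom `_doc_count`:

```json
GET sales/_search?size=0
{
  "aggs": {
    "by_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "daily_rate": {
          "rate": { "unit": "day" }
        }
      }
    }
  }
}
```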
2021-10-19 14:48:50 +03:00
Benjamin Trent
700bf755bc
[ML] add new normalize_above parameter to p_value significant terms heuristic (#78833) (#78999)
This commit adds the new normalize_above parameter to the p_value significant
terms heuristic.

This parameter allows for consistent significance results at various scales. When a total count (in or out of the background set) is above the normalize_above parameter, both the total set and the set including the term are scaled by normalize_above/count, where count is the term count in the set or the total set size.
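
A rough sketch of how the parameter is supplied (index, field, and values are hypothetical, and the surrounding heuristic options are illustrative only):

```json
GET my-index/_search?size=0
{
  "query": { "term": { "event.outcome": "failure" } },
  "aggs": {
    "suspect_terms": {
      "significant_terms": {
        "field": "user_agent.keyword",
        "p_value": {
          "background_is_superset": false,
          "normalize_above": 1000
        }
      }
    }
  }
}
```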
2021-10-12 14:49:17 -04:00
James Rodewig
fe79373c10
[DOCS] Add prod warning to composite agg (#78723) (#78778)
The composite aggregation is considered expensive. Users should perform load testing before deploying it in production.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: Stef Nestor <steffanie.nestor@gmail.com>
2021-10-06 13:54:38 -04:00
Benjamin Trent
0782ae7427
[7.x] [ML] Text/Log categorization multi-bucket aggregation (#71752) (#78623)
* [ML] Text/Log categorization multi-bucket aggregation (#71752)

This commit adds a new multi-bucket aggregation: `categorize_text`

The aggregation follows a similar design to significant text in that it reads from `_source`
and re-analyzes the text as it is read.

The key difference is that it does not use the indexed field's analyzer, but instead relies on
the `ml_standard` tokenizer with specialized ML token filters. The tokenizer + filters are the
same ones that machine learning categorization anomaly jobs use.

The high level logical flow is as follows:
 - at each shard, read in the text field with a custom analyzer using `ml_standard` tokenizer
 - Read in the particular tokens from the analyzer
 - Feed these tokens to a token tree algorithm (an adaptation of the drain categorization algorithm)
 - Gather the individual log categories (the leaf nodes), sort them by doc_count, ship those buckets to be merged
 - Merge all buckets that have the EXACT same key
 - Once all buckets are merged, pass those keys + counts to a new token tree for additional merging
 - That tree builds the final buckets and that is returned to the user

Algorithm explanation:

 - Each log is parsed with the ml-standard tokenizer
 - each token is passed into a token tree
 - For `max_match_token` each token is stored in the tree and at `max_match_token+1` (or `len(tokens)`) a log group is created
 - If another log group exists at that leaf, merge it if they have `similarity_threshold` percentage of tokens in common
     - merging simply replaces tokens that are different in the group with `*`
 - If a layer in the tree reaches `max_unique_tokens`, we add a `*` child and any new tokens are passed through there. The catch is that on the final merge we first attempt to merge together the subtrees with the smallest number of documents, especially if the new subtree has more documents counted.

## Aggregation configuration.

Here is an example on some openstack logs
```js
POST openstack/_search?size=0
{
  "aggs": {
    "categories": {
      "categorize_text": {
        "field": "message", // The field to categorize
        "similarity_threshold": 20, // merge log groups if they are this similar
        "max_unique_tokens": 20, // Max Number of children per token position
        "max_match_token": 4, // Maximum tokens to build prefix trees
        "size": 1
      }
    }
  }
}
```

This will return buckets like
```json
"aggregations" : {
    "categories" : {
      "buckets" : [
        {
          "doc_count" : 806,
          "key" : "nova-api.log.1.2017-05-16_13 INFO nova.osapi_compute.wsgi.server * HTTP/1.1 status len time"
        }
      ]
    }
  }
```

* fixing for backport

* fixing test after backport
2021-10-04 13:33:56 -04:00
James Rodewig
4434730453
[DOCS] Status code change for pipeline validation errors (#78324)
Adds a note to the pipeline aggregation docs for error status codes changed with #53669.
2021-09-27 10:50:55 -04:00
Lukas Wegmann
9800df6a60
Document missing_order param for composite aggregations (#77839) (#78302)
Documents the missing_order parameter for composite aggregations introduced in #76740
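For reference, the parameter sits on an individual value source alongside `missing_bucket` (index and field names are hypothetical):

```json
GET my-index/_search?size=0
{
  "aggs": {
    "my_buckets": {
      "composite": {
        "sources": [
          {
            "product": {
              "terms": {
                "field": "product.keyword",
                "missing_bucket": true,
                "missing_order": "last"
              }
            }
          }
        ]
      }
    }
  }
}
```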
2021-09-27 04:09:05 -04:00
István Zoltán Szabó
26fdca39c5
[DOCS] Modifies aggregations title abbreviation to follow convention. (#78252) (#78254) 2021-09-23 17:00:41 +02:00
edh-oss
6869f5bc0d Update JSON parser and snippets (#77983)
Related to issue #77823

This does the following:

- Updates several asciidoc files that contained code snippets with
  invalid JSON, most involving unnecessary trailing commas.

- Makes the switch from the Groovy JSON parser to the Jackson parser,
  pursuant to the general goal of eliminating Groovy dependence.

- Makes testing of JSON validity at build time more strict.

Note that this update still allows backslash escaping for any
character. Currently that matters because of the file
"docs/reference/ml/anomaly-detection/apis/get-datafeed-stats.asciidoc",
specifically this part:

    "attributes" : {
      "ml.machine_memory" :
        "$body.datafeeds.0.node.attributes.ml\.machine_memory",
      "ml.max_open_jobs" : "512"
    }

It's not clear to me what change, if any, is appropriate there. So,
I've left in the escaped period and configured the parser to ignore
it for the time being.
2021-09-20 11:11:54 +01:00
James Rodewig
76859ec56c
[DOCS] Fix calendar interval in snippet callout (#77314) (#77579)
Co-authored-by: Guillaume Le Floch <glfloch@gmail.com>
2021-09-10 14:52:37 -04:00
James Rodewig
01ee3d6890
[DOCS] Include index in range agg snippets (#77290) (#77569)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: xiaozhiliaoo(小知了) <772654204@qq.com>
2021-09-10 12:31:55 -04:00
Benjamin Trent
4e6cc13d0c
[7.x] Adds support for the rate aggregation under a composite agg (#76992) (#77113)
* Adds support for the rate aggregation under a composite agg (#76992)

The rate aggregation should support being a sub-aggregation
of a composite agg.

The catch is that the composite aggregation source
must be a date histogram. Other sources can be present,
but there must be exactly one date histogram source,
otherwise the rate aggregation does not know which
interval to compare its unit rate to.

closes https://github.com/elastic/elasticsearch/issues/76988

* Update RateAggregatorTests.java
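
A rough sketch of the now-supported shape, with exactly one date histogram source (index and field names are hypothetical):

```json
GET sales/_search?size=0
{
  "aggs": {
    "by_day": {
      "composite": {
        "sources": [
          { "date": { "date_histogram": { "field": "timestamp", "calendar_interval": "1d" } } }
        ]
      },
      "aggs": {
        "hourly_rate": {
          "rate": { "field": "price", "unit": "hour" }
        }
      }
    }
  }
}
```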
2021-09-01 08:53:34 -04:00
James Rodewig
f92cb78520
[DOCS] Add filter example to nested agg docs (#76118) (#76177)
Changes:
* Simplifies and formats several snippets in the nested agg docs
* Adds a `filter` sub-aggregration example
# Conflicts:
#	docs/reference/aggregations/bucket/nested-aggregation.asciidoc
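A sketch of a `filter` sub-aggregation under `nested`, in the spirit of the added example (index, path, and field names are hypothetical):

```json
GET products/_search?size=0
{
  "aggs": {
    "resellers": {
      "nested": { "path": "resellers" },
      "aggs": {
        "filtered_resellers": {
          "filter": { "range": { "resellers.price": { "lte": 350 } } },
          "aggs": {
            "min_price": { "min": { "field": "resellers.price" } }
          }
        }
      }
    }
  }
}
```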
2021-08-05 10:03:18 -04:00
James Rodewig
4d881f57e1
[DOCS] Correct spelling for geo terms (#76028) (#76032)
Changes:
* Use "geopoint" when not referring to the literal field type
* Use "geoshape" when not referring to the literal field type or query type
* Use "GeoJSON" consistently
# Conflicts:
#	docs/reference/ingest/processors/enrich.asciidoc
2021-08-03 10:08:52 -04:00
István Zoltán Szabó
bcace7d30b
[DOCS] Adds p-value heuristic to significant terms aggregation (#75369) (#75721)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2021-07-27 09:57:44 +02:00
Benjamin Trent
7f4df1632f
[7.x] Add support for range aggregations on histogram mapped fields (#74146) (#74682)
* Add support for range aggregations on histogram mapped fields (#74146)

This adds support for the range aggregation over `histogram` mapped fields.

Decisions made for implementation:

 - Sub-aggregations are not allowed. This is to simplify implementation and follows the prior art set by the `histogram` aggregation
 - Nothing fancy is done with the ranges. No filter translations as we cannot easily do a `range` filter query against histogram fields. This may be an optimization in the future.
 - Ranges check the histogram value ONLY. No interpolation of values is done. If we have better statistics around the histogram this MAY be possible.
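
A minimal sketch of the supported usage, with no sub-aggregations (index and field names are hypothetical):

```json
GET metrics-index/_search?size=0
{
  "aggs": {
    "latency_buckets": {
      "range": {
        "field": "latency_histo",
        "ranges": [
          { "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200 }
        ]
      }
    }
  }
}
```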
2021-06-29 08:45:51 -04:00
James Rodewig
0a5d4e740c [DOCS] Deduplicate docs for search.max_buckets 2021-06-29 08:42:42 -04:00
Nik Everett
fc52651f0d
Document types terms agg can consume (#73272) (#74258)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-06-17 15:12:40 -04:00
Igor Motov
d1e6e93544
Add keep_values gap policy (#73297) (#73927)
Adds a new keep_values gap policy that works like skip, except that if the metric
calculated on an empty bucket provides a non-null, non-NaN value, that value is
used for the bucket.

Fixes #27377

Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
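
A sketch of the new policy on a pipeline aggregation such as `moving_fn` (index and field names are hypothetical):

```json
POST sales/_search?size=0
{
  "aggs": {
    "by_day": {
      "date_histogram": { "field": "date", "calendar_interval": "day" },
      "aggs": {
        "daily_sales": { "sum": { "field": "price" } },
        "smoothed_sales": {
          "moving_fn": {
            "buckets_path": "daily_sales",
            "window": 7,
            "script": "MovingFunctions.unweightedAvg(values)",
            "gap_policy": "keep_values"
          }
        }
      }
    }
  }
}
```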
2021-06-08 13:37:34 -10:00
James Rodewig
9ec8d4c5aa
[DOCS] Clarify supported fields for top_metrics agg (#73907) (#73916)
Changes:
* Notes `metrics.field` supports `boolean` fields and runtime fields.
* Notes `metrics.field` doesn't support array values.

Closes #72889
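
For reference, a `top_metrics` request looks roughly like this (index and field names are hypothetical); `metrics.field` cannot point at an array-valued field:

```json
POST my-index/_search?size=0
{
  "aggs": {
    "latest_price": {
      "top_metrics": {
        "metrics": { "field": "price" },
        "sort": { "timestamp": "desc" }
      }
    }
  }
}
```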
2021-06-08 13:30:41 -04:00
James Rodewig
51bb214bfa
[DOCS] Make doc_count error docs more searchable (#73870) (#73902)
Changes:
* Combines the `Document counts are approximate` and `Calculating document count
  error` sections.
* Rewrites the section to include `sum_other_doc_count` and
  `doc_count_error_upper_bound` for easier on-page (ctrl+f) searching.

Closes #73200
2021-06-08 09:50:10 -04:00
Mark Tozzi
47d3d6a6d4
Docvalueformat errors (#73121) (#73863)
Improve the error message when inconsistent mappings cause doc value formatting errors.  For example, trying to format a binary encoded IP address as a UTF8 string often fails with something unexpected, like `ArrayIndexOutOfBounds`.  This change catches that and wraps it with a message suggesting the user check their mappings.  Also gets rid of anonymous instances for doc value formatters, which made it hard to see what format was failing to be applied.
2021-06-07 16:11:38 -04:00
Benjamin Trent
3392358f71
[ML] adding new KS test pipeline aggregation (#73334) (#73782)
This adds a new pipeline aggregation for calculating the Kolmogorov–Smirnov test for a given sample and buckets path.

For now, the buckets path resolution needs to be `_count`. But, this may be relaxed in the future. 

It accepts a parameter `fractions` that indicates the distribution of documents from some other pre-calculated sample. 

This particular version of the K-S test is two-sample, meaning it calculates whether the `fractions` and the distribution of `_count` values in the buckets_path are taken from the same distribution.

This, in combination with the hypothesis alternatives (`less`, `greater`, `two_sided`) and sampling logic (`upper_tail`, `lower_tail`, `uniform`), allows for flexibility and usefulness when comparing two samples and determining the likelihood of them being from the same overall distribution.

Usage:

```
POST correlate_latency/_search?size=0&filter_path=aggregations
{
  "aggs": {
    "buckets": {
      "terms": { <1>
        "field": "version",
        "size": 2
      },
      "aggs": {
        "latency_ranges": {
          "range": { <2>
            "field": "latency",
            "ranges": [
              { "to": 0.0 },
              { "from": 0, "to": 105 },
              { "from": 105, "to": 225 },
              { "from": 225, "to": 445 },
              { "from": 445, "to": 665 },
              { "from": 665, "to": 885 },
              { "from": 885, "to": 1115 },
              { "from": 1115, "to": 1335 },
              { "from": 1335, "to": 1555 },
              { "from": 1555, "to": 1775 },
              { "from": 1775 }
            ]
          }
        },
        "ks_test": { <3>
          "bucket_count_ks_test": {
            "buckets_path": "latency_ranges>_count",
            "alternative": ["less", "greater", "two_sided"]
          }
        }
      }
    }
  }
}
```

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2021-06-07 11:42:20 -04:00
Nik Everett
70e7946e7e
More debugging info for significant_text (backport of #72727) (#72895)
Adds some extra debugging information to make it clear that you are
running `significant_text`. Also adds some timing information
around the `_source` fetch and the `terms` accumulation. This lets you
calculate a third useful timing number: the analysis time. It is
`collect_ns - fetch_ns - accumulation_ns`.

This also adds a half dozen extra REST tests to get a *fairly*
comprehensive set of the operations this supports. It doesn't cover all
of the significance heuristic parsing, but it's certainly much better
than what we had.
2021-05-11 08:20:25 -04:00
Benjamin Trent
374f995e4e
[7.x] [ML] add new bucket_correlation aggregation with initial count_correlation function (#72133) (#72896)
* [ML] add new bucket_correlation aggregation with initial count_correlation function (#72133)

This commit adds a new pipeline aggregation that allows correlation within the aggregation framework on bucketed values.

The initial function is a `count_correlation` function, the purpose of which is to correlate the count in a consistent number of buckets with a pre-calculated indicator. The indicator and the aggregated buckets should relate to the same metrics within documents.

Example of correlating terms within `service.version.keyword` with latency percentiles. The percentiles and provided correlation indicator both refer to the same source data where the indicator was previously calculated:
```
GET apm-7.12.0-transaction-generated/_search
{
  "size": 0,
  "aggs": {
    "field_terms": {
      "terms": {
        "field": "service.version.keyword",
        "size": 20
      },
      "aggs": {
        "latency_range": {
          "range": {
            "field": "transaction.duration.us",
            "ranges": [<snip>],
            "keyed": true
          }
        },
        "correlation": {
          "bucket_correlation": {
            "buckets_path": "latency_range>_count",
            "count_correlation": {
              "indicator": {
                 "expectations": [<snip>],
                 "doc_count": 20000
               }
            }
          }
        }
      }
    }
  }
}
```
2021-05-10 14:34:21 -04:00
Nik Everett
9a9950e9f2
Update docs for filter agg (backport of #72508) (#72828)
The docs for the `filter` agg seemed to suggest that it was the
preferred way to filter results for aggs, but it's really mostly for when
you need to filter things under another bucketing agg.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
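
A sketch of the kind of usage the reworked docs center on, with a `filter` agg narrowing one branch of an aggregation tree (index and field names are hypothetical):

```json
POST sales/_search?size=0&filter_path=aggregations
{
  "aggs": {
    "avg_price": { "avg": { "field": "price" } },
    "t_shirts": {
      "filter": { "term": { "type": "t-shirt" } },
      "aggs": {
        "avg_price": { "avg": { "field": "price" } }
      }
    }
  }
}
```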
2021-05-06 15:07:41 -04:00
Ignacio Vera
c6aab5ffcc
[GeoPoint] Grid aggregations with bounds should exclude touching tiles (#72493) (#72520) 2021-04-30 09:51:33 +02:00
James Rodewig
d84cac0590
[DOCS] Fix typos (#72227) (#72256)
Co-authored-by: Pierre Grimaud <grimaud.pierre@gmail.com>
2021-04-26 14:18:27 -04:00
Nik Everett
121ecb959d
Convert metric aggs docs to runtime fields (backport of #71260) (#71298)
This replaces the `script` docs for bucket aggregations with runtime
fields. We expect runtime fields to be nicer to work with because you
can also fetch them or filter on them. We expect them to be faster
because they don't need this sort of `instanceof` tree:
a92a647b9f/server/src/main/java/org/elasticsearch/search/aggregations/support/values/ScriptDoubleValues.java (L42)

Relates to #69291

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
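
The converted snippets follow this general pattern, defining a runtime field in the search request and aggregating on it instead of using an inline `script` (index, field, and script are hypothetical):

```json
GET sales/_search?size=0
{
  "runtime_mappings": {
    "price.discounted": {
      "type": "double",
      "script": "emit(doc['price'].value * 0.8)"
    }
  },
  "aggs": {
    "avg_discounted_price": {
      "avg": { "field": "price.discounted" }
    }
  }
}
```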
2021-04-05 13:24:19 -04:00
Nik Everett
1b35100ab0
Convert bucket aggs docs to runtime fields (backport #71202) (#71248)
This replaces the `script` docs for bucket aggregations with runtime
fields. We expect runtime fields to be nicer to work with because you
can also fetch them or filter on them. We expect them to be faster
because they don't need this sort of `instanceof` tree:
a92a647b9f/server/src/main/java/org/elasticsearch/search/aggregations/support/values/ScriptDoubleValues.java (L42)

Relates to #69291

Co-authored-by: Adam Locke <adam.locke@elastic.co>
2021-04-02 12:40:19 -04:00
James Rodewig
c757f9e4e7
[DOCS] Fix double spaces (#71082) (#71120) 2021-03-31 11:43:34 -04:00
Benjamin Trent
abb182d95c
[7.x] [ML] adding support for composite aggs in anomaly detection (#69970) (#71052)
* [ML] adding support for composite aggs in anomaly detection (#69970)

This commit allows for composite aggregations in datafeeds.

Composite aggs provide a much better solution for having influencers, partitions, etc. on high volume data. Instead of worrying about long scrolls in the datafeed, the calculation is distributed across the cluster via the aggregations.

The restrictions for this support are as follows:

- The composite aggregation must have EXACTLY one `date_histogram` source
- The sub-aggs of the composite aggregation must have a `max` aggregation on the SAME timefield as the aforementioned `date_histogram` source
- The composite agg must be the ONLY top level agg and it cannot have a `composite` or `date_histogram` sub-agg
- If using a `date_histogram` to bucket time, it cannot have a `composite` sub-agg.
- The top-level `composite` agg cannot have a sibling pipeline agg. Pipeline aggregations are supported as a sub-agg (thus a pipeline agg INSIDE the bucket).

Some key user interaction differences:
- Speed + resources used by the cluster should be controlled by the `size` parameter in the `composite` aggregation. Previously, we said if you are using aggs, use a specific `chunking_config`. But, with composite, that is not necessary.
- Users really shouldn't use nested `terms` aggs any longer. While this is still a "valid" configuration and MAY be desirable for some users (only wanting the top 10 of certain terms), typically when users want influencers, partition fields, etc. they want the ENTIRE population. Previously, this really wasn't possible with aggs; with `composite` it is.
- I cannot really think of a typical use case that SHOULD ever use a multi-bucket aggregation that is NOT supported by composite.
2021-03-30 12:04:54 -04:00
István Zoltán Szabó
591e93397a
[DOCS] Removes beta labels from DFA related docs. (#70808) (#70902) 2021-03-26 10:25:36 +01:00