elasticsearch/docs/reference/aggregations/bucket
Jim Ferenczi 5288235ca3
Optimize the composite aggregation for match_all and range queries (#28745)
This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents, since it can terminate
the collection early when the leading source value is greater than the lowest value in the queue.
Instead of collecting documents in doc_id order, this mode uses the inverted lists (or the BKD tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance, the following aggregation:

```
"composite" : {
  "sources" : [
    { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
  ],
  "size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For a composite aggregation with more than one source, the execution can terminate early as soon as the lowest leading values have produced
enough composite buckets. For instance, if visiting the two lowest timestamps created 10 composite buckets, we can terminate the collection early, since it
is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the ones already visited.
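
The early termination described above can be sketched in a few lines of Python. This is only an illustration, not the code added by this change: the function name `composite_top_keys` is hypothetical, documents are modeled as `(leading, secondary)` tuples, and sorting an in-memory list stands in for walking the inverted lists or the BKD tree in value order.

```python
from itertools import groupby


def composite_top_keys(docs, size):
    """Return (the `size` lowest composite keys, number of docs visited).

    `docs` is an iterable of (leading, secondary) tuples. This is a
    hypothetical model of the index, not an Elasticsearch API.
    """
    # Stand-in for iterating the index in the order of the leading source.
    ordered = sorted(docs, key=lambda d: d[0])
    keys = set()
    visited = 0
    for _leading, group in groupby(ordered, key=lambda d: d[0]):
        # Every key built from a larger leading value compares higher than
        # the keys already collected, so a full set means we can stop.
        if len(keys) >= size:
            break
        for doc in group:
            visited += 1
            keys.add(doc)  # the composite key is the (leading, secondary) pair
    return sorted(keys)[:size], visited


docs = [(3, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "b"), (3, "b"), (1, "a")]
top, visited = composite_top_keys(docs, size=3)
```

With `size=3`, the two lowest leading values already yield four distinct keys, so the two documents whose leading value is 3 are never visited.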

This mode can execute iff:
 * The leading source in the composite definition uses an indexed field of type `date` (this also works with a `date_histogram` source), `integer`, `long` or `keyword`.
 * The query is a `match_all` query or a `range` query over the field that is used as the leading source in the composite definition.
 * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only).

If these conditions are not met, the aggregation visits every document, like any other aggregation.
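
As a rough illustration of the three conditions above, the check could look like the following Python sketch. The function `can_use_sorted_mode` and its parameters are hypothetical and exist only for this example; the query is modeled as a plain dict in the shape of the query DSL.

```python
# Field types listed in the conditions above.
SUPPORTED_TYPES = {"date", "integer", "long", "keyword"}


def can_use_sorted_mode(field, field_type, indexed, order, query):
    """Hypothetical eligibility check for the sorted execution mode."""
    # Condition 1: the leading source uses an indexed field of a supported type.
    if not indexed or field_type not in SUPPORTED_TYPES:
        return False
    # Condition 3: only the natural (ascending) order is supported.
    if order != "asc":
        return False
    # Condition 2: match_all, or a range query over the leading field only.
    if "match_all" in query:
        return True
    range_query = query.get("range")
    return range_query is not None and set(range_query) == {field}


eligible = can_use_sorted_mode(
    "timestamp", "date", True, "asc",
    {"range": {"timestamp": {"gte": "2018-01-01"}}},
)
```

When the check fails, the aggregation would fall back to visiting every document.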
2018-03-26 09:51:37 +02:00
..
adjacency-matrix-aggregation.asciidoc Allow _doc as a type. (#27816) 2017-12-14 17:47:53 +01:00
children-aggregation.asciidoc Allow _doc as a type. (#27816) 2017-12-14 17:47:53 +01:00
composite-aggregation.asciidoc Optimize the composite aggregation for match_all and range queries (#28745) 2018-03-26 09:51:37 +02:00
datehistogram-aggregation.asciidoc Allow _doc as a type. (#27816) 2017-12-14 17:47:53 +01:00
daterange-aggregation.asciidoc Document and test date_range "missing" support (#28983) 2018-03-13 12:58:30 -07:00
diversified-sampler-aggregation.asciidoc Enforce that responses in docs are valid json (#26249) 2017-08-17 09:02:10 -04:00
filter-aggregation.asciidoc Update filter-aggregation.asciidoc (#24138) 2017-04-17 18:46:13 -04:00
filters-aggregation.asciidoc Allow _doc as a type. (#27816) 2017-12-14 17:47:53 +01:00
geodistance-aggregation.asciidoc Update aggs reference documentation for 'keyed' options (#23758) 2017-04-18 15:57:50 +02:00
geohashgrid-aggregation.asciidoc Support distance units in GeoHashGrid aggregation precision (#26291) 2017-08-21 17:29:28 +02:00
global-aggregation.asciidoc CONSOLE-ify global-aggregation.asciidoc 2017-01-20 14:36:51 -05:00
histogram-aggregation.asciidoc [DOC] Fix mathematical representation on interval (range) (#27450) 2017-11-21 17:06:26 +00:00
iprange-aggregation.asciidoc Allow _doc as a type. (#27816) 2017-12-14 17:47:53 +01:00
missing-aggregation.asciidoc CONSOLEify some more aggregation docs 2017-05-16 17:25:24 -04:00
nested-aggregation.asciidoc fixing typo in nested-aggregation.asciidoc (#26481) 2017-09-04 06:42:44 +02:00
range-aggregation.asciidoc [Docs] Convert remaining code snippets in docs (#26422) 2017-08-30 12:11:10 +02:00
reverse-nested-aggregation.asciidoc [Doc] Fixs typo in reverse-nested-aggregation.asciidoc (#28348) 2018-01-24 17:54:02 +01:00
sampler-aggregation.asciidoc Update experimental labels in the docs (#25727) 2017-07-18 14:06:22 +02:00
significantterms-aggregation.asciidoc Add a usage example of the JLH score (#28905) 2018-03-06 15:37:18 +01:00
significanttext-aggregation.asciidoc [Docs] Add note on limitation for significant_text with nested objects (#28052) 2018-01-03 16:28:23 +01:00
terms-aggregation.asciidoc Add defined ID to terms agg size header 2018-02-02 13:43:20 +01:00