[[search-aggregations-pipeline-bucket-sort-aggregation]]
=== Bucket sort aggregation
++++
<titleabbrev>Bucket sort</titleabbrev>
++++

A parent pipeline aggregation which sorts the buckets of its parent multi-bucket aggregation.
Zero or more sort fields may be specified together with the corresponding sort order.
Each bucket may be sorted based on its `_key`, `_count` or its sub-aggregations.
In addition, parameters `from` and `size` may be set in order to truncate the result buckets.

NOTE: The `bucket_sort` aggregation, like all pipeline aggregations, is executed after all other non-pipeline aggregations.
This means the sorting only applies to whatever buckets are already returned from the parent aggregation. For example,
if the parent aggregation is `terms` and its `size` is set to `10`, the `bucket_sort` will only sort over those 10
returned term buckets.

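The effect described in the note can be modelled outside Elasticsearch. The following Python sketch is an illustration only, not Elasticsearch code: the `parent_terms` and `bucket_sort` helpers and the sample buckets are hypothetical, and the parent is assumed to behave like a `terms` aggregation that keeps the `size` buckets with the highest `doc_count`.

```python
# Illustrative model only; these helpers are hypothetical, not Elasticsearch APIs.

def parent_terms(buckets, size):
    """Mimic a `terms` parent: keep the `size` buckets with the highest doc_count."""
    return sorted(buckets, key=lambda b: b["doc_count"], reverse=True)[:size]


def bucket_sort(buckets, sort_key, descending=False):
    """Mimic `bucket_sort`: reorder only the buckets it is handed."""
    return sorted(buckets, key=lambda b: b[sort_key], reverse=descending)


all_terms = [
    {"key": "a", "doc_count": 50, "total_sales": 10.0},
    {"key": "b", "doc_count": 40, "total_sales": 30.0},
    {"key": "c", "doc_count": 5, "total_sales": 900.0},  # highest sales, fewest docs
]

# The parent runs first and drops "c"; the pipeline step never sees it.
returned = parent_terms(all_terms, size=2)
result = bucket_sort(returned, "total_sales", descending=True)
result_keys = [b["key"] for b in result]
print(result_keys)
```

In this model, bucket `c` has the highest `total_sales` but is absent from the output, because the parent already discarded it before the sort ran.
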
==== Syntax

A `bucket_sort` aggregation looks like this in isolation:

[source,js]
--------------------------------------------------
{
  "bucket_sort": {
    "sort": [
      { "sort_field_1": { "order": "asc" } },   <1>
      { "sort_field_2": { "order": "desc" } },
      "sort_field_3"
    ],
    "from": 1,
    "size": 3
  }
}
--------------------------------------------------
// NOTCONSOLE

<1> Here, `sort_field_1` is the bucket path to the variable to be used as the primary sort and its order
is ascending.

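How several sort fields combine with `from` and `size` can be sketched in Python. This is a simplified model under stated assumptions, not the Elasticsearch implementation: `apply_bucket_sort` is a hypothetical helper, the bucket values are made up, and only the first two sort fields of the syntax example are modelled.

```python
# Illustrative model only; `apply_bucket_sort` is a hypothetical helper.

def apply_bucket_sort(buckets, sort, from_=0, size=None):
    """Sort buckets by (field, order) pairs, then keep `size` buckets starting at `from_`."""
    ordered = list(buckets)
    # Python's sort is stable, so sorting by the least significant field first
    # leaves the first entry in `sort` as the primary sort.
    for field, order in reversed(sort):
        ordered.sort(key=lambda b: b[field], reverse=(order == "desc"))
    end = None if size is None else from_ + size
    return ordered[from_:end]


buckets = [
    {"key": "w", "sort_field_1": 2, "sort_field_2": 5},
    {"key": "x", "sort_field_1": 1, "sort_field_2": 3},
    {"key": "y", "sort_field_1": 1, "sort_field_2": 9},
    {"key": "z", "sort_field_1": 3, "sort_field_2": 1},
]
sort = [("sort_field_1", "asc"), ("sort_field_2", "desc")]

# Full order is y, x, w, z; `from_=1` drops y and `size=3` keeps the next three.
page = apply_bucket_sort(buckets, sort, from_=1, size=3)
page_keys = [b["key"] for b in page]
print(page_keys)
```

Buckets `x` and `y` tie on `sort_field_1`, so the secondary field decides their order before truncation is applied.
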
[[bucket-sort-params]]
.`bucket_sort` Parameters
[options="header"]
|===
|Parameter Name |Description |Required |Default Value
|`sort` |The list of fields to sort on. See <<sort-search-results,`sort`>> for more details. |Optional |
|`from` |Buckets in positions prior to the set value will be truncated. |Optional | `0`
|`size` |The number of buckets to return. Defaults to all buckets of the parent aggregation. |Optional |
|`gap_policy` |The policy to apply when gaps are found in the data (see <<gap-policy>> for more details) |Optional |`skip`
|===

The following snippet returns the buckets corresponding to the 3 months with the highest total sales in descending order:

[source,console]
--------------------------------------------------
POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "total_sales": {
          "sum": {
            "field": "price"
          }
        },
        "sales_bucket_sort": {
          "bucket_sort": {
            "sort": [
              { "total_sales": { "order": "desc" } } <1>
            ],
            "size": 3                                <2>
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]

<1> `sort` is set to use the values of `total_sales` in descending order
<2> `size` is set to `3` meaning only the top 3 months in `total_sales` will be returned

And the following may be the response:

[source,console-result]
--------------------------------------------------
{
  "took": 82,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "sales_per_month": {
      "buckets": [
        {
          "key_as_string": "2015/01/01 00:00:00",
          "key": 1420070400000,
          "doc_count": 3,
          "total_sales": {
            "value": 550.0
          }
        },
        {
          "key_as_string": "2015/03/01 00:00:00",
          "key": 1425168000000,
          "doc_count": 2,
          "total_sales": {
            "value": 375.0
          }
        },
        {
          "key_as_string": "2015/02/01 00:00:00",
          "key": 1422748800000,
          "doc_count": 2,
          "total_sales": {
            "value": 60.0
          }
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 82/"took": $body.took/]
// TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
// TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]

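When the number of buckets to keep varies, the request body for the top-months search can be assembled in code. The following is a minimal sketch, assuming Python; `top_months_query` is a hypothetical helper, and how the body is sent to `/sales/_search` is left to whichever client is in use.

```python
import json


def top_months_query(top_n):
    """Build a search body that keeps the `top_n` months by total sales."""
    return {
        "size": 0,  # no search hits, aggregations only
        "aggs": {
            "sales_per_month": {
                "date_histogram": {"field": "date", "calendar_interval": "month"},
                "aggs": {
                    "total_sales": {"sum": {"field": "price"}},
                    # Sibling of `total_sales`, so it can sort on that metric.
                    "sales_bucket_sort": {
                        "bucket_sort": {
                            "sort": [{"total_sales": {"order": "desc"}}],
                            "size": top_n,
                        }
                    },
                },
            }
        },
    }


body = top_months_query(3)
print(json.dumps(body, indent=2))
```

Note that the `bucket_sort` aggregation sits next to `total_sales` inside the `date_histogram`, which is what makes it a parent pipeline aggregation of `sales_per_month`.
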
==== Truncating without sorting

It is also possible to use this aggregation in order to truncate the result buckets
without doing any sorting. To do so, just use the `from` and/or `size` parameters
without specifying `sort`.

The following example simply truncates the result so that only the second bucket is returned:

[source,console]
--------------------------------------------------
POST /sales/_search
{
  "size": 0,
  "aggs": {
    "sales_per_month": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "bucket_truncate": {
          "bucket_sort": {
            "from": 1,
            "size": 1
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]

Response:

[source,console-result]
--------------------------------------------------
{
  "took": 11,
  "timed_out": false,
  "_shards": ...,
  "hits": ...,
  "aggregations": {
    "sales_per_month": {
      "buckets": [
        {
          "key_as_string": "2015/02/01 00:00:00",
          "key": 1422748800000,
          "doc_count": 2
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 11/"took": $body.took/]
// TESTRESPONSE[s/"_shards": \.\.\./"_shards": $body._shards/]
// TESTRESPONSE[s/"hits": \.\.\./"hits": $body.hits/]
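
Without `sort`, the truncation amounts to taking a slice of the parent's buckets in their existing order. A small Python sketch of that reading, for illustration only: the list below is assumed to hold the three monthly buckets of the sample data in the `date_histogram`'s ascending key order.

```python
# Illustrative model only: the parent's buckets, in the order the parent returns them.
monthly = [
    {"key_as_string": "2015/01/01 00:00:00", "doc_count": 3},
    {"key_as_string": "2015/02/01 00:00:00", "doc_count": 2},
    {"key_as_string": "2015/03/01 00:00:00", "doc_count": 2},
]

from_, size = 1, 1  # skip the first bucket, then keep one

# No reordering happens; `from` and `size` select a window over the existing order.
truncated = monthly[from_:from_ + size]
truncated_keys = [b["key_as_string"] for b in truncated]
print(truncated_keys)
```

With `from` set to `1` and `size` set to `1`, the window covers only the February bucket, matching the response above.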