[role="xpack"]
[testenv="basic"]
[[search-aggregations-metrics-rate-aggregation]]
=== Rate aggregation
++++
<titleabbrev>Rate</titleabbrev>
++++

A `rate` metrics aggregation can be used only inside a `date_histogram` and calculates a rate of documents or a field in each
`date_histogram` bucket. The field values can be extracted from specific numeric or
<<histogram,histogram fields>> in the documents.

==== Syntax

A `rate` aggregation looks like this in isolation:

[source,js]
--------------------------------------------------
{
  "rate": {
    "unit": "month",
    "field": "requests"
  }
}
--------------------------------------------------
// NOTCONSOLE

The following request groups all sales records into monthly buckets and then converts the number of sales transactions in each bucket
into an annual sales rate.

[source,console]
--------------------------------------------------
GET sales/_search
{
  "size": 0,
  "aggs": {
    "by_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"  <1>
      },
      "aggs": {
        "my_rate": {
          "rate": {
            "unit": "year"  <2>
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]
<1> Histogram is grouped by month.
<2> But the rate is converted into an annual rate.

The response will return the annual rate of transactions in each bucket. Since there are 12 months per year, the annual rate is
automatically calculated by multiplying the monthly rate by 12.

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations" : {
    "by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2015/01/01 00:00:00",
          "key" : 1420070400000,
          "doc_count" : 3,
          "my_rate" : {
            "value" : 36.0
          }
        },
        {
          "key_as_string" : "2015/02/01 00:00:00",
          "key" : 1422748800000,
          "doc_count" : 2,
          "my_rate" : {
            "value" : 24.0
          }
        },
        {
          "key_as_string" : "2015/03/01 00:00:00",
          "key" : 1425168000000,
          "doc_count" : 2,
          "my_rate" : {
            "value" : 24.0
          }
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
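
The values in this response can be checked by hand. The following Python sketch is not part of Elasticsearch, and the
function name `annual_rate` is hypothetical. It only reproduces the arithmetic described above (monthly count multiplied by 12),
using the `doc_count` values from the response.

[source,python]
----
MONTHS_PER_YEAR = 12


def annual_rate(monthly_doc_count):
    """Scale a per-month document count to a per-year rate (hypothetical helper)."""
    return float(monthly_doc_count * MONTHS_PER_YEAR)


# doc_count values copied from the example response
doc_counts = {"2015/01": 3, "2015/02": 2, "2015/03": 2}

# Annualized rate for each monthly bucket
rates = {month: annual_rate(count) for month, count in doc_counts.items()}
print(rates)
----

Each result should match the `my_rate` value reported for the corresponding bucket.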

Instead of counting the number of documents, it is also possible to calculate the sum of all values of a field in the documents in each
bucket, or the number of values in each bucket. The following request groups all sales records into monthly buckets, calculates
the total monthly sales, and converts them into average daily sales.

[source,console]
--------------------------------------------------
GET sales/_search
{
  "size": 0,
  "aggs": {
    "by_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"  <1>
      },
      "aggs": {
        "avg_price": {
          "rate": {
            "field": "price", <2>
            "unit": "day"  <3>
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]
<1> Histogram is grouped by month.
<2> Calculate the sum of all sale prices.
<3> Convert to average daily sales.

The response will contain the average daily sale prices for each month.

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations" : {
    "by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2015/01/01 00:00:00",
          "key" : 1420070400000,
          "doc_count" : 3,
          "avg_price" : {
            "value" : 17.741935483870968
          }
        },
        {
          "key_as_string" : "2015/02/01 00:00:00",
          "key" : 1422748800000,
          "doc_count" : 2,
          "avg_price" : {
            "value" : 2.142857142857143
          }
        },
        {
          "key_as_string" : "2015/03/01 00:00:00",
          "key" : 1425168000000,
          "doc_count" : 2,
          "avg_price" : {
            "value" : 12.096774193548388
          }
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
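
The daily values in this response appear consistent with dividing each monthly sum by the number of days in that calendar month.
The following Python sketch illustrates that arithmetic under stated assumptions: the helper `daily_rate` is hypothetical, and
the monthly totals (550, 60 and 375) are inferred by multiplying the response values by 31, 28 and 31 days. They are not stated
in the text.

[source,python]
----
import calendar


def daily_rate(monthly_sum, year, month):
    """Divide a monthly total by the number of days in that month (hypothetical helper)."""
    days_in_month = calendar.monthrange(year, month)[1]
    return monthly_sum / days_in_month


# Monthly price totals inferred from the example response (assumption)
monthly_sums = {(2015, 1): 550.0, (2015, 2): 60.0, (2015, 3): 375.0}

# Average daily sales for each monthly bucket
daily = {key: daily_rate(total, *key) for key, total in monthly_sums.items()}
for (year, month), value in daily.items():
    print(f"{year}/{month:02d}: {value}")
----

Because February 2015 has 28 days while January and March have 31, the same monthly total would produce a different daily
rate depending on the bucket it falls in.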

By adding the `mode` parameter with the value `value_count`, we can change the calculation from `sum` to the number of values of the field:

[source,console]
--------------------------------------------------
GET sales/_search
{
  "size": 0,
  "aggs": {
    "by_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"  <1>
      },
      "aggs": {
        "avg_number_of_sales_per_year": {
          "rate": {
            "field": "price", <2>
            "unit": "year",  <3>
            "mode": "value_count" <4>
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[setup:sales]
<1> Histogram is grouped by month.
<2> Calculate the number of all sale prices.
<3> Convert to annual counts.
<4> Change the mode to value count.

The response will contain the annualized number of `price` values for each month.

[source,console-result]
--------------------------------------------------
{
  ...
  "aggregations" : {
    "by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2015/01/01 00:00:00",
          "key" : 1420070400000,
          "doc_count" : 3,
          "avg_number_of_sales_per_year" : {
            "value" : 36.0
          }
        },
        {
          "key_as_string" : "2015/02/01 00:00:00",
          "key" : 1422748800000,
          "doc_count" : 2,
          "avg_number_of_sales_per_year" : {
            "value" : 24.0
          }
        },
        {
          "key_as_string" : "2015/03/01 00:00:00",
          "key" : 1425168000000,
          "doc_count" : 2,
          "avg_number_of_sales_per_year" : {
            "value" : 24.0
          }
        }
      ]
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]

By default `sum` mode is used.

`"mode": "sum"`:: calculate the sum of all values of the field
`"mode": "value_count"`:: use the number of values in the field
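
The difference between the two modes can be sketched in a few lines of Python. This is an illustration only, not the
Elasticsearch implementation: the function `bucket_rate`, its `buckets_per_unit` parameter, and the three prices are
hypothetical, chosen so that the bucket holds three values like the January bucket in the examples.

[source,python]
----
def bucket_rate(values, mode="sum", buckets_per_unit=12):
    """Rate for one bucket: sum or count of the values, scaled to the target unit.

    buckets_per_unit is 12 when a monthly bucket is converted to a yearly rate.
    """
    if mode == "sum":
        base = sum(values)
    elif mode == "value_count":
        base = len(values)
    else:
        raise ValueError(f"unsupported mode: {mode}")
    return float(base * buckets_per_unit)


# Hypothetical prices for a single monthly bucket
prices = [200.0, 200.0, 150.0]

print(bucket_rate(prices))                      # default mode is "sum"
print(bucket_rate(prices, mode="value_count"))  # counts the values instead
----

With three values in a monthly bucket, `value_count` mode and a yearly unit give 3 × 12 = 36, which matches the first
bucket of the `value_count` response.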

==== Relationship between bucket sizes and rate

The `rate` aggregation supports all rates that can be used with the <<calendar_intervals,calendar_intervals parameter>> of the
`date_histogram` aggregation. The specified rate should be compatible with the `date_histogram` aggregation interval, i.e. it should
be possible to convert the bucket size into the rate. By default the interval of the `date_histogram` is used.

`"unit": "second"`:: compatible with all intervals
`"unit": "minute"`:: compatible with all intervals
`"unit": "hour"`:: compatible with all intervals
`"unit": "day"`:: compatible with all intervals
`"unit": "week"`:: compatible with all intervals
`"unit": "month"`:: compatible only with `month`, `quarter` and `year` calendar intervals
`"unit": "quarter"`:: compatible only with `month`, `quarter` and `year` calendar intervals
`"unit": "year"`:: compatible only with `month`, `quarter` and `year` calendar intervals

There is an additional limitation if the date histogram is not a direct parent of the rate aggregation. In this case both the rate
interval and the histogram interval have to be in the same group: [`second`, `minute`, `hour`, `day`, `week`] or [`month`, `quarter`,
`year`]. For example, if the date histogram is `month` based, only rate intervals of `month`, `quarter` or `year` are supported. If
the date histogram is `day` based, only `second`, `minute`, `hour`, `day`, and `week` rate intervals are supported.
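
The compatibility rules in this section can be summarized as a small check. The Python sketch below is an illustration of the
rules as described here, not code from Elasticsearch; the names `FIXED_UNITS`, `CALENDAR_UNITS`, `is_compatible` and the
`direct_parent` flag are all hypothetical.

[source,python]
----
# The two groups of intervals named in the text
FIXED_UNITS = {"second", "minute", "hour", "day", "week"}
CALENDAR_UNITS = {"month", "quarter", "year"}


def is_compatible(unit, interval, direct_parent=True):
    """Return True if a rate unit can be used with a date_histogram interval.

    direct_parent says whether the date_histogram is the direct parent
    of the rate aggregation.
    """
    known = FIXED_UNITS | CALENDAR_UNITS
    if unit not in known or interval not in known:
        raise ValueError("unknown unit or interval")
    if direct_parent:
        # second..week work with every interval; month, quarter and year
        # need a month, quarter or year histogram interval.
        return unit in FIXED_UNITS or interval in CALENDAR_UNITS
    # Otherwise both must come from the same group.
    return (unit in FIXED_UNITS) == (interval in FIXED_UNITS)


print(is_compatible("day", "month"))                       # direct parent
print(is_compatible("day", "month", direct_parent=False))  # different groups
print(is_compatible("year", "day"))                        # cannot convert
----

Under these rules a `day` rate inside a `month` histogram is accepted only when the histogram is the direct parent, and a
`year` rate is never accepted inside a `day` histogram.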

==== Script

If you need to run the aggregation against values that aren't indexed, run the
aggregation on a <<runtime,runtime field>>. For example, if we need to adjust
our prices before calculating rates:

[source,console]
----
GET sales/_search
{
  "size": 0,
  "runtime_mappings": {
    "price.adjusted": {
      "type": "double",
      "script": {
        "source": "emit(doc['price'].value * params.adjustment)",
        "params": {
          "adjustment": 0.9
        }
      }
    }
  },
  "aggs": {
    "by_date": {
      "date_histogram": {
        "field": "date",
        "calendar_interval": "month"
      },
      "aggs": {
        "avg_price": {
          "rate": {
            "field": "price.adjusted"
          }
        }
      }
    }
  }
}
----
// TEST[setup:sales]

[source,console-result]
----
{
  ...
  "aggregations" : {
    "by_date" : {
      "buckets" : [
        {
          "key_as_string" : "2015/01/01 00:00:00",
          "key" : 1420070400000,
          "doc_count" : 3,
          "avg_price" : {
            "value" : 495.0
          }
        },
        {
          "key_as_string" : "2015/02/01 00:00:00",
          "key" : 1422748800000,
          "doc_count" : 2,
          "avg_price" : {
            "value" : 54.0
          }
        },
        {
          "key_as_string" : "2015/03/01 00:00:00",
          "key" : 1425168000000,
          "doc_count" : 2,
          "avg_price" : {
            "value" : 337.5
          }
        }
      ]
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": $body.took,"timed_out": false,"_shards": $body._shards,"hits": $body.hits,/]
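
Because this request sets no `unit`, the rate falls back to the interval of the `date_histogram` (`month`), so each value is
the monthly sum of the adjusted prices. The Python sketch below reproduces that arithmetic under stated assumptions: the
helper `adjusted_monthly_rate` is hypothetical, and the unadjusted monthly totals (550, 60 and 375) are inferred by dividing
the response values by the `adjustment` of 0.9. They are not stated in the text.

[source,python]
----
ADJUSTMENT = 0.9  # same value as params.adjustment in the request


def adjusted_monthly_rate(monthly_sum, adjustment=ADJUSTMENT):
    """Monthly rate of the adjusted prices (hypothetical helper).

    Multiplying every price by a constant scales the monthly sum by the same constant.
    """
    return monthly_sum * adjustment


# Unadjusted monthly price totals inferred from the response (assumption)
unadjusted = {"2015/01": 550.0, "2015/02": 60.0, "2015/03": 375.0}

adjusted = {month: adjusted_monthly_rate(total) for month, total in unadjusted.items()}
print(adjusted)
----

The printed values should agree with the `avg_price` values in the response up to floating point rounding.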