[role="xpack"]
[[search-aggregations-metrics-top-metrics]]
=== Top metrics aggregation
++++
<titleabbrev>Top metrics</titleabbrev>
++++

The `top_metrics` aggregation selects metrics from the document with the largest or smallest "sort"
value. For example, this gets the value of the `m` field on the document with the largest value of `s`:

[source,console,id=search-aggregations-metrics-top-metrics-simple]
----
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "m": 3.1415}
{"index": {}}
{"s": 2, "m": 1.0}
{"index": {}}
{"s": 3, "m": 2.71828}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "m"},
        "sort": {"s": "desc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [3], "metrics": {"m": 2.718280076980591 } } ]
    }
  }
}
----
// TESTRESPONSE

`top_metrics` is fairly similar to <<search-aggregations-metrics-top-hits-aggregation, `top_hits`>>
in spirit but because it is more limited it is able to do its job using less memory and is often
faster.

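The response above reports `m` as `2.718280076980591` even though the indexed value was `2.71828`.
A likely explanation, assuming the field was dynamically mapped as a single-precision `float`, is
that the value is stored in 32 bits and widened to a 64-bit double when it is returned. The Python
sketch below reproduces that rounding. `to_float32` is a hypothetical helper written for this
illustration and is not part of any Elasticsearch client.

```python
import struct


def to_float32(value: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float, then widen it back."""
    # Packing with the "f" format rounds to single precision;
    # unpacking returns an ordinary (64-bit) Python float.
    return struct.unpack("<f", struct.pack("<f", value))[0]


if __name__ == "__main__":
    # Values indexed in the example above.
    for indexed in (3.1415, 1.0, 2.71828):
        print(indexed, "->", to_float32(indexed))
```

Values that are exactly representable in 32 bits, such as `1.0`, come back unchanged.
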
==== `sort`

The `sort` field in the metric request functions exactly the same as the `sort` field in the
<<sort-search-results, search>> request except:

* It can't be used on <<binary,binary>>, <<flattened,flattened>>, <<ip,ip>>,
<<keyword,keyword>>, or <<text,text>> fields.
* It only supports a single sort value, so which document wins ties is not specified.

The metrics that the aggregation returns come from the first hit that would be returned by the
search request. So,

`"sort": {"s": "desc"}`:: gets metrics from the document with the highest `s`
`"sort": {"s": "asc"}`:: gets metrics from the document with the lowest `s`
`"sort": {"_geo_distance": {"location": "POINT (-78.6382 35.7796)"}}`::
  gets metrics from the document with `location` *closest* to `35.7796, -78.6382`
`"sort": "_score"`:: gets metrics from the document with the highest score

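The selection rule described above can be modelled on a plain list of documents. This is only a
sketch of the semantics for the default `size` of 1, not how Elasticsearch implements the
aggregation. The function name `top_metric` and its parameters are hypothetical, and skipping
documents that lack the sort field is an assumption made to keep the example small.

```python
from typing import Any, Dict, List


def top_metric(docs: List[Dict[str, Any]], metric_fields: List[str],
               sort_field: str, order: str = "desc") -> Dict[str, Any]:
    """Model `top_metrics` (size 1) over in-memory documents."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    # Assumption of this sketch: documents without the sort field do not compete.
    candidates = [d for d in docs if sort_field in d]
    if not candidates:
        return {"top": []}
    pick = max if order == "desc" else min
    # Ties resolve by input order here; the real aggregation leaves ties unspecified.
    winner = pick(candidates, key=lambda d: d[sort_field])
    return {"top": [{"sort": [winner[sort_field]],
                     "metrics": {f: winner.get(f) for f in metric_fields}}]}


if __name__ == "__main__":
    docs = [{"s": 1, "m": 3.1415}, {"s": 2, "m": 1.0}, {"s": 3, "m": 2.71828}]
    print(top_metric(docs, ["m"], "s", "desc"))
    print(top_metric(docs, ["m"], "s", "asc"))
```
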
==== `metrics`

`metrics` selects the fields of the "top" document to return. You can request
a single metric with something like `"metrics": {"field": "m"}` or multiple
metrics by requesting a list of metrics like `"metrics": [{"field": "m"}, {"field": "i"}]`.

`metrics.field` supports the following field types:

* <<boolean,`boolean`>>
* <<ip,`ip`>>
* <<keyword,keywords>>
* <<number,numbers>>

Except for keywords, <<runtime,runtime fields>> for corresponding types are also
supported. `metrics.field` doesn't support fields with <<array,array values>>. A
`top_metrics` aggregation on array values may return inconsistent results.

The following example runs a `top_metrics` aggregation on several field types.

[source,console,id=search-aggregations-metrics-top-metrics-list-of-metrics]
----
PUT /test
{
  "mappings": {
    "properties": {
      "d": {"type": "date"}
    }
  }
}
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "m": 3.1415, "i": 1, "d": "2020-01-01T00:12:12Z", "t": "cat"}
{"index": {}}
{"s": 2, "m": 1.0, "i": 6, "d": "2020-01-02T00:12:12Z", "t": "dog"}
{"index": {}}
{"s": 3, "m": 2.71828, "i": -12, "d": "2019-12-31T00:12:12Z", "t": "chicken"}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": [
          {"field": "m"},
          {"field": "i"},
          {"field": "d"},
          {"field": "t.keyword"}
        ],
        "sort": {"s": "desc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {
        "sort": [3],
        "metrics": {
          "m": 2.718280076980591,
          "i": -12,
          "d": "2019-12-31T00:12:12.000Z",
          "t.keyword": "chicken"
        }
      } ]
    }
  }
}
----
// TESTRESPONSE

==== `size`

`top_metrics` can return the top few documents' worth of metrics using the `size` parameter:

[source,console,id=search-aggregations-metrics-top-metrics-size]
----
POST /test/_bulk?refresh
{"index": {}}
{"s": 1, "m": 3.1415}
{"index": {}}
{"s": 2, "m": 1.0}
{"index": {}}
{"s": 3, "m": 2.71828}
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "m"},
        "sort": {"s": "desc"},
        "size": 3
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [
        {"sort": [3], "metrics": {"m": 2.718280076980591 } },
        {"sort": [2], "metrics": {"m": 1.0 } },
        {"sort": [1], "metrics": {"m": 3.1414999961853027 } }
      ]
    }
  }
}
----
// TESTRESPONSE

The default `size` is 1. The default maximum `size` is `10` because the aggregation's
working storage is "dense", meaning we allocate `size` slots for every bucket. `10`
is a *very* conservative default maximum and you can raise it if you need to by
changing the `top_metrics_max_size` index setting. But know that large sizes can
take a fair bit of memory, especially if they are inside of an aggregation which
makes many buckets like a large
<<search-aggregations-metrics-top-metrics-example-terms, terms aggregation>>. If
you still want to raise it, use something like:

[source,console]
----
PUT /test/_settings
{
  "top_metrics_max_size": 100
}
----
// TEST[continued]

NOTE: If `size` is more than `1` the `top_metrics` aggregation can't be the *target* of a sort.

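To get a feel for why a large `size` matters under a dense layout, the slot count can be worked
out directly: `size` slots for every bucket. The sketch below counts slots only. The text above
does not say how many bytes a slot takes, so no memory figure is computed. `dense_slots` and its
parameters are hypothetical names used for illustration.

```python
def dense_slots(bucket_count: int, size: int, max_size: int = 10) -> int:
    """Return how many slots a dense layout reserves: `size` for every bucket."""
    if bucket_count < 0:
        raise ValueError("bucket_count must not be negative")
    if size < 1:
        raise ValueError("size must be at least 1")
    # Mirrors the documented limit: a size above the configured maximum is rejected.
    if size > max_size:
        raise ValueError(f"size {size} exceeds the configured maximum {max_size}")
    return bucket_count * size


if __name__ == "__main__":
    # A terms aggregation with 10,000 buckets at the default maximum size.
    print(dense_slots(10_000, 10))
    # The same aggregation after raising the maximum to 100.
    print(dense_slots(10_000, 100, max_size=100))
```
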
==== Examples

[[search-aggregations-metrics-top-metrics-example-terms]]
===== Use with terms

This aggregation should be quite useful inside of a <<search-aggregations-bucket-terms-aggregation, `terms`>>
aggregation, to, say, find the last value reported by each server.

[source,console,id=search-aggregations-metrics-top-metrics-terms]
----
PUT /node
{
  "mappings": {
    "properties": {
      "ip": {"type": "ip"},
      "date": {"type": "date"}
    }
  }
}
POST /node/_bulk?refresh
{"index": {}}
{"ip": "192.168.0.1", "date": "2020-01-01T01:01:01", "m": 1}
{"index": {}}
{"ip": "192.168.0.1", "date": "2020-01-01T02:01:01", "m": 2}
{"index": {}}
{"ip": "192.168.0.2", "date": "2020-01-01T02:01:01", "m": 3}
POST /node/_search?filter_path=aggregations
{
  "aggs": {
    "ip": {
      "terms": {
        "field": "ip"
      },
      "aggs": {
        "tm": {
          "top_metrics": {
            "metrics": {"field": "m"},
            "sort": {"date": "desc"}
          }
        }
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "ip": {
      "buckets": [
        {
          "key": "192.168.0.1",
          "doc_count": 2,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"m": 2 } } ]
          }
        },
        {
          "key": "192.168.0.2",
          "doc_count": 1,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"m": 3 } } ]
          }
        }
      ],
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0
    }
  }
}
----
// TESTRESPONSE

Unlike `top_hits`, you can sort buckets by the results of this metric:

[source,console]
----
POST /node/_search?filter_path=aggregations
{
  "aggs": {
    "ip": {
      "terms": {
        "field": "ip",
        "order": {"tm.m": "desc"}
      },
      "aggs": {
        "tm": {
          "top_metrics": {
            "metrics": {"field": "m"},
            "sort": {"date": "desc"}
          }
        }
      }
    }
  }
}
----
// TEST[continued]

Which returns:

[source,js]
----
{
  "aggregations": {
    "ip": {
      "buckets": [
        {
          "key": "192.168.0.2",
          "doc_count": 1,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"m": 3 } } ]
          }
        },
        {
          "key": "192.168.0.1",
          "doc_count": 2,
          "tm": {
            "top": [ {"sort": ["2020-01-01T02:01:01.000Z"], "metrics": {"m": 2 } } ]
          }
        }
      ],
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0
    }
  }
}
----
// TESTRESPONSE

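The combination of `terms`, `top_metrics`, and bucket ordering in this section can be mimicked
in plain Python to make the data flow explicit. This is a sketch of the semantics only, under the
assumption that all dates are ISO-8601 strings in the same format so that string comparison orders
them correctly. `latest_metric_per_key` is a hypothetical helper, not an Elasticsearch API.

```python
from collections import defaultdict
from typing import Any, Dict, List


def latest_metric_per_key(docs: List[Dict[str, Any]], key_field: str,
                          sort_field: str, metric_field: str) -> List[Dict[str, Any]]:
    """Group documents by key, keep the metric of the latest one, order buckets by it."""
    groups: Dict[Any, List[Dict[str, Any]]] = defaultdict(list)
    for doc in docs:
        groups[doc[key_field]].append(doc)

    buckets = []
    for key, members in groups.items():
        # Equivalent of "sort": {"date": "desc"} with the default size of 1.
        latest = max(members, key=lambda d: d[sort_field])
        buckets.append({
            "key": key,
            "doc_count": len(members),
            "tm": {"top": [{"sort": [latest[sort_field]],
                            "metrics": {metric_field: latest[metric_field]}}]},
        })

    # Equivalent of "order": {"tm.m": "desc"} on the terms aggregation.
    buckets.sort(key=lambda b: b["tm"]["top"][0]["metrics"][metric_field], reverse=True)
    return buckets


if __name__ == "__main__":
    docs = [
        {"ip": "192.168.0.1", "date": "2020-01-01T01:01:01", "m": 1},
        {"ip": "192.168.0.1", "date": "2020-01-01T02:01:01", "m": 2},
        {"ip": "192.168.0.2", "date": "2020-01-01T02:01:01", "m": 3},
    ]
    for bucket in latest_metric_per_key(docs, "ip", "date", "m"):
        print(bucket)
```
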
===== Mixed sort types

Sorting `top_metrics` by a field that has different types across different
indices produces somewhat surprising results: floating point fields are
always sorted independently of whole numbered fields.

[source,console,id=search-aggregations-metrics-top-metrics-mixed-sort]
----
POST /test/_bulk?refresh
{"index": {"_index": "test1"}}
{"s": 1, "m": 3.1415}
{"index": {"_index": "test1"}}
{"s": 2, "m": 1}
{"index": {"_index": "test2"}}
{"s": 3.1, "m": 2.71828}
POST /test*/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "m"},
        "sort": {"s": "asc"}
      }
    }
  }
}
----

Which returns:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [3.0999999046325684], "metrics": {"m": 2.718280076980591 } } ]
    }
  }
}
----
// TESTRESPONSE

While this is better than an error it *probably* isn't what you were going for.
While it does lose some precision, you can explicitly cast the whole number
fields to floating points with something like:

[source,console]
----
POST /test*/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": {"field": "m"},
        "sort": {"s": {"order": "asc", "numeric_type": "double"}}
      }
    }
  }
}
----
// TEST[continued]

Which returns the much more expected:

[source,js]
----
{
  "aggregations": {
    "tm": {
      "top": [ {"sort": [1.0], "metrics": {"m": 3.1414999961853027 } } ]
    }
  }
}
----
// TESTRESPONSE
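
The precision loss mentioned above comes from the `double` format itself: a 64-bit double has a
53-bit significand, so not every large whole number survives the cast. The sketch below shows
the effect with Python's `float`, which is a 64-bit double. `survives_double_cast` is a
hypothetical helper used only for illustration.

```python
def survives_double_cast(whole_number: int) -> bool:
    """Return True if casting to a 64-bit double and back preserves the value."""
    return int(float(whole_number)) == whole_number


if __name__ == "__main__":
    limit = 2 ** 53  # every whole number up to this magnitude is exactly representable
    for value in (3, limit, limit + 1):
        print(value, survives_double_cast(value))
```

Sort values of ordinary magnitude, like the `1` and `2` in the example, are unaffected. Only whole
numbers beyond 2^53 in magnitude can be rounded.
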