elasticsearch/docs/reference/aggregations.asciidoc

[[search-aggregations]]
= Aggregations

[partintro]
--
An aggregation summarizes your data as metrics, statistics, or other analytics.
Aggregations help you answer questions like:

* What's the average load time for my website?
* Who are my most valuable customers based on transaction volume?
* What would be considered a large file on my network?
* How many products are in each product category?

{es} organizes aggregations into three categories:

* <<search-aggregations-metrics,Metric>> aggregations that calculate metrics,
such as a sum or average, from field values.

* <<search-aggregations-bucket,Bucket>> aggregations that
group documents into buckets, also called bins, based on field values, ranges,
or other criteria.

* <<search-aggregations-pipeline,Pipeline>> aggregations that take input from
other aggregations instead of documents or fields.

[discrete]
[[run-an-agg]]
=== Run an aggregation

You can run aggregations as part of a <<search-your-data,search>> by specifying the <<search-search,search API>>'s `aggs` parameter. The
following search runs a
<<search-aggregations-bucket-terms-aggregation,terms aggregation>> on
`my-field`:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

Aggregation results are in the response's `aggregations` object:

[source,console-result]
----
{
  "took": 78,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 5,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [...]
  },
  "aggregations": {
    "my-agg-name": {                           <1>
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/"took": 78/"took": "$body.took"/]
// TESTRESPONSE[s/\.\.\.$/"took": "$body.took", "timed_out": false, "_shards": "$body._shards", /]
// TESTRESPONSE[s/"hits": \[\.\.\.\]/"hits": "$body.hits.hits"/]
// TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":"get","doc_count":5\}\]/]

<1> Results for the `my-agg-name` aggregation.

[discrete]
[[change-agg-scope]]
=== Change an aggregation's scope

Use the `query` parameter to limit the documents on which an aggregation runs:

[source,console]
----
GET /my-index-000001/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-1d/d",
        "lt": "now/d"
      }
    }
  },
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

[discrete]
[[return-only-agg-results]]
=== Return only aggregation results

By default, searches containing an aggregation return both search hits and
aggregation results. To return only aggregation results, set `size` to `0`:

[source,console]
----
GET /my-index-000001/_search
{
  "size": 0,
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]

[discrete]
[[run-multiple-aggs]]
=== Run multiple aggregations

You can specify multiple aggregations in the same request:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-first-agg-name": {
      "terms": {
        "field": "my-field"
      }
    },
    "my-second-agg-name": {
      "avg": {
        "field": "my-other-field"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/my-field/http.request.method/]
// TEST[s/my-other-field/http.response.bytes/]

[discrete]
[[run-sub-aggs]]
=== Run sub-aggregations

Bucket aggregations support bucket or metric sub-aggregations. For example, a
terms aggregation with an <<search-aggregations-metrics-avg-aggregation,avg>>
sub-aggregation calculates an average value for each bucket of documents. There
is no level or depth limit for nesting sub-aggregations.

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      },
      "aggs": {
        "my-sub-agg-name": {
          "avg": {
            "field": "my-other-field"
          }
        }
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/_search/_search?size=0/]
// TEST[s/my-field/http.request.method/]
// TEST[s/my-other-field/http.response.bytes/]

The response nests sub-aggregation results under their parent aggregation:

[source,console-result]
----
{
  ...
  "aggregations": {
    "my-agg-name": {                           <1>
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "foo",
          "doc_count": 5,
          "my-sub-agg-name": {                 <2>
            "value": 75.0
          }
        }
      ]
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
// TESTRESPONSE[s/"key": "foo"/"key": "get"/]
// TESTRESPONSE[s/"value": 75.0/"value": $body.aggregations.my-agg-name.buckets.0.my-sub-agg-name.value/]

<1> Results for the parent aggregation, `my-agg-name`.
<2> Results for `my-agg-name`'s sub-aggregation, `my-sub-agg-name`.

[discrete]
[[add-metadata-to-an-agg]]
=== Add custom metadata

Use the `meta` object to associate custom metadata with an aggregation:

[source,console]
----
GET /my-index-000001/_search
{
  "aggs": {
    "my-agg-name": {
      "terms": {
        "field": "my-field"
      },
      "meta": {
        "my-metadata-field": "foo"
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/_search/_search?size=0/]

The response returns the `meta` object in place:

[source,console-result]
----
{
  ...
  "aggregations": {
    "my-agg-name": {
      "meta": {
        "my-metadata-field": "foo"
      },
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]

[discrete]
[[return-agg-type]]
=== Return the aggregation type

By default, aggregation results include the aggregation's name but not its type.
To return the aggregation type, use the `typed_keys` query parameter.

[source,console]
----
GET /my-index-000001/_search?typed_keys
{
  "aggs": {
    "my-agg-name": {
      "histogram": {
        "field": "my-field",
        "interval": 1000
      }
    }
  }
}
----
// TEST[setup:my_index]
// TEST[s/typed_keys/typed_keys&size=0/]
// TEST[s/my-field/http.response.bytes/]

The response returns the aggregation type as a prefix to the aggregation's name.

IMPORTANT: Some aggregations return a different aggregation type from the
type in the request. For example, the terms,
<<search-aggregations-bucket-significantterms-aggregation,significant terms>>,
and <<search-aggregations-metrics-percentile-aggregation,percentiles>>
aggregations return different aggregations types depending on the data type of
the aggregated field.

[source,console-result]
----
{
  ...
  "aggregations": {
    "histogram#my-agg-name": {                 <1>
      "buckets": []
    }
  }
}
----
// TESTRESPONSE[s/\.\.\./"took": "$body.took", "timed_out": false, "_shards": "$body._shards", "hits": "$body.hits",/]
// TESTRESPONSE[s/"buckets": \[\]/"buckets":\[\{"key":1070000.0,"doc_count":5\}\]/]

<1> The aggregation type, `histogram`, followed by a `#` separator and the aggregation's name, `my-agg-name`.

[discrete]
[[use-scripts-in-an-agg]]
=== Use scripts in an aggregation

When a field doesn't exactly match the aggregation you need, you
should aggregate on a <<runtime,runtime field>>:

[source,console]
----
GET /my-index-000001/_search?size=0
{
  "runtime_mappings": {
    "message.length": {
      "type": "long",
      "script": "emit(doc['message.keyword'].value.length())"
    }
  },
  "aggs": {
    "message_length": {
      "histogram": {
        "interval": 10,
        "field": "message.length"
      }
    }
  }
}
----
// TEST[setup:my_index]

////
[source,console-result]
----
{
  "timed_out": false,
  "took": "$body.took",
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0,
    "skipped": 0
  },
  "hits": "$body.hits",
  "aggregations": {
    "message_length": {
      "buckets": [
        {
          "key": 30.0,
          "doc_count": 5
        }
      ]
    }
  }
}
----
////

Scripts calculate field values dynamically, which adds a little
overhead to the aggregation. In addition to the time spent calculating,
some aggregations like <<search-aggregations-bucket-terms-aggregation,`terms`>>
and <<search-aggregations-bucket-filters-aggregation,`filters`>> can't use
some of their optimizations with runtime fields. In total, performance costs
for using a runtime field varies from aggregation to aggregation.

// TODO when we have calculated fields we can link to them here.

[discrete]
[[agg-caches]]
=== Aggregation caches

For faster responses, {es} caches the results of frequently run aggregations in
the <<shard-request-cache,shard request cache>>. To get cached results, use the
same <<shard-and-node-preference,`preference` string>> for each search. If you
don't need search hits, <<return-only-agg-results,set `size` to `0`>> to avoid
filling the cache.

{es} routes searches with the same preference string to the same shards. If the
shards' data doesn’t change between searches, the shards return cached
aggregation results.

[discrete]
[[limits-for-long-values]]
=== Limits for `long` values

When running aggregations, {es} uses <<number,`double`>> values to hold and
represent numeric data. As a result, aggregations on <<number,`long`>> numbers
greater than +2^53^+ are approximate.
--

include::aggregations/bucket.asciidoc[]

include::aggregations/metrics.asciidoc[]

include::aggregations/pipeline.asciidoc[]