elasticsearch/docs/reference/esql/processing-commands/stats.asciidoc
Nik Everett c73b3b6403
ESQL DOCS: speed note on stats (#101318)
Grouping by no fields is much much faster than grouping by one field.
Grouping by one field is like 5x faster than grouping on two fields.
2023-10-25 10:31:12 -04:00

53 lines
1.6 KiB
Text

[discrete]
[[esql-stats-by]]
=== `STATS ... BY`
Use `STATS ... BY` to group rows according to a common value and calculate one
or more aggregated values over the grouped rows.
[source.merge.styled,esql]
----
include::{esql-specs}/docs.csv-spec[tag=stats]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/docs.csv-spec[tag=stats-result]
|===
If `BY` is omitted, the output table contains exactly one row with the
aggregations applied over the entire dataset:
[source.merge.styled,esql]
----
include::{esql-specs}/docs.csv-spec[tag=statsWithoutBy]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/docs.csv-spec[tag=statsWithoutBy-result]
|===
It's possible to calculate multiple values:
[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=statsCalcMultipleValues]
----
It's also possible to group by multiple values (only supported for long and
keyword family fields):
[source,esql]
----
include::{esql-specs}/docs.csv-spec[tag=statsGroupByMultipleValues]
----
The following aggregation functions are supported:
include::../functions/aggregation-functions.asciidoc[tag=agg_list]
NOTE: `STATS` without any groups is much much faster than adding group.
NOTE: Grouping on a single field is currently much more optimized than grouping
on many fields. In some tests we've seen grouping on a single `keyword`
field to be five times faster than grouping on two `keyword` fields. Don't
try to work around this combining the two fields together with something
like <<esql-concat>> and then grouping - that's not going to be faster.