elasticsearch/docs/reference/aggregations/bucket/time-series-aggregation.asciidoc
Jihyun(Brian) Jeong e1207398c7
(Doc+) Clarify dimension field requirements for time_series aggregation (#119442)
* (Doc+) Clarify dimension field requirements for time_series aggregation

👋 howdy, team!

This PR adds a note explaining that time series indices require:
- index.mode set to "time_series"
- at least one dimension field with time_series_dimension: true
- a routing_path array listing those dimension fields

Without these settings, the time_series aggregation may return empty buckets or behave unexpectedly. By emphasizing the dimension field requirement, we help users configure their time series indices correctly and see meaningful aggregation results.

* Apply suggestions from code review

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>

---------

Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
2025-01-29 13:03:11 +01:00

115 lines
3.8 KiB
Text

[[search-aggregations-bucket-time-series-aggregation]]
=== Time series aggregation
++++
<titleabbrev>Time series</titleabbrev>
++++
preview::[]
The time series aggregation queries data created using a <<tsds,Time series data stream (TSDS)>>. This is typically data such as metrics
or other data streams with a time component, and requires creating an index using the time series mode.
[NOTE]
====
Refer to the <<differences-from-regular-data-stream, TSDS documentation>> to learn more about the key differences from regular data streams.
====
//////////////////////////
Creating a time series mapping
To create an index with the time series mapping, specify "mode" as "time_series" in the index settings,
"routing_path" specifying the a list of time series fields, and a start and end time for the series. Each of the
"routing_path" fields must be keyword fields with "time_series_dimension" set to true. Additionally, add a
date field used as the timestamp.
[source,js]
--------------------------------------------------
PUT /my-time-series-index
{
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 2,
"mode": "time_series",
"routing_path": ["key"],
"time_series": {
"start_time": "2022-01-01T00:00:00Z",
"end_time": "2023-01-01T00:00:00Z"
}
}
},
"mappings": {
"properties": {
"key": {
"type": "keyword",
"time_series_dimension": true
},
"@timestamp": {
"type": "date"
}
}
}
}
-------------------------------------------------
// NOTCONSOLE
//////////////////////////
Data can be added to the time series index like other indices:
[source,js]
--------------------------------------------------
PUT /my-time-series-index-0/_bulk
{ "index": {} }
{ "key": "a", "val": 1, "@timestamp": "2022-01-01T00:00:10Z" }
{ "index": {}}
{ "key": "a", "val": 2, "@timestamp": "2022-01-02T00:00:00Z" }
{ "index": {} }
{ "key": "b", "val": 2, "@timestamp": "2022-01-01T00:00:10Z" }
{ "index": {}}
{ "key": "b", "val": 3, "@timestamp": "2022-01-02T00:00:00Z" }
--------------------------------------------------
// NOTCONSOLE
To perform a time series aggregation, specify "time_series" as the aggregation type. When the boolean "keyed"
is true, each bucket is given a unique key.
[source,js,id=time-series-aggregation-example]
--------------------------------------------------
GET /_search
{
"aggs": {
"ts": {
"time_series": { "keyed": false }
}
}
}
--------------------------------------------------
// NOTCONSOLE
This will return all results in the time series, however a more typical query will use sub aggregations to reduce the
date returned to something more relevant.
[[search-aggregations-bucket-time-series-aggregation-size]]
==== Size
By default, `time series` aggregations return 10000 results. The "size" parameter can be used to limit the results
further. Alternatively, using sub aggregations can limit the amount of values returned as a time series aggregation.
[[search-aggregations-bucket-time-series-aggregation-keyed]]
==== Keyed
The `keyed` parameter determines if buckets are returned as a map with unique keys per bucket. By default with `keyed`
set to false, buckets are returned as an array.
[[times-series-aggregations-limitations]]
==== Limitations
The `time_series` aggregation has many limitations. Many aggregation performance optimizations are disabled when using
the `time_series` aggregation. For example the filter by filter optimization or collect mode breath first (`terms` and
`multi_terms` aggregation forcefully use the depth first collect mode).
The following aggregations also fail to work if used in combination with the `time_series` aggregation:
`auto_date_histogram`, `variable_width_histogram`, `rare_terms`, `global`, `composite`, `sampler`, `random_sampler` and
`diversified_sampler`.