mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-04-25 23:57:20 -04:00
64 lines
No EOL
3.6 KiB
Text
64 lines
No EOL
3.6 KiB
Text
[role="xpack"]
|
|
[[data-stream-lifecycle]]
|
|
== Data stream lifecycle
|
|
|
|
preview::[]
|
|
|
|
A data stream lifecycle is the built-in mechanism data streams use to manage their lifecycle. It enables you to easily
|
|
automate the management of your data streams according to your retention requirements. For example, you could configure
|
|
the lifecycle to:
|
|
|
|
* Ensure that data indexed in the data stream will be kept at least for the retention time you defined.
|
|
* Ensure that data older than the retention period will be deleted automatically by {es} at a later time.
|
|
|
|
To achieve that, it supports:
|
|
|
|
* Automatic <<index-rollover,rollover>>, which chunks your incoming data in smaller pieces to facilitate better performance
|
|
and backwards incompatible mapping changes.
|
|
* Configurable retention, which allows you to configure the time period for which your data is guaranteed to be stored.
|
|
{es} is allowed at a later time to delete data older than this time period.
|
|
|
|
[discrete]
|
|
[[data-streams-lifecycle-how-it-works]]
|
|
=== How does it work?
|
|
|
|
In intervals configured by <<data-streams-lifecycle-poll-interval,`data_streams.lifecycle.poll_interval`>>, {es} goes over
|
|
each data stream and performs the following steps:
|
|
|
|
1. Checks if the data stream has a data stream lifecycle configured, skipping any indices not part of a managed data stream.
|
|
2. Rolls over the write index of the data stream, if it fulfills the conditions defined by
|
|
<<cluster-lifecycle-default-rollover,`cluster.lifecycle.default.rollover`>>.
|
|
3. Applies retention to the remaining backing indices. This means deleting the backing indices whose
|
|
`generation_time` is longer than the configured retention period. The `generation_time` is only applicable to rolled over backing
|
|
indices and it is either the time since the backing index got rolled over, or the time optionally configured in the
|
|
<<index-data-stream-lifecycle-origination-date,`index.lifecycle.origination_date`>> setting.
|
|
|
|
IMPORTANT: We use the `generation_time` instead of the creation time because this ensures that all data in the backing
|
|
index have passed the retention period. As a result, the retention period is not the exact time data gets deleted, but
|
|
the minimum time data will be stored.
|
|
|
|
NOTE: The steps `2` and `3` apply only to backing indices that are not already managed by {ilm-init}, meaning that these indices either do
|
|
not have an {ilm-init} policy defined, or if they do, they have <<index-lifecycle-prefer-ilm,`index.lifecycle.prefer_ilm`>>
|
|
set to `false`.
|
|
|
|
[discrete]
|
|
[[data-stream-lifecycle-configuration]]
|
|
=== Configuring data stream lifecycle
|
|
|
|
Since the lifecycle is configured on the data stream level, the process to configure a lifecycle on a new data stream and
|
|
on an existing one differ.
|
|
|
|
In the following sections, we will go through the following tutorials:
|
|
|
|
* To create a new data stream with a lifecycle, you need to add the data stream lifecycle as part of the index template
|
|
that matches the name of your data stream (see <<tutorial-manage-new-data-stream>>). When a write operation
|
|
with the name of your data stream reaches {es} then the data stream will be created with the respective data stream lifecycle.
|
|
* To update the lifecycle of an existing data stream you need to use the <<data-stream-lifecycle-api, data stream lifecycle APIs>>
|
|
to edit the lifecycle on the data stream itself (see <<tutorial-manage-existing-data-stream>>).
|
|
|
|
NOTE: Updating the data stream lifecycle of an existing data stream is different from updating the settings or the mapping,
|
|
because it is applied on the data stream level and not on the individual backing indices.
|
|
|
|
include::tutorial-manage-new-data-stream.asciidoc[]
|
|
|
|
include::tutorial-manage-existing-data-stream.asciidoc[] |