Adds the ability to configure a data stream to create a new kind of backing index called a failure store. The failure store will eventually hold error information when an ingest pipeline fails to ingest a document or when a document cannot be parsed correctly by the mapping configured on the data stream.
This releases the Data stream lifecycle feature as a
Technical Preview feature.
Data stream lifecycle, albeit in technical preview, will allow data streams
to take advantage of a native simplified and resilient lifecycle implementation.
This removes the DSL functionality that automatically
configured a lifecycle for all new data streams, in preparation
for marking Data stream lifecycle as ready for Technical Preview.
This adds support to the `GET _data_stream` API for displaying:
- the value of the `index.lifecycle.prefer_ilm` setting, both at the backing index level and at the top level (top level meaning, similarly to the existing `ilm_policy` field, the value in the index template that's backing the data stream)
- an `ilm_policy` field for each backing index, displaying the actual ILM policy configured for the index itself
- a `managed_by` field for each backing index, indicating who manages this index (the possible values are `Index Lifecycle Management`, `Data stream lifecycle`, and `Unmanaged`)
This also adds a top level field that indicates which system would manage
the next generation index for this data stream based on the current
configuration. This field is called `next_generation_managed_by` and it
takes the same values as the index-level `managed_by` field.
An example output for a data stream that has 2 backing indices managed
by ILM and the write index by DSL:
```
{
  "data_streams": [{
    "name": "datastream-psnyudmbitp",
    "timestamp_field": {
      "name": "@timestamp"
    },
    "indices": [{
      "index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000001",
      "index_uuid": "kyw0WEXvS8-ahchYS10NRQ",
      "prefer_ilm": true,
      "ilm_policy": "policy-uVBEI",
      "managed_by": "Index Lifecycle Management"
    }, {
      "index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000002",
      "index_uuid": "pDLdc4DERwO54GRzDr4krw",
      "prefer_ilm": true,
      "ilm_policy": "policy-uVBEI",
      "managed_by": "Index Lifecycle Management"
    }, {
      "index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000003",
      "index_uuid": "gYZirLKcS3mlc1c3oHRpYw",
      "prefer_ilm": false,
      "ilm_policy": "policy-uVBEI",
      "managed_by": "Data stream lifecycle"
    }],
    "generation": 3,
    "status": "YELLOW",
    "template": "indextemplate-obcvkbjqand",
    "lifecycle": {
      "enabled": true,
      "data_retention": "90d"
    },
    "ilm_policy": "policy-uVBEI",
    "next_generation_managed_by": "Data stream lifecycle",
    "prefer_ilm": false,
    "hidden": false,
    "system": false,
    "allow_custom_routing": false,
    "replicated": false
  }]
}
```
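As an illustration of how a client might consume these fields, here is a minimal Python sketch that tallies the `managed_by` values per data stream and reads `next_generation_managed_by`. The function name `summarize_management` and the `"unknown"` fallback are assumptions of this sketch, not part of the API. The sample is a trimmed copy of the response above.

```python
from collections import Counter


def summarize_management(response: dict) -> dict:
    """Summarize, per data stream, which system manages its backing indices."""
    summary = {}
    for stream in response.get("data_streams", []):
        # Count backing indices per `managed_by` value. "unknown" is a
        # placeholder of this sketch for responses that lack the field.
        counts = Counter(
            index.get("managed_by", "unknown")
            for index in stream.get("indices", [])
        )
        summary[stream["name"]] = {
            "backing_indices": dict(counts),
            "next_generation": stream.get("next_generation_managed_by"),
        }
    return summary


# Trimmed copy of the example response above.
sample = {
    "data_streams": [{
        "name": "datastream-psnyudmbitp",
        "indices": [
            {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000001",
             "prefer_ilm": True, "managed_by": "Index Lifecycle Management"},
            {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000002",
             "prefer_ilm": True, "managed_by": "Index Lifecycle Management"},
            {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000003",
             "prefer_ilm": False, "managed_by": "Data stream lifecycle"},
        ],
        "next_generation_managed_by": "Data stream lifecycle",
    }]
}

print(summarize_management(sample))
```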
**Problem:**
For historical reasons, source files for the Elasticsearch Guide's security, watcher, and Logstash API docs are housed in the `x-pack/docs` directory. This can confuse new contributors who expect Elasticsearch Guide docs to be located in `docs/reference`.
**Solution:**
- Move the security, watcher, and Logstash API doc source files to the `docs/reference` directory
- Update doc snippet tests to use security
Rel: https://github.com/elastic/platform-docs-team/issues/208
This change adds an `index.look_back_time` index setting that sets the `index.time_series.start_time` setting for the first backing index when a data stream is created.
This allows a data stream to accept older data during initial indexing without changing the `index.look_ahead_time` setting. That setting also controls the `index.time_series.end_time` setting, so changing it would affect rollovers as well.
The default for `index.look_back_time` is `2h`, which means documents with an `@timestamp` up to 2 hours before the creation of the data stream are allowed to be indexed. This matches the behaviour before this change, because `index.look_ahead_time` was previously used to set the `index.time_series.start_time` of the first backing index.
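The effect on the accepted time range can be sketched in Python. This is only an illustration: it assumes the start time is the creation time minus `index.look_back_time`, that the end time is the creation time plus `index.look_ahead_time`, and that the end bound is exclusive. The helper names and the `look_ahead` value are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def first_index_bounds(created_at, look_back_time, look_ahead_time):
    """Return (start_time, end_time) for the first backing index (sketch)."""
    start_time = created_at - look_back_time   # index.time_series.start_time
    end_time = created_at + look_ahead_time    # index.time_series.end_time
    return start_time, end_time


def accepts(timestamp, bounds):
    """True if the @timestamp falls inside the bounds (end assumed exclusive)."""
    start_time, end_time = bounds
    return start_time <= timestamp < end_time


created_at = datetime(2023, 9, 27, 12, 0, tzinfo=timezone.utc)
look_ahead = timedelta(hours=2)  # hypothetical value, left unchanged below

default_bounds = first_index_bounds(created_at, timedelta(hours=2), look_ahead)
wide_bounds = first_index_bounds(created_at, timedelta(hours=24), look_ahead)

old_doc = created_at - timedelta(hours=18)
print(accepts(old_doc, default_bounds))  # False: older than the 2h look-back
print(accepts(old_doc, wide_bounds))     # True: inside the 24h look-back
```

Only the start bound moves when the look-back is widened; the end bound is identical in both cases.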
Closes #98463
Currently the `GET target/_lifecycle/explain` API only works for
indices. In this PR we extend this behaviour to allow the target to be a
data stream so we can get the overview lifecycle status for all the
backing indices of a data stream.
This makes the data stream lifecycle generally available. This will allow
data streams to take advantage of a native simplified and resilient
lifecycle implementation.
In this PR we enable all new data streams to be managed by the data
stream lifecycle by default. This is implemented by adding an empty
`lifecycle: {}` upon new data stream creation.
Opting out is represented by the `enabled` flag:
```
{
  "lifecycle": {
    "enabled": false
  }
}
```
This change has the following implications for when an index is managed
and by which feature:
| Parent data stream lifecycle | ILM | `prefer_ilm` | Managed by |
|------------------------------|-----|--------------|------------|
| default | yes | true | ILM |
| default | yes | false | data stream lifecycle |
| default | no | true/false | data stream lifecycle |
| opt-out or missing | yes | true/false | ILM |
| opt-out or missing | no | true/false | unmanaged |
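The rows of this table can be expressed as a small decision function. The following Python sketch is an illustration only and not the actual implementation. The function and parameter names are hypothetical, and `lifecycle_enabled` is `True` for the "default" rows and `False` for the "opt-out or missing" rows.

```python
def managed_by(lifecycle_enabled: bool, has_ilm_policy: bool, prefer_ilm: bool) -> str:
    """Return which feature manages an index, following the table rows."""
    if lifecycle_enabled:
        # ILM only wins over an enabled lifecycle when a policy is
        # configured and `prefer_ilm` is true.
        if has_ilm_policy and prefer_ilm:
            return "Index Lifecycle Management"
        return "Data stream lifecycle"
    # Lifecycle opted out or missing: `prefer_ilm` no longer matters.
    return "Index Lifecycle Management" if has_ilm_policy else "Unmanaged"


for row in [(True, True, True), (True, True, False), (True, False, True),
            (False, True, False), (False, False, True)]:
    print(row, "->", managed_by(*row))
```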
Data streams that have been created before the data stream lifecycle is
enabled will not have the default lifecycle.
Next steps:
- We need to document this when the feature is GA
(https://github.com/elastic/elasticsearch/issues/97973).
Here we enable aggregations that were previously not allowed on fields of type counter.
The decision to enable such aggregations, even if the result is "meaningless"
for counters, was made to favour TSDB adoption.
In addition to the existing ones, the following aggregations are now allowed:
* avg
* box plot
* cardinality
* extended stats
* median absolute deviation
* percentile ranks
* percentiles
* stats
* sum
* value count
I included tests for the weighted average and matrix stats aggregations too.
Resolves #97882
* [DOCS] Update manual downsampling documentation to use TSDS
* Swap manual and ILM downsampling examples in nav
* Typo
* Update prerequisites based on review feedback
* Warn against deleting the old backing index.
* Clarify counter/gauge results
* Mention that the downsampled type is 'aggregate_metric_double'
The `data-streams/downsampling.asciidoc` test was missing a teardown that cleans up the ILM policies it creates. Because this test does not have the string `ilm` in its name, the automatic teardown process that cleans up these resources (see `ESRestTestCase.java#L815` and `DocsClientYamlTestSuiteIT.java` lines 177 and 195) is not executed for it. If this test runs right before the `get-lifecycle` test, the policy is not deleted automatically, so the test that checks the policy version fails. Since the suite does not guarantee the order of test execution, this failure only happens on some runs.
Update the tsdb docs to include a warning that the format of the `_tsid` field shouldn't be relied upon, and add further limitations about dimension fields.
It seems that for now we don't have a good use for the histogram and summary metric types.
They had been left as placeholders for a while, but at this point there is no concrete plan forward for them.
This PR removes the histogram and summary metric types. We may add them back in the future.
Also, this PR completely removes the `time_series_metric` mapping parameter from the `histogram` field type and only allows the `gauge` metric type for `aggregate_metric_double` fields.
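The resulting rules can be summarized in a short Python sketch. It is illustrative only and not the actual mapper validation. The helper names are hypothetical, and it is an assumption of this sketch that the listed numeric field types accept both `gauge` and `counter`.

```python
# Assumption of this sketch: these numeric field types accept both
# remaining metric types. The list is illustrative, not exhaustive.
NUMERIC_TYPES = {"long", "integer", "double", "float"}


def allowed_metric_types(field_type: str) -> set:
    """Return the `time_series_metric` values accepted for a field type."""
    if field_type == "histogram":
        return set()      # parameter removed from this field type
    if field_type == "aggregate_metric_double":
        return {"gauge"}  # only gauge is allowed
    if field_type in NUMERIC_TYPES:
        return {"gauge", "counter"}
    return set()


def is_valid(field_type: str, metric_type: str) -> bool:
    """True if the metric type is accepted for the field type."""
    return metric_type in allowed_metric_types(field_type)


print(is_valid("aggregate_metric_double", "gauge"))    # True
print(is_valid("aggregate_metric_double", "counter"))  # False
print(is_valid("long", "summary"))                     # False: type removed
```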