162 commits

Author SHA1 Message Date
Lee Hinman
91bdfb84a0
Clarify data stream recommendations and best practices (#107233)
* Clarify data stream recommendations and best practices

Our documentation around data streams versus aliases could be interpreted in a way where someone doing *any* updates thinks they need to use an alias with indices instead of a data stream. This commit enhances the documentation around these areas to determine the correct abstraction in a more concrete way. It also tries to clarify that data streams still allow updates to the backing indices, and that a difference is last-write-wins versus first-write-wins.
2024-04-08 13:41:53 -06:00
Mary Gouseti
2122da31cd
[DSL] Introduce data stream global retention - Part 3 (#105682)
In this PR we introduce the API that will expose the global retention configuration and will allow users to take advantage of it.

These APIs are protected by newly introduced, dedicated privileges:

`manage_data_stream_global_retention` or higher, which allows all operations on the global retention configuration
`monitor_data_stream_retention` or higher, which allows the retrieval of the global retention configuration.

This PR is the final PR that makes the global retention available for our users.
2024-03-28 10:40:33 +02:00
Mary Gouseti
2988799079
[DSL Global Retention] Use data stream global retention metadata (#106221) 2024-03-20 20:27:08 +02:00
Joe Gallo
38168407ef
Docs typo fix (#105835) (#106002)
Co-authored-by: MikhailBerezhanov <35196259+MikhailBerezhanov@users.noreply.github.com>
2024-03-06 07:45:13 -05:00
Kostas Krikellas
c4c2ce83cb
Downsampling supports date_histogram with tz (#103511)
* Downsampling supports date_histogram with tz

This comes with caveats, for downsampled indexes at intervals more than
15 minutes. For instance,
 - 1-hour downsampling will produce inaccurate
results for 1-hour histograms on timezones shifted by XX:30
 - 1-day downsampling will produce inaccurate daily
histograms for non-UTC timezones as it tracks days at UTC.
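
The first caveat can be illustrated with a minimal sketch. It assumes that 1-hour downsampling groups documents by UTC hour, and it uses a hypothetical fixed-offset zone of +05:30; the function names are illustrative and not part of Elasticsearch.

```python
from datetime import datetime, timedelta, timezone


def utc_hour_bucket(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its UTC hour."""
    ts = ts.astimezone(timezone.utc)
    return ts.replace(minute=0, second=0, microsecond=0)


def local_hour_bucket(ts: datetime, tz: timezone) -> datetime:
    """Truncate a timestamp to the start of its hour in the given zone."""
    local = ts.astimezone(tz)
    return local.replace(minute=0, second=0, microsecond=0)


# Hypothetical zone shifted by XX:30 relative to UTC.
HALF_HOUR_TZ = timezone(timedelta(hours=5, minutes=30))

# Two documents that land in the same 1-hour UTC bucket.
a = datetime(2024, 1, 16, 10, 10, tzinfo=timezone.utc)
b = datetime(2024, 1, 16, 10, 50, tzinfo=timezone.utc)

same_utc_bucket = utc_hour_bucket(a) == utc_hour_bucket(b)
same_local_bucket = (
    local_hour_bucket(a, HALF_HOUR_TZ) == local_hour_bucket(b, HALF_HOUR_TZ)
)

print(same_utc_bucket)    # both documents share one downsampled bucket
print(same_local_bucket)  # but they belong to different local hours
```

Once the two documents are merged into a single UTC-hour bucket, a 1-hour histogram in the shifted zone can no longer split them across its own hour boundary, which is the source of the inaccuracy.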

Related to #101309

* Update docs/changelog/103511.yaml

* test daylight savings

* update documentation

* Offset time buckets over downsampled data with TZ

* Update docs/changelog/103511.yaml

* check for TSDS

* fixme for transport version

* add interval to index metadata

* add transport version

* bump up transport version

* address feedback

* spotless fix
2024-01-16 10:27:33 +02:00
Martijn van Groningen
4b8d99252d
Update documentation around index.look_ahead_time setting. (#103975)
Adjusted the default after #103898
2024-01-10 09:48:17 +01:00
Mary Gouseti
046cdeae23
Introduce lazy rollover for mapping updates in data streams (#103309)
In this PR we introduce a flag indicating that a data stream needs to be rolled over before the next document is indexed.
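
A toy model of the idea, as a sketch only: the class, the `rollover_on_write` attribute, and the method names are hypothetical and do not reflect the actual Elasticsearch implementation.

```python
class ToyDataStream:
    """In-memory stand-in for a data stream with numbered backing indices."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.generation = 1
        self.rollover_on_write = False  # hypothetical flag name
        self.docs = {1: []}

    def mark_for_lazy_rollover(self) -> None:
        # Nothing is rolled over yet; only the flag is set.
        self.rollover_on_write = True

    def index(self, doc: dict) -> int:
        # The rollover happens lazily, right before the next write.
        if self.rollover_on_write:
            self.generation += 1
            self.docs[self.generation] = []
            self.rollover_on_write = False
        self.docs[self.generation].append(doc)
        return self.generation


ds = ToyDataStream("logs-example")
first = ds.index({"msg": "before mapping update"})
ds.mark_for_lazy_rollover()
generation_after_mark = ds.generation
second = ds.index({"msg": "after mapping update"})
third = ds.index({"msg": "same backing index"})
```

The point of the sketch is that marking the stream is cheap and creates no new backing index until a document actually arrives.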
2024-01-08 15:07:16 +02:00
Martijn van Groningen
842303cd7f
Lower the look_ahead_time setting's maximum value. (#103434)
Initially, index.look_ahead_time was used to define both index.time_series.start_time and index.time_series.end_time.

The former is now controlled by index.look_back_time, and the maximum value of 7 days for index.look_ahead_time is too generous, as it also delays data being indexed into the new index after rollover by up to 7 days.

This PR changes the index.look_ahead_time setting's maximum allowed value from 7 days to 2 hours, which is equal to the index.look_ahead_time setting's default. A look ahead time of 2 hours is high enough to accept data that is ahead of the current time, but avoids configuring the index.look_ahead_time setting to a too high value that causes rolled over indices to not receive writes for a very long period.

This is a breaking change, but configuring the index.look_ahead_time setting to a higher value than 2 hours will not fail. Instead, 2 hours will be used as the look ahead time.
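
The described behaviour amounts to clamping rather than rejecting. A minimal sketch, assuming the 2-hour maximum stated above; the function and constant names are hypothetical and this is not the Elasticsearch code path.

```python
from datetime import timedelta

# Maximum allowed look ahead time after this change (from the message above).
MAX_LOOK_AHEAD = timedelta(hours=2)


def effective_look_ahead(configured: timedelta) -> timedelta:
    """Return the look ahead time actually used: values above the maximum
    are silently capped instead of causing a failure."""
    return min(configured, MAX_LOOK_AHEAD)


print(effective_look_ahead(timedelta(days=7)))      # capped to the maximum
print(effective_look_ahead(timedelta(minutes=30)))  # left unchanged
```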
2023-12-20 09:00:04 +01:00
Martijn van Groningen
c7021050f1
Slightly simplify setup tsds section (#103475)
* By not encouraging use of the index.look_ahead index setting. The default should work well out of the box, and changing this setting can cause tsds to not work correctly.
* Not mentioning the index.codec setting. This is a low-level setting that has no real benefit in the case of tsds, and setting it to best compression can hurt performance.
2023-12-18 11:39:10 +01:00
Andrei Dan
17811280c2
[DOCS] DSL downsampling docs (#103148) 2023-12-08 06:52:18 -05:00
Andrei Dan
2212df73e8
[DOCS] migrate ILM to DSL headings and TLDR (#102068)
This adds some headings and a TL;DR section to the migration to DSL
tutorial.
2023-11-23 06:37:16 -05:00
Mary Gouseti
5a3409b7c5
ES-6566: [DSL] Introduce new endpoint to expose data stream lifecycle stats (#101845) 2023-11-20 10:38:41 +02:00
James Baiera
6fa7f60073
Add ability to create a data stream failure store (#99134)
Adds the ability to configure a data stream to create a new kind of backing index called a failure store which will eventually be used to store error information when ingest pipelines fail to ingest a document or when a document fails to be parsed correctly by the configured mapping on the data stream.
2023-11-15 15:32:51 -05:00
Andrei Dan
6054a5eb18
[DOCS] Fix typo (#101791) 2023-11-13 06:07:03 -05:00
Andrei Dan
7b436bae2c
[DOCS] DSL: More visible tech preview tags (#101313) 2023-10-26 12:06:15 +01:00
Andrei Dan
74ea04fb2d
[DOCS] document tail merging and create tutorial for migrating to DSL (#101117)
This documents tail merging, the enabled flag, and
adds a tutorial to migrate a data stream from ILM to DSL.
2023-10-25 11:12:36 +01:00
Martijn van Groningen
311185311f
Remove index.codec setting from setting up tsdb docs. (#101276)
This is not needed for tsdb, because of synthetic source and slows down indexing / refreshes.
2023-10-25 08:21:18 +02:00
Andrei Dan
632c97b234
Document ILM waits for tsds end_time to lapse in some actions (#100204) 2023-10-04 07:55:58 -04:00
Andrei Dan
839afdc331
Promote the Data stream lifecycle feature to Technical Preview (#100187)
This releases the Data stream lifecycle feature as a
Technical Preview feature.

Data stream lifecycle, albeit in technical preview, will allow data streams
to take advantage of a native simplified and resilient lifecycle implementation.
2023-10-03 17:12:35 +01:00
Andrei Dan
1369ff2b78
Remove managing ds by default for now (#100149)
This removes the DSL functionality that would automatically
configure the lifecycle for all new data streams, in preparation
for marking Data stream lifecycle as ready for Technical Preview.
2023-10-02 20:28:31 +01:00
Andrei Dan
f202ad02fe
GET _data_stream displays both ILM and DSL information (#99947)
This adds support to the `GET _data_stream` API for displaying the value
of the `index.lifecycle.prefer_ilm` setting both at the backing index
level and at the top level (top level meaning, similarly to the existing
`ilm_policy` field, the value in the index template that's backing the
data stream), an `ilm_policy` field for each backing index displaying
the actual ILM policy configured for the index itself, a `managed_by`
field for each backing index indicating who manages this index (the
possible values are: `Index Lifecycle Management`, `Data stream
lifecycle`, and `Unmanaged`).

This also adds a top level field to indicate which system would manage
the next generation index for this data stream based on the current
configuration. This field is called `next_generation_managed_by` and
takes the same values as the index-level `managed_by` field.

An example output for a data stream that has 2 backing indices managed
by ILM and the write index by DSL:

```
{
	"data_streams": [{
		"name": "datastream-psnyudmbitp",
		"timestamp_field": {
			"name": "@timestamp"
		},
		"indices": [{
			"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000001",
			"index_uuid": "kyw0WEXvS8-ahchYS10NRQ",
			"prefer_ilm": true,
			"ilm_policy": "policy-uVBEI",
			"managed_by": "Index Lifecycle Management"
		}, {
			"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000002",
			"index_uuid": "pDLdc4DERwO54GRzDr4krw",
			"prefer_ilm": true,
			"ilm_policy": "policy-uVBEI",
			"managed_by": "Index Lifecycle Management"
		}, {
			"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000003",
			"index_uuid": "gYZirLKcS3mlc1c3oHRpYw",
			"prefer_ilm": false,
			"ilm_policy": "policy-uVBEI",
			"managed_by": "Data stream lifecycle"
		}],
		"generation": 3,
		"status": "YELLOW",
		"template": "indextemplate-obcvkbjqand",
		"lifecycle": {
			"enabled": true,
			"data_retention": "90d"
		},
		"ilm_policy": "policy-uVBEI",
		"next_generation_managed_by": "Data stream lifecycle",
		"prefer_ilm": false,
		"hidden": false,
		"system": false,
		"allow_custom_routing": false,
		"replicated": false
	}]
}
```
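
A response of this shape can be summarised per data stream with a few lines of client-side code. The sketch below works on an abbreviated copy of the example output rather than a live cluster; the helper name `managed_by_counts` is hypothetical.

```python
import json
from collections import Counter

# Abbreviated copy of the example output above (only the fields used here).
RESPONSE = json.loads("""
{
  "data_streams": [{
    "name": "datastream-psnyudmbitp",
    "indices": [
      {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000001",
       "managed_by": "Index Lifecycle Management"},
      {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000002",
       "managed_by": "Index Lifecycle Management"},
      {"index_name": ".ds-datastream-psnyudmbitp-2023.09.27-000003",
       "managed_by": "Data stream lifecycle"}
    ],
    "next_generation_managed_by": "Data stream lifecycle"
  }]
}
""")


def managed_by_counts(response: dict) -> dict:
    """Count backing indices per managing system for each data stream."""
    return {
        ds["name"]: Counter(index["managed_by"] for index in ds["indices"])
        for ds in response["data_streams"]
    }


counts = managed_by_counts(RESPONSE)
for name, per_system in counts.items():
    print(name, dict(per_system))
```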
2023-09-28 13:48:17 -04:00
James Rodewig
ed8ea1f206
[main] [DOCS] Time series indices support non-metric/dimension fields (#99709) (#99811)
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
Co-authored-by: Gilad Gal <gilad.gal@elastic.co>
2023-09-22 09:11:29 -04:00
Kostas Krikellas
b1da97af17
Document how to reindex a TSDS (#99476)
* Document how to reindex a TSDS

Time-series data streams require updating start and end times in the
destination index template, to avoid errors during copying of older
docs.

* Update docs/changelog/99476.yaml

* Spotless fix.

* Refresh indexes in unittest.

* Fix typo.

* Delete docs/changelog/99476.yaml

* Fix page link name.

* Update docs/reference/data-streams/tsds-reindex.asciidoc

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>

---------

Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
2023-09-13 18:28:03 +03:00
James Rodewig
255c9a7f95
[DOCS] Move x-pack docs to docs/reference dir (#99209)
**Problem:**
For historical reasons, source files for the Elasticsearch Guide's security, watcher, and Logstash API docs are housed in the `x-pack/docs` directory. This can confuse new contributors who expect Elasticsearch Guide docs to be located in `docs/reference`. 

**Solution:**
- Move the security, watcher, and Logstash API doc source files to the `docs/reference` directory
- Update doc snippet tests to use security

Rel: https://github.com/elastic/platform-docs-team/issues/208
2023-09-12 14:53:41 -04:00
Martijn van Groningen
3e3ee42589
Add index.look_back_time setting for tsdb data streams (#98518)
This change adds a `index.look_back_time` index setting that sets the `index.time_series.start_time` setting for the first backing index when a data stream is created.

This allows accepting older data for initial indexing without changing the `index.look_ahead_time` setting. That setting also controls the `index.time_series.end_time` setting and would affect rollovers as well.

The default for `index.look_back_time` is `2h`, which means documents with an `@timestamp` up to 2 hours before the creation of the data stream are allowed to be indexed. This matches the behaviour before this change, when `index.look_ahead_time` was used to set the `index.time_series.start_time` of the first backing index.
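
A minimal sketch of how the two settings bound the first backing index. It assumes the start time is the creation time minus `index.look_back_time`, the end time is the creation time plus `index.look_ahead_time`, and the end bound is exclusive; any rounding Elasticsearch applies is ignored, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def first_backing_index_range(
    created_at: datetime,
    look_back: timedelta = timedelta(hours=2),
    look_ahead: timedelta = timedelta(hours=2),
) -> tuple:
    """Return (start_time, end_time) for the first backing index."""
    return created_at - look_back, created_at + look_ahead


def accepts(timestamp: datetime, start: datetime, end: datetime) -> bool:
    """Assumed acceptance rule: start inclusive, end exclusive."""
    return start <= timestamp < end


created = datetime(2023, 9, 8, 9, 0, tzinfo=timezone.utc)
start, end = first_backing_index_range(created)

print(start.isoformat(), end.isoformat())
print(accepts(created - timedelta(hours=1), start, end))  # within look back
print(accepts(created - timedelta(hours=3), start, end))  # too old
```

Raising `look_back` in this model widens the range only towards the past, which mirrors the stated goal of accepting older data without touching the look ahead time.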

Closes #98463
2023-09-08 11:11:43 +02:00
Mary Gouseti
b9b818e28e
Allow explain data stream lifecycle to accept a data stream. (#98811)
Currently the `GET target/_lifecycle/explain` API only works for
indices. In this PR we extend this behaviour to allow the target to be a
data stream so we can get the overview lifecycle status for all the
backing indices of a data stream.
2023-08-24 06:29:09 -04:00
Andrei Dan
01ed7de99f
GA the data stream lifecycle (#98644)
This makes the data stream lifecycle generally available. This will allow
data streams to take advantage of a native simplified and resilient
lifecycle implementation.
2023-08-21 17:28:54 +01:00
Mary Gouseti
e71ea6e6d7
Add data stream lifecycle by default (#97823)
In this PR we enable all new data streams to be managed by the data
stream lifecycle by default. This is implemented by adding an empty
`lifecycle: {}` upon new data stream creation. 

Opting out is represented by the `enabled` flag:

```
{
  "lifecycle": {
    "enabled": false
  }
}
```

This change has the following implications for when an index is managed
and by which feature:

| Parent data stream lifecycle | ILM | `prefer_ilm` | Managed by |
|------------------------------|-----|--------------|------------|
| default | yes | true | ILM |
| default | yes | false | data stream lifecycle |
| default | no | true/false | data stream lifecycle |
| opt-out or missing | yes | true/false | ILM |
| opt-out or missing | no | true/false | unmanaged |
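
The table can be read as a small decision function. This is a sketch of the rules as listed, not the Elasticsearch implementation; the function and parameter names are hypothetical.

```python
def managed_by(lifecycle_enabled: bool, has_ilm_policy: bool, prefer_ilm: bool) -> str:
    """Return which feature manages a backing index.

    lifecycle_enabled: the parent data stream has a lifecycle ("default"),
                       as opposed to "opt-out or missing".
    has_ilm_policy:    an ILM policy is configured for the index.
    prefer_ilm:        value of the `prefer_ilm` setting.
    """
    if lifecycle_enabled:
        # ILM only wins when it is both configured and preferred.
        if has_ilm_policy and prefer_ilm:
            return "ILM"
        return "data stream lifecycle"
    # Without a data stream lifecycle, prefer_ilm does not matter.
    return "ILM" if has_ilm_policy else "unmanaged"


for row in [(True, True, True), (True, True, False), (True, False, True),
            (False, True, False), (False, False, True)]:
    print(row, "->", managed_by(*row))
```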

Data streams that have been created before the data stream lifecycle is
enabled will not have the default lifecycle.

Next steps: we need to document this when the feature becomes GA
(https://github.com/elastic/elasticsearch/issues/97973).
2023-08-11 06:28:37 -04:00
Keith Massey
841050043e
Hiding data stream lifecycle documentation in released docs (#98334) 2023-08-10 08:18:05 -05:00
Salvatore Campagna
d0b2f650df
Enable all remaining metric aggregations on counters (#97974)
Here we enable aggregations previously not allowed on fields of type counter.
The decision of enabling such aggregations even if the result is "meaningless"
for counters has been taken to favour TSDB adoption.

Aggregations now allowed, other than the existing ones, include:
* avg
* box plot
* cardinality
* extended stats
* median absolute deviation
* percentile ranks
* percentiles
* stats
* sum
* value count

I included tests for the weighted average and matrix stats aggregations too.
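
To see why such results can be "meaningless" for counters, consider a cumulative counter: each sample already includes everything counted before it. The sketch below uses made-up samples and assumes a common convention in which a drop in value signals a reset to zero; that convention is an assumption and is not stated in this commit.

```python
# Made-up samples of a cumulative counter; the drop to 20 is a reset.
SAMPLES = [100, 130, 170, 20, 60]


def counter_increase(samples: list) -> int:
    """Total increase of a cumulative counter, treating any drop as a
    reset to zero (assumed convention)."""
    total = 0
    previous = None
    for value in samples:
        if previous is not None:
            # After a reset the counter restarted from zero, so the whole
            # new value counts as increase.
            total += value - previous if value >= previous else value
        previous = value
    return total


naive_sum = sum(SAMPLES)              # adds up overlapping cumulative totals
increase = counter_increase(SAMPLES)  # what actually happened over the window

print(naive_sum, increase)
```

A `sum` aggregation over the raw samples corresponds to `naive_sum` here, which is why it is allowed but not necessarily informative for counter fields.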

Resolves #97882
2023-08-08 17:47:47 +02:00
Abdon Pijpelink
91d0e11ab9
[DOCS] Update manual downsampling documentation to use TSDS (#97976)
* [DOCS] Update manual downsampling documentation to use TSDS

* Swap manual and ILM downsampling examples in nav

* Typo

* Update prerequisites based on review feedback

* Warn against deleting the old backing index.

* Clarify counter/gauge results

* Mention that the downsampled type is 'aggregate_metric_double'
2023-08-04 09:39:14 +02:00
Abdon Pijpelink
5947f3b455
[DOCS] Clarify TSDS/synthetic source/runtime field restrictions (#97980) 2023-08-03 18:28:08 +02:00
Mary Gouseti
09d396a91f
Change test tear down to only remove resources created by the test (#98060) 2023-07-31 17:23:37 +03:00
Martijn van Groningen
bea09c004e
Explain tsdb counters better. (#97618) 2023-07-13 17:15:17 +02:00
Abdon Pijpelink
a9b3d7ada7
[DOCS] Clarify that disk impact of TSDS varies per data set (#97571) 2023-07-13 10:14:09 +02:00
Mary Gouseti
a432313ff3
Data stream lifecycle class names (#97381) 2023-07-05 12:28:32 +03:00
Mary Gouseti
1abd51b167
Start with data stream lifecycle documentation (#95326) 2023-06-28 16:18:05 +03:00
Martijn van Groningen
d5f9e113a5
Update tsds-index-settings.asciidoc (#96366)
* index.routing_path only gets generated for backing indices of tsdb data streams.
* Updated the dimension_fields.limit setting default.

Closes #96330
2023-05-26 19:23:53 +02:00
David Kilfoyle
8e7d4b0750
[DOCS] Note limits for queries on downsampled indices (#95749) 2023-05-03 09:58:23 -04:00
Martijn van Groningen
49e8ee4269
Remove remaining tsdb tech preview labels (#95563)
Remove tech preview label from a number of tsdb settings and mapping attributes.
2023-04-26 12:11:03 +02:00
Pablo Alcantar Morales
132290f8a3
fix flaky docs tests get-lifecycle (#95529)
The `data-streams/downsampling.asciidoc` test was missing a teardown step to clean up the ILM policies it created. Because this test *does not have* the string `ilm` in its name, the automatic teardown process that cleans up these resources (check `ESRestTestCase.java#L815` & `DocsClientYamlTestSuiteIT.java` lines 177 & 195) is not executed for it. If this test runs right before the `get-lifecycle` test, the policy is not automatically deleted, so the test checking the version fails. The order of test execution is not guaranteed by the suite, which makes the failure intermittent.
2023-04-26 12:10:49 +02:00
Salvatore Campagna
ec2bdee31b
Add time_series_dimensions param to flattened docs (#95374) 2023-04-20 10:58:12 +02:00
Andrei Dan
7b994ba8d0
Document that DS backing indices can have gaps in the name counter (#95237) 2023-04-17 17:11:05 +01:00
Martijn van Groningen
b41f096756
Document counter field limitation. (#95155)
As is listed here:
https://github.com/elastic/elasticsearch/issues/93539#issuecomment-1420473031

Relates to #93539
2023-04-11 12:14:20 -04:00
Abdon Pijpelink
20dd5d3191
[DOCS] Rerun agg at the end of manual downsampling example (#95122)
* [DOCS] Rerun agg at the end of manual downsampling example

* Replace 'index' with 'data stream'

* One more 'index'
2023-04-11 17:33:52 +02:00
David Kilfoyle
9a673ad7f1
95017 fix downsampling step (#95054)
* Remove extra step in manual downsampling docs

* create -> view
2023-04-05 10:09:03 -04:00
Abdon Pijpelink
ccc2d94baf
[DOCS] Explain how to change aliases in data streams documentation (#94110) 2023-03-21 15:34:00 +01:00
Abdon Pijpelink
6e21e4a600
Update TSDS disk space reduction percentage (#94549) 2023-03-20 15:19:58 +01:00
richcollier
1da70cbe08
Update tsds.asciidoc (#94208)
fixed typo
2023-03-01 11:05:15 +01:00
Martijn van Groningen
92f229d643
Update tsdb docs to include warning and additional limitations (#93191)
Update tsdb docs to include a warning that the format of the `_tsid` field shouldn't be relied upon and added additional limitations about dimension fields.
2023-01-24 12:05:50 +01:00