Backporting #126637 to 8.18 branch.
If updating the `index.time_series.end_time` fails for one data stream,
then UpdateTimeSeriesRangeService should continue updating this setting for other data streams.
The following error was observed in the wild:
```
[2025-04-07T08:50:39,698][WARN ][o.e.d.UpdateTimeSeriesRangeService] [node-01] failed to update tsdb data stream end times
java.lang.IllegalArgumentException: [index.time_series.end_time] requires [index.mode=time_series]
at org.elasticsearch.index.IndexSettings$1.validate(IndexSettings.java:636) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.index.IndexSettings$1.validate(IndexSettings.java:619) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.common.settings.Setting.get(Setting.java:563) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.common.settings.Setting.get(Setting.java:535) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.datastreams.UpdateTimeSeriesRangeService.updateTimeSeriesTemporalRange(UpdateTimeSeriesRangeService.java:111) ~[?:?]
at org.elasticsearch.datastreams.UpdateTimeSeriesRangeService$UpdateTimeSeriesExecutor.execute(UpdateTimeSeriesRangeService.java:210) ~[?:?]
at org.elasticsearch.cluster.service.MasterService.innerExecuteTasks(MasterService.java:1075) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:1038) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.cluster.service.MasterService.executeAndPublishBatch(MasterService.java:245) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.lambda$run$2(MasterService.java:1691) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.action.ActionListener.run(ActionListener.java:452) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.run(MasterService.java:1688) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.cluster.service.MasterService$5.lambda$doRun$0(MasterService.java:1283) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.action.ActionListener.run(ActionListener.java:452) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.cluster.service.MasterService$5.doRun(MasterService.java:1262) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023) ~[elasticsearch-8.17.3.jar:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27) ~[elasticsearch-8.17.3.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
at java.lang.Thread.run(Thread.java:1575) ~[?:?]
```
Which resulted in a situation, that causes the `index.time_series.end_time` index setting not being updated for any data stream. This then caused data loss as metrics couldn't be indexed, because no suitable backing index could be resolved:
```
the document timestamp [2025-03-26T15:26:10.000Z] is outside of ranges of currently writable indices [[2025-01-31T07:22:43.000Z,2025-02-15T07:24:06.000Z][2025-02-15T07:24:06.000Z,2025-03-02T07:34:07.000Z][2025-03-02T07:34:07.000Z,2025-03-10T12:45:37.000Z][2025-03-10T12:45:37.000Z,2025-03-10T14:30:37.000Z][2025-03-10T14:30:37.000Z,2025-03-25T12:50:40.000Z][2025-03-25T12:50:40.000Z,2025-03-25T14:35:40.000Z
```
* Bug Fix: System Data Streams Should Be Restorable (#124651)
This PR adds a new MetadataDeleteDataStreamService that allows us to delete system data streams prior to a restore operation. This fixes a bug where system data streams were previously un-restorable.
(cherry picked from commit cb3c35783b)
# Conflicts:
# modules/data-streams/src/main/java/org/elasticsearch/datastreams/action/DeleteDataStreamTransportAction.java
# server/src/main/java/org/elasticsearch/snapshots/RestoreService.java
# server/src/test/java/org/elasticsearch/cluster/metadata/MetadataDataStreamsServiceTests.java
* Fix test bug, mute test
This commit adds support for system data streams reindexing. The system data stream migration extends the existing system indices migration task and uses the data stream reindex API.
The system index migration task starts a reindex data stream task and tracks its status every second. Only one system index or system data stream is migrated at a time. If a data stream migration fails, the entire system index migration task will also fail.
* Fix Gradle Deprecation warning as declaring an is- property with a Boolean type has been deprecated.
* Make use of new layout.settingsFolder api to address some cross project references
* Fix buildParams snapshot check for multiprojet projects
(cherry picked from commit e19b2264af)
# Conflicts:
# build-tools-internal/gradle/wrapper/gradle-wrapper.properties
# build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/BaseInternalPluginBuildPlugin.java
# build-tools-internal/src/main/resources/minimumGradleVersion
# docs/build.gradle
# gradle/wrapper/gradle-wrapper.properties
# plugins/examples/gradle/wrapper/gradle-wrapper.properties
# qa/lucene-index-compatibility/build.gradle
# x-pack/qa/multi-project/core-rest-tests-with-multiple-projects/build.gradle
# x-pack/qa/multi-project/xpack-rest-tests-with-multiple-projects/build.gradle
* Rename environment dir accessors (#121803)
The node environment has many paths. The accessors for these currently use a "file" suffix, but they are always directories. This commit renames the accessors to make it clear these paths are directories.
* [CI] Auto commit changes from spotless
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
The message_id field may be unmapped if documents were indexed into some
indices but not all. This change specifies the unmapped type for
message_id, allowing it to be sorted in such cases.
Closes#120796
* Enable queryable built-in roles feature by default (#120323)
Making the `es.queryable_built_in_roles_enabled` feature flag enabled by default.
This feature makes the built-in roles automatically indexed in `.security` index and available
for querying via Query Role API. The consequence of this is that `.security` index is now
created eagerly (if it's not existing) on cluster formation.
In order to keep the scope of this PR small, the feature is disabled for some of the tests,
because they are either non-trivial to adjust or the gain is not worthy the effort to do it now.
The tests will be adjusted in a follow-up PR and later the flag will be removed completely.
Relates to #117581
(cherry picked from commit 52e0f21bdd)
# Conflicts:
# modules/dot-prefix-validation/build.gradle
# test/framework/src/main/java/org/elasticsearch/test/InternalTestCluster.java
# x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/authc/esnative/ReservedRealmElasticAutoconfigIntegTests.java
* Update InternalTestCluster.java
remove line snuck after resolving merge confilcs
* Update build.gradle
fix build.gradle
* Update build.gradle
fix build.gradle by removing invalid task
* remove non-existing timeout parameter on 8.x branch
In this PR we include the failure store indices while
disallowing the user to use selectors in the `indices` field of the
create and restore snapshot requests.
**Security concerns**
The create and restore snapshot require cluster privileges only, so we
do not have to worry about how to handle index level security (see
[Create Snapshot
API](https://www.elastic.co/guide/en/elasticsearch/reference/current/create-snapshot-api.html#create-snapshot-api-prereqs)
& [Restore Snapshot
API](https://www.elastic.co/guide/en/elasticsearch/reference/current/restore-snapshot-api.html#restore-snapshot-api-prereqs)).
**Challenges**
While there are other APIs that do not support selectors but they do
include the failure indices in their response such as `GET
_data_stream/<target>`, `GET _data_stream/<target>/stats`, etc; the
snapshot APIs are a little bit different. The reason for that is that
the data stream APIs first collect only the relevant data streams and
then they select the backing indices and/or the failure indices. On the
other hand, the snapshot API that works with both indices and data
streams cannot take such a shortcut and needs to use the
`concreteIndices` method.
We propose, to add a flag that when the selectors are not "allowed" can
determine if we need to include the failure store or not. In the past we
had something similar called the default selector, but it was more
flexible and it was used a fallback always.
Our goal now is to only specify the behaviour we want when the selectors
are not supported. This new flag allowed also to simplify the concrete
index resolution in `GET _data_stream/<target>/stats`
Relates to #119545
* Add selector syntax to index expressions (#118614)
This PR introduces a new syntactical feature to index expression resolution: The selector.
Selectors, denoted with a :: followed by a recognized suffix will allow users to specify which component of
an index abstraction they would like to operate on within an API call. In this case, an index abstraction is a
concrete index, data stream, or alias; Any abstraction that can be resolved to a set of indices/shards. We
define a component of an index abstraction to be some searchable unit of the index abstraction.
(cherry picked from commit c3839e1f76)
# Conflicts:
# modules/data-streams/src/internalClusterTest/java/org/elasticsearch/datastreams/IngestFailureStoreMetricsIT.java
# server/src/main/java/org/elasticsearch/TransportVersions.java
# server/src/test/java/org/elasticsearch/action/OriginalIndicesTests.java
* [CI] Auto commit changes from spotless
* Fixing compiler issues
* Remove feature flag from influencing the serialisation
* Only add failure indices when failure store flag is on
* Fix OriginalIndicesTests
* [CI] Auto commit changes from spotless
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Felix Barnsteiner <felixbarny@users.noreply.github.com>
Co-authored-by: Mary Gouseti <mary.gouseti@elastic.co>
Co-authored-by: Mary Gouseti <mgouseti@gmail.com>
* Metrics for indexing failures due to version conflicts (#119067)
This exposes new OTel node and index based metrics for indexing failures due to version conflicts.
In addition, the /_cat/shards, /_cat/indices and /_cat/nodes APIs also expose the same metric, under the newly added column iifvc.
Relates: #107601
(cherry picked from commit 12eb1cfda1)
# Conflicts:
# server/src/main/java/org/elasticsearch/TransportVersions.java
* types
* Fix NodeIndexingMetricsIT
* [CI] Auto commit changes from spotless
* Fix RestShardsActionTests
* Fix test/cat.shards/10_basic.yml for bwc
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
This updates the gradle wrapper to 8.12
We addressed deprecation warnings due to the update that includes:
- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions
(cherry picked from commit ba61f8c7f7)
This setting enables or disables the failure store for data streams
based on matching the data stream name against a list of patterns. It
acts as a default, and is overridden if the failure store is
explicitly enabled or disabled either in a component template or using
the data stream options API.
(See the PR for explanations of some of the changes here.)
(cherry picked from commit 97e6bb6ce3)
# Conflicts:
# server/src/main/java/org/elasticsearch/action/bulk/BulkOperation.java
# server/src/main/java/org/elasticsearch/node/NodeConstruction.java
In this PR we reconciliate the failure indices of a data stream just like we do for the backing indices. The only difference is that a data stream can have an empty list of failure indices, while it cannot have an empty list of backing indices.
An easy way to create a situation where certain backing or failure indices are not included in a snapshot is via using exclusions in the multi-target expression of the snapshot. For example:
```
PUT /_snapshot/my_repository/my-snapshot?wait_for_completion=true
{
"indices": "my-ds*", "-.fs-my-ds-000001"
}
```
Backport of #118834
In this PR we move the failure store configuration from the
`data_stream` in a composable index template to the `template` section
under the `data_stream_options`. The reason for this change is that we
want to be able to compose the failure store configuration via component
templates.
**Previous composable index template**
```
{
"index_patterns": [....],
"template": {
......
},
"data_stream": {
"failure_store": true
}
}
```
**New composable index template**
```
{
"index_patterns": [....],
"template": {
"data_stream_options": {
"failure_store": {
"enabled": true
}
},
......
},
"data_stream": {}
}
```
**Composition logic**
| Component Template A | Composable Index Template | Resolved data
stream options |
|-------------------------|-----------------------------|-------------------------------|
| `{"failure_store": {"enabled": true}}` | - |`{"failure_store":
{"enabled": true}}`| | `{"failure_store": {"enabled": true}}` |
`{"failure_store": {"enabled": false}}` |`{"failure_store": {"enabled":
false}}`| | - | `{"failure_store": {"enabled": true}}`
|`{"failure_store": {"enabled": true}}`| | `{"failure_store": null}` |
`{"failure_store": {"enabled": true}}` |`{"failure_store": {"enabled":
true}}`| | `{"failure_store": {"enabled": true}}` | `{"failure_store":
null}` |`{}`|
More context and discussions can be found in the comments of
#113863.
* Only aggregations require at least one shard request (#115314)
* unskipping shards only when aggs
* Update docs/changelog/115314.yaml
* fixed more tests
* null check for searchRequest.source()
(cherry picked from commit 7f573c6c28)
* applying #115774
* skipped test
* fixed test
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Here, we only need to extract the minimum and maximum values of the
timestamp field; therefore, using a stats searcher should suffice. This
is important for frozen indices.
* Run tsdb tests with both base and trial licenses.
* ignore license error in serverless
* update
* update
* update
(cherry picked from commit b97b6637a6)
**Introduction**
> In order to make adoption of failure stores simpler for all users, we
are introducing a new syntactical feature to index expression
resolution: The selector. > > Selectors, denoted with a :: followed by a
recognized suffix will allow users to specify which component of an
index abstraction they would like to operate on within an API call. In
this case, an index abstraction is a concrete index, data stream, or
alias; Any abstraction that can be resolved to a set of indices/shards.
We define a component of an index abstraction to be some searchable unit
of the index abstraction. > > To start, we will support two components:
data and failures. Concrete indices are their own data components, while
the data component for index aliases are all of the indices contained
therein. For data streams, the data component corresponds to their
backing indices. Data stream aliases mirror this, treating all backing
indices of the data streams they correspond to as their data component.
> > The failure component is only supported by data streams and data
stream aliases. The failure component of these abstractions refer to the
data streams' failure stores. Indices and index aliases do not have a
failure component.
For more details and examples see
https://github.com/elastic/elasticsearch/pull/113144. All this work has
been cherry picked from there.
**Purpose of this PR**
This PR is introducing the `::*` as another selector option and not as a
combination of `::data` and `::failure`. The reason for this change is
that we need to differentiate between:
- `my-index::*` which should resolve to `my-index::data` only and not to `my-index::failures` and
- a user explicitly requesting `my-index::data, my-index::failures` which should result potentially to an error.
In this PR we add a test and we fix the issues we encountered when we
enabled the failure store for TSDS and logsdb.
**Logsdb** Logsdb worked out of the box, so we just added the test that
indexes with a bulk request a couple of documents and tests how they are
ingested.
**TSDS** Here it was a bit trickier. We encountered the following
issues:
- TSDS requires a timestamp to determine the write index of the data stream meaning the failure happens earlier than we have anticipated so far. We added a special exception to detect this case and we treat it accordingly.
- The template of a TSDS data stream sets certain settings that we do not want to have in the failure store index. We added an allowlist that gets applied before we add the necessary index settings.
Furthermore, we added a test case to capture this.
In JDK 23 `Thread.resume` has been removed this means that we cannot use
`IntermittentLongGCDisruption` that depends on it.
We simulate the master node disruption with a `CyclicBarrier` that
blocks cluster state updates.
Closes: https://github.com/elastic/elasticsearch/issues/115045
The backport will close:
https://github.com/elastic/elasticsearch/issues/112634
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
**Introduction**
> In order to make adoption of failure stores simpler for all users, we
are introducing a new syntactical feature to index expression
resolution: The selector. > > Selectors, denoted with a :: followed by a
recognized suffix will allow users to specify which component of an
index abstraction they would like to operate on within an API call. In
this case, an index abstraction is a concrete index, data stream, or
alias; Any abstraction that can be resolved to a set of indices/shards.
We define a component of an index abstraction to be some searchable unit
of the index abstraction. > > To start, we will support two components:
data and failures. Concrete indices are their own data components, while
the data component for index aliases are all of the indices contained
therein. For data streams, the data component corresponds to their
backing indices. Data stream aliases mirror this, treating all backing
indices of the data streams they correspond to as their data component.
> > The failure component is only supported by data streams and data
stream aliases. The failure component of these abstractions refer to the
data streams' failure stores. Indices and index aliases do not have a
failure component.
For more details and examples see
https://github.com/elastic/elasticsearch/pull/113144. All this work has
been cherry picked from there.
**Purpose of this PR**
This PR is replacing the `FailureStoreOptions` with the
`SelectorOptions`, there shouldn't be any perceivable change to the user
since we kept the query parameter "failure_store" for now. It will be
removed in the next PR which will introduce the parsing of the
expressions.
_The current PR is just a refactoring and does not and should not change
any existing behaviour._
With logsdb another index mode is available, the isTimeSeries parameter is limiting. Instead, we should just push down the index mode from template to index settings provider.
Follow up from #113451
Relates to #113583
In this PR we introduce two endpoint PUT and GET to manage the data
stream options and consequently the failure store configuration on the
data stream level. This means that we can manage the failure store of
existing data streams.
The APIs look like:
```
# Enable/disable
PUT _data_stream/my-data-stream/_options
{
"failure_store": {
"enabled": true
}
}
# Remove existing configuration
DELETE _data_stream/my-data-stream/_options
# Retrieve
GET _data_stream/my-data-stream/_options
{
"failure_store": {
"enabled": true
}
}
```
Future work:
- Document the new APIs
- Convert `DataStreamOptionsIT.java` to a yaml test.
After introducing #113505 we need to remove `cluster.logsdb.enabled` from
`StackPlugin` and `StackTemplateRegistry`.
(cherry picked from commit 487c5562aa)
Co-authored-by: Salvatore Campagna <93581129+salvatore-campagna@users.noreply.github.com>
**Introduction**
> In order to make adoption of failure stores simpler for all users, we
are introducing a new syntactical feature to index expression
resolution: The selector. > > Selectors, denoted with a :: followed by a
recognized suffix will allow users to specify which component of an
index abstraction they would like to operate on within an API call. In
this case, an index abstraction is a concrete index, data stream, or
alias; Any abstraction that can be resolved to a set of indices/shards.
We define a component of an index abstraction to be some searchable unit
of the index abstraction. > > To start, we will support two components:
data and failures. Concrete indices are their own data components, while
the data component for index aliases are all of the indices contained
therein. For data streams, the data component corresponds to their
backing indices. Data stream aliases mirror this, treating all backing
indices of the data streams they correspond to as their data component.
> > The failure component is only supported by data streams and data
stream aliases. The failure component of these abstractions refer to the
data streams' failure stores. Indices and index aliases do not have a
failure component.
For more details and examples see
https://github.com/elastic/elasticsearch/pull/113144. All this work has
been cherry picked from there.
**Purpose of this PR**
This PR is replacing the the indices options boolean constructor with
the builders. The goal is to give me and the reviewer a very narrow
scope change when we can ensure we did not make any mistakes during the
conversion. Also it will reduce a bit the change list in
https://github.com/elastic/elasticsearch/pull/113144/files.
This removes the possibility for a plugin to provide factory retention settings. Factory retention settings have been deprecated and completely replaced by #111972.
Note: this feature is not in use. If someone wants to set global retention they can use the cluster settings as defined in #111972.
* Assert that REST params are consumed iff supported (#114040)
REST APIs which declare their supported parameters must consume exactly
those parameters: consuming an unsupported parameter means that requests
including that parameter will be rejected, whereas failing to consume a
supported parameter means that this parameter has no effect and should
be removed.
This commit adds an assertion to verify that we are consuming the
correct parameters.
Closes#113854
* CI poke
* Ensure that supported params are computed once
A few of today's REST handler implementations compute a new set of
supported parameters on each request. This is needlessly inefficient
since the set never changes. This commit fixes those implementations and
adds assertions to verify that we are returning the exact same instance
each time.
Backport of #113953 to `8.x`
* Fix
* Make xpack.otel_data.registry.enabled default to true
* Update build.gradle
* Disable otel-data plugin for tests where it's not relevant
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>