In this PR we add support for the failure store for system data streams.
Specifically:
- We pass the system descriptor so the failure index can be created based on that.
- We extend the tests to ensure it works
- We remove a guard we had but I wasn't able to test it because it only gets triggered if the data stream gets created right after a failure in the ingest pipeline, and I didn't see how to add one (yet).
- We extend the system data stream migration to ensure this is also working.
This commit adds support for system data streams reindexing. The system data stream migration extends the existing system indices migration task and uses the data stream reindex API.
The system index migration task starts a reindex data stream task and tracks its status every second. Only one system index or system data stream is migrated at a time. If a data stream migration fails, the entire system index migration task will also fail.
Port of #123926
Transport actions have associated request and response classes. However,
the base type restrictions are not necessary to duplicate when creating
a map of transport actions. Relatedly, the ActionHandler class doesn't
actually need strongly typed action type and classes since they are lost
when shoved into the node client map. This commit removes these type
restrictions and generic parameters.
* [main] Move system indices migration to migrate plugin
It seems the best way to fix#122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.
Port of #123551
* [CI] Auto commit changes from spotless
* Fix compilation
* Fix tests
* Fix test
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
This field is only used (by security) for requests, having it in responses is redundant.
Also, we have a couple of responses that are singletons/quasi-enums where setting the value
needlessly might introduce some strange contention even though it's a plain store.
This isn't just a cosmetic change. It makes it clear at compile time that each response instance
is exclusively defined by the bytes that it is read from. This makes it easier to reason about the
validity of suggested optimizations like https://github.com/elastic/elasticsearch/pull/120010
In the create-index-from-source action, we should set the cause of the create index request so that it is clear in the logs. Without setting the cause on the request, the default value of api is used.
When reindexing a data stream, the ILM metadata is copied from the index metadata of the source index to the destination index. But the ILM state of the new index can be stuck if the source index was in an AsyncAction at the time of reindexing. To un-stick the new index, we call TransportRetryAction to retry the AsyncAction. In the past this action would only run if the index were in the error phase. This change includes an update to TransportRetryAction, which allows it to be run when the index is not in an error phase, if the parameter requireError is set to false.
There is no guarantee that a project has non-null persistent tasks [0].
Null check is already done in most places. This PR adds it in a few more
places.
[0] Since health-node is now a cluster-scoped persistent task, it has
become more likely for the project-scoped tasks to be null.
Relates: #123262
In the multi-project branch, we're making some changes to persistent
tasks and those changes can cause the persistent tasks custom to still
be `null`. This resulted in an NPE here, so I'm fixing the check here.
When reindexing a data stream, we currently add a write block to the source indices so that new documents cannot be added to the index while it is being reindexed. A write block still allows the index to be deleted and for the metadata to be updated. It is possible that ILM could delete a backing index or update a backing index's lifecycle metadata while it is being reindexed. To avoid this, this PR sets a read-only block on the source index. This block must be removed before source index can be deleted after it is replaced with the destination index.
When reindexing a data stream, remove the lifecycle name setting when creating the destination index, so that ILM does not process it. Add the setting back after adding the destination index to the data stream, at which point ILM can safely process it.
* Allow data stream reindex tasks to be re-run after completion
* Docs update
* Update docs/reference/migration/apis/data-stream-reindex.asciidoc
Co-authored-by: Keith Massey <keith.massey@elastic.co>
---------
Co-authored-by: Keith Massey <keith.massey@elastic.co>
When reindexing data stream indices, parts of the index metadata needs to be copied from the source index to destination index, so that ILM and data stream lifecycle function properly. This adds a new CopyLifecycleIndexMetadataTransportAction which copies the following metadata from a source index to a destination index:
- creation date setting
- rollover info
- ILM custom metadata
Fix race condition test bugs related to the reindex-data-stream-pipeline. For tests that add doc without timestamp, then add mapping with timestamp, ensure green between adding doc and adding mapping. This makes sure that doc has been written to all shards and thus that timestamp validation does not occur while doc is being written to a shard. Delete pipeline in Before method, then wait for it to be re-created by the MigrateTemplateRegistry.
ReindexDataStreamIndexAction.cleanupCluster called EsIntegTestCase.cleanupCluster, but did not override it. This caused EsIntegTestCase.cleanupCluster to be called twice, once in ReindexDataStreamIndexAction.cleanupCluster and once when the After annotation is called on EsIntegTestCase.
It is possible to create an index in 7.x with a single type. This fixes the CreateIndexFromSourceAction to not copy that type over when creating a destination index from a source index with a type.
A future.actionGet was missing from the delete pipeline action execution in the test cleanup, causing all tests to fail intermittently. Also replace actionGet with safeGet.
Add the pipeline "reindex-data-stream-pipeline" to the reindex request within ReindexDataStreamIndexAction. This cleans up documents as needed before inserting into the destination index. Currently, the pipeline only sets a timestamp field with a value of 0, if the document is missing a timestamp field. This is needed because existing indices which are added to a data stream may not contain a timestamp, but reindex validates that a timestamp field exists when creating data stream destination indices.
This pipeline is managed by ES, but can be overriden by users if necessary. To do this, the version field of the pipeline should be set to a value higher than the MigrateRegistry version.
* Fix incorrect value of reindex_required flag on ignored index warning
* Datastream reindex now uses unverified write block to allow retrying failed reindex
Assertion was using reference equality on two boxed longs. So assertion could produce false positives. Change to Objects.equals to check value and avoid null check.
Add step to the ReindexDatastreamIndexAction which refreshes the source index after setting it to read-only but before calling reindex. Without doing a refresh it is possible for docs from the source index to be missing from the destination index. This happens because the docs arrived before the source index is set to read-only, but because the index hasn't refreshed, the reindex action cannot see these updates.