* Move system indices migration to migrate plugin (#123551)
It seems the best way to fix #122949 is to use the existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.
(cherry picked from commit 0a769c8391)
* Restore tests
I have run this many times locally, and it never failed. Maybe there is
something "magical" in CI.
Added some additional info in the assertion logging.
(cherry picked from commit 894db68357)
With the introduction of our new backing algorithm, and with rescoring made
easier by the `rescore_vector` API, let's mark BBQ as GA.
Additionally, this commit adds rolling upgrade tests to ensure
stability.
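As a purely illustrative sketch (the index name, field name, the `bbq_hnsw` index-option name, and the `oversample` parameter are assumptions here, not taken from this commit), a BBQ-quantized vector field combined with `rescore_vector` oversampling might look like:
```
PUT /bbq-demo
{
  "mappings": {
    "properties": {
      "emb": {
        "type": "dense_vector",
        "dims": 64,
        "index_options": { "type": "bbq_hnsw" }
      }
    }
  }
}

POST /bbq-demo/_search
{
  "knn": {
    "field": "emb",
    "query_vector": [0.12, -0.34, ...],
    "k": 10,
    "num_candidates": 100,
    "rescore_vector": { "oversample": 2.0 }
  }
}
```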
When marking an index read-only, we now flush and mark the index as verified, guaranteeing
that we can upgrade safely to the next version with N-1 indices (which then become N-2).
Use this in the deprecation check.
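For reference, the usual way to mark an index read-only ahead of an upgrade is the add index block API; a minimal example (index name illustrative) of the kind of request that would now trigger the flush-and-verify behavior described above:
```
PUT /my-old-index/_block/write
```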
* Skip HealthNodeUpgradeIT for some rolling upgrades
This skips part of the `HealthNodeUpgradeIT` test for the rolling
upgrade tests which use a cluster with a mix of 8.5.x and 8.6.x nodes,
which serve the health endpoint at `_internal/_health`, and 8.last
nodes, which serve it at `_health_report`. There is no sensible and
reliable way to test the endpoint in such clusters.
Closes #118157
Closes #118158
This reverts #117106. BWC tests fail because older nodes are killed with the following error:
```
[2024-11-20T10:54:58,600][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [v8.17.0-0] fatal error in thread [elasticsearch[v8.17.0-0
][clusterApplierService#updateTask][T#1]], exiting java.lang.AssertionError: provided source [{"_doc":{"_data_stream_timestamp":{"enabled":true},"_source":{},"properties":{"@timestamp":{"type":"date"},"k8s":{"properties":{"pod":{"properties":{"ip":{"type":"ip"},"name":{"type":"keyword"},"network":{"properties":{"rx":{"type":"long"},"tx":{"type":"long"}}},"uid":{"type":"keyword","time_series_dimension":true}}}}},"metricset":{"type":"keyword","time_series_dimension":true}}}}] differs from mapping [{"_doc":{"_data_stream_timestamp":{"enabled":true},"_source":{"mode":"synthetic"},"properties":{"@timestamp":{"type":"date"},"k8s":{"properties":{"pod":{"properties":{"ip":{"type":"ip"},"name":{"type":"keyword"},"network":{"properties":{"rx":{"type":"long"},"tx":{"type":"long"}}},"uid":{"type":"keyword","time_series_dimension":true}}}}},"metricset":{"type":"keyword","time_series_dimension":true}}}}]
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.index.mapper.DocumentMapper.<init>(DocumentMapper.java:66)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.newDocumentMapper(MapperService.java:588)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.updateMapping(MapperService.java:346)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.index.IndexService.updateMapping(IndexService.java:840)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndicesAndUpdateShards(IndicesClusterStateService.java:583)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.indices.cluster.IndicesClusterStateService.doApplyClusterState(IndicesClusterStateService.java:306)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:260)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:544)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:530)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:503)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:157)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:956)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:218)
at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:184)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.base/java.lang.Thread.run(Thread.java:1575)
```
The `mode` parameter no longer gets serialized for new indices. However, older nodes still serialize the `mode` parameter, which caused the mentioned assertion to fail. Reverting for now while we figure out how best to address this BWC serialization issue.
We can only stop serializing `mode` when all nodes are on the same version. Unfortunately, we can't invoke `c.clusterTransportVersion().get()` from the parser or builder, because the calling thread isn't allowed to call `clusterService.state()`.
This PR modifies `TransportVersionsFixupListener` to include all
compatibility versions (not only TransportVersion) in the fixup.
`TransportVersionsFixupListener` detects cases where the master (along
with the non-master nodes) has been upgraded to the most recent code
version, but some nodes are still missing a "proper" (non-inferred)
TransportVersion. This PR adds another check to also ensure that we have
real (non-empty) system index mapping versions.
To do so, it modifies NodeInfo so it carries all of
CompatibilityVersions (TransportVersion +
SystemIndexDescriptor.MappingVersions).
This was initially done via a separate fixup listener and an ad-hoc
transport action, but the two listeners "raced" to update ClusterState
on the same CompatibilityVersions structure; it just made sense to do
both at the same time.
The fixup is very similar to
https://github.com/elastic/elasticsearch/pull/110710, which does the
same for cluster features; plus, it adds a CI test to cover the bug
raised in https://github.com/elastic/elasticsearch/issues/112694
Closes https://github.com/elastic/elasticsearch/issues/112694
Backport #115639 to the 8.x branch.
The main difference from other rolling upgrade tests is that these tests index more data while performing the rolling upgrade, and no rollover is performed during the rolling upgrade. This, for example, makes it more likely for merging to happen, which could uncover BWC bugs.
Note that currently both test suites start a trial license so that synthetic source gets used.
* Backport
* Version fix
* Another
* Fix
* Fix again
* Skip
* One more
* Formatting fix
---------
Co-authored-by: Johannes Fredén <109296772+jfreden@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Fixes some faulty assertions in an upgrade test. Test failures only
manifest on the 8.16 branch since 9.x does not qualify for these upgrade
tests, and the change has not been backported to 8.17 yet (due to unrelated CI
failures).
I validated this works by running it locally from the 8.16 branch.
Resolves: https://github.com/elastic/elasticsearch/issues/115410
Resolves: https://github.com/elastic/elasticsearch/issues/115411
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Multiple @Before methods in JUnit are run in random order. This commit
consolidates the @Before methods of ParameterizedRollingUpgradeTestCase
since the code has interdependencies.
Closes #114330
This PR exposes operator-defined, cluster-state role mappings in the
[Get role mappings
API](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-api-get-role-mapping.html).
Cluster-state role mappings are returned with a reserved suffix
`-read-only-operator-mapping`, to disambiguate them from native role
mappings stored in the security index. CS role mappings are also marked with a
`_read_only` metadata flag. It's possible to query a CS role mapping
using its name both with and without the suffix.
CS role mappings can be viewed via the API, but cannot be modified. To
clarify this, the PUT and DELETE role mapping endpoints return header
warnings if native role mappings that name-clash with CS role mappings
are created, modified, or deleted.
The PR also prevents the creation of role mappings with names ending in
`-read-only-operator-mapping` to ensure that CS role mappings and native
role mappings can always be fully disambiguated.
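As a hypothetical lookup (the mapping name is illustrative), a cluster-state role mapping can be fetched with or without the reserved suffix:
```
# Both requests return the same cluster-state role mapping; the response
# carries the `_read_only` metadata flag described above.
GET /_security/role_mapping/kibana-operator-mapping
GET /_security/role_mapping/kibana-operator-mapping-read-only-operator-mapping
```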
Finally, the PR changes how CS role mappings are persisted in
cluster-state. CS role mappings are written (and read from disk) in the
`XContent` format. This format omits the role mapping's name. This means
that if CS role mappings are ever recovered from disk (e.g., during a
master-node restart), their names are erased. To address this, this PR
changes CS role mapping serialization to persist the name of a mapping
in a reserved metadata field, and to recover it from that metadata when the
mapping is read back. This allows us to persist the name without BWC breaks in the
role mapping `XContent` format. It also allows us to ensure that role
mappings are re-written to cluster state in the new, name-preserving
format the first time operator file settings are processed.
Depends on: https://github.com/elastic/elasticsearch/pull/114295
Relates: ES-9628
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* Reprocess operator file settings on service start (#114295)
Changes `FileSettingsService` to reprocess file settings on every
restart or master node change, even if versions match between file and
cluster-state metadata. If the file version is lower than the metadata
version, processing is still skipped to avoid applying stale settings.
This makes it easier for consumers of file settings to change their
behavior w.r.t. file settings contents. For instance, an update of how
role mappings are stored will automatically apply on the next restart,
without the need to manually increment the file settings version to
force reprocessing.
Relates: ES-9628
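For context, a minimal sketch of an operator settings file (the contents are illustrative); with this change, its `metadata.version` no longer needs to be bumped just to force reprocessing after a restart or master change:
```
{
  "metadata": {
    "version": "2",
    "compatibility": "8.4.0"
  },
  "state": {
    "cluster_settings": {
      "indices.recovery.max_bytes_per_sec": "50mb"
    }
  }
}
```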
* Backport 114295
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Test LogsDB backward compatibility with a rolling upgrade and a full cluster restart.
We start indexing logs into a `standard` index, then switch to a `LogsDB` index.
We also improve the existing test that switches between the two index modes, `standard`
and `logs`.
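A minimal sketch of the two index flavors the test switches between (index names are illustrative, `standard` is simply the default index mode, and this assumes `logsdb` can be set directly on a plain index):
```
PUT /logs-standard-demo

PUT /logs-logsdb-demo
{
  "settings": {
    "index.mode": "logsdb"
  }
}
```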
This adds a new quantization mechanism for HNSW and flat indices. Here
we add `int4` quantization via the `int4_hnsw` and `int4_flat` index
types. This quantization methodology further reduces the memory required
for fast HNSW, meaning that the memory required is 8x smaller than with
regular float32 values.
An 8x reduction means that 1M 1024-dimension vectors go from requiring
3.8GB to 477MB.
Recall stays steady; there is some reduction, but it is recoverable via
slight oversampling and reranking. For example, over 500k CohereV3
vectors, only 5 extra vectors need to be gathered to achieve over 0.98
recall in a brute-force scenario.
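A minimal mapping sketch using one of the new index types (index name, field name, dims, and similarity are illustrative):
```
PUT /int4-demo
{
  "mappings": {
    "properties": {
      "emb": {
        "type": "dense_vector",
        "dims": 1024,
        "index": true,
        "similarity": "cosine",
        "index_options": { "type": "int4_hnsw" }
      }
    }
  }
}
```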

Currently these tests run against any old cluster older than 8.0.0, but
the fix that allowed `index.mapper.dynamic` to exist is only available
in 7.17.22.
Adjust these tests to only run if the old cluster is after version 7.17.21
and before 8.0.0.
Currently, when upgrading a 7.x cluster to 8.x with the
`index.mapper.dynamic` index setting defined, the following happens:
- In case of a full cluster restart upgrade, the index setting gets archived and after the upgrade the cluster health is green.
- In case of a rolling upgrade, shards of indices with the index setting fail to allocate as nodes start on the 8.x version. The result is that the cluster health is red and the index setting isn't archived. Closing and reopening the index should archive the index setting and allocate the shards.
This change ensures the same behavior when upgrading a
cluster from 7.x to 8.x with indices that have the
`index.mapper.dynamic` index setting defined. By re-defining the
`index.mapper.dynamic` index setting with the
`IndexSettingDeprecatedInV7AndRemovedInV8` property, the setting is
allowed to exist on 7.x indices, but can't be defined on new indices
after the upgrade. This way we don't have to rely on index archiving, and
upgrading via full cluster restart or rolling restart yields the
same outcome.
Based on the test in #109301. Relates to #109160 and #96075
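For illustration only, the setting in question is the per-index `index.mapper.dynamic` flag, e.g. (assuming a 7.x node that still accepts it, with a deprecation warning; the index name is illustrative):
```
PUT /legacy-index
{
  "settings": {
    "index.mapper.dynamic": false
  }
}
```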
To simplify the migration away from version-based skip checks in YAML specs,
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.
New test specs for 8.14 or later are expected to use the respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if that is sufficient.
This commit moves the legacy YAML rolling upgrade tests for vectors to the new rolling upgrade package.
It also adds rolling upgrade tests for `int8_hnsw`.
Simple test: it sets up downsampling to run in the old cluster, then waits
for it to complete and verifies that downsampled indices can be queried
in the mixed and upgraded clusters.
A Lucene limitation on doc values for UTF-8 fields does not allow us to
write keyword fields whose size is larger than 32K. This limits our
ability to map more than a certain number of dimension fields for time
series indices. Before this change, the tsid was created as a
concatenation of dimension field names and values into a keyword field.
To overcome this limitation we hash the tsid. This PR is intended to be
used as a draft to test different options.
Note that, as a side effect, this reduces the size of the tsid field,
since far less data is stored when the tsid is hashed. However, we
expect tsid hashing to affect doc-values compression, resulting in a
larger storage footprint. The effect on query latency needs to be
evaluated too.
Resolves #93564
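To make the dimension/tsid relationship concrete, here is a minimal time-series index sketch (field names borrowed from the mapping in the stack trace above; index name and settings are illustrative). The `_tsid` is derived from the fields marked `time_series_dimension`; before this change it was their literal name/value concatenation, after it is a hash:
```
PUT /tsdb-demo
{
  "settings": {
    "index.mode": "time_series",
    "index.routing_path": ["metricset", "k8s.pod.uid"]
  },
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" },
      "metricset": { "type": "keyword", "time_series_dimension": true },
      "k8s": {
        "properties": {
          "pod": {
            "properties": {
              "uid": { "type": "keyword", "time_series_dimension": true }
            }
          }
        }
      }
    }
  }
}
```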
Deprecated the node_version field and made it optional (unused) in the new parser.
Added a deprecation warning handler for mixed clusters.
Split tests for the old vs. current format.
Add the ability to test for the original/old cluster features during a rolling upgrade.
* Move ALL_FEATURES to ESRestTestCase (and make it private, since that is its only usage)