Move metadata storage to Lucene (#50907)

Today we split the on-disk cluster metadata across many files: one file for the metadata of each index, plus one file for the global metadata and another for the manifest. Most metadata updates only touch a few of these files, but some must write them all. If a node holds a large number of indices then it's possible its disks are not fast enough to process a complete metadata update before timing out. In severe cases affecting master-eligible nodes this can prevent an election from succeeding.

This commit uses Lucene as a metadata storage for the cluster state, and is a squashed version of the following PRs that were targeting a feature branch:

* Introduce Lucene-based metadata persistence (#48733)

  This commit introduces `LucenePersistedState` which master-eligible nodes can use to persist the cluster metadata in a Lucene index rather than in many separate files.

  Relates #48701

* Remove per-index metadata without assigned shards (#49234)

  Today on master-eligible nodes we maintain per-index metadata files for every index. However, we also keep this metadata in the `LucenePersistedState`, and only use the per-index metadata files for importing dangling indices. There is no point in importing a dangling index without any shard data, so we do not need to maintain these extra files any more. This commit removes per-index metadata files from nodes which do not hold any shards of those indices.

  Relates #48701

* Use Lucene exclusively for metadata storage (#50144)

  This moves metadata persistence to Lucene for all node types. It also reenables BWC and adds an interoperability layer for upgrades from prior versions.

  This commit disables a number of tests related to dangling indices and command-line tools. Those will be addressed in follow-ups.

  Relates #48701

* Add command-line tool support for Lucene-based metadata storage (#50179)

  Adds command-line tool support (unsafe-bootstrap, detach-cluster, repurpose, & shard commands) for the Lucene-based metadata storage.

  Relates #48701

* Use single directory for metadata (#50639)

  Earlier PRs for #48701 introduced a separate directory for the cluster state. This is not needed though, and introduces an additional unnecessary cognitive burden to the users.

  Co-Authored-By: David Turner <david.turner@elastic.co>

* Add async dangling indices support (#50642)

  Adds support for writing out dangling indices in an asynchronous way. Also provides an option to avoid writing out dangling indices at all.

  Relates #48701

* Fold node metadata into new node storage (#50741)

  Moves node metadata to use the new storage mechanism (see #48701) as the authoritative source.

* Write CS asynchronously on data-only nodes (#50782)

  Writes cluster states out asynchronously on data-only nodes. The main reason for writing out the cluster state at all is so that the data-only nodes can snap into a cluster, so that they can do a bit of bootstrap validation, and so that the shard recovery tools work.

  Cluster states that are written asynchronously have their voting configuration adapted to a non-existent configuration so that these nodes cannot mistakenly become master even if their node role is changed back and forth.

  Relates #48701

* Remove persistent cluster settings tool (#50694)

  Adds the elasticsearch-node remove-settings tool to remove persistent settings from the on-disk cluster state in cases where it contains incompatible settings that prevent the cluster from forming.

  Relates #48701

* Make cluster state writer resilient to disk issues (#50805)

  Adds handling to make the cluster state writer resilient to disk issues.

  Relates to #48701

* Omit writing global metadata if no change (#50901)

  Uses the same optimization for the new cluster state storage layer as the old one, writing global metadata only when changed. Avoids writing out the global metadata if none of the persistent fields changed. Speeds up server:integTest by ~10%.

  Relates #48701

* DanglingIndicesIT should ensure node removed first (#50896)

  These tests occasionally failed because the deletion was submitted before the restarting node was removed from the cluster, causing the deletion not to be fully acked. This commit fixes this by checking the restarting node has been removed from the cluster.

Co-authored-by: David Turner <david.turner@elastic.co>
2025-06-28 17:34:17 -04:00 · 2020-01-13 14:10:02 +01:00 · a0513217db
commit a0513217db
parent 39a5d755c9
51 changed files with 3402 additions and 1095 deletions
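
The documentation added in this commit says that `elasticsearch-node remove-settings` takes a list of settings and also supports wildcard patterns. The following is a minimal sketch of what such pattern-based selection could look like, not the tool's actual implementation. It assumes that `*` matches any run of characters and that every other character is literal; the helper names `simple_match` and `remove_settings` and the sample settings values are hypothetical.

```python
import re


def simple_match(pattern: str, key: str) -> bool:
    """Return True if key matches pattern; '*' matches any run of characters.

    Assumption: only '*' is special. All other characters, including '.',
    are matched literally.
    """
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, key) is not None


def remove_settings(persistent: dict, patterns: list) -> tuple:
    """Split persistent settings into (kept, removed) using the patterns."""
    removed = {
        key: value
        for key, value in persistent.items()
        if any(simple_match(pattern, key) for pattern in patterns)
    }
    kept = {key: value for key, value in persistent.items() if key not in removed}
    return kept, removed


# Hypothetical persistent settings, for illustration only.
settings = {
    "xpack.monitoring.exporters.my_exporter.host": "10.1.2.3",
    "xpack.monitoring.collection.enabled": "true",
    "cluster.routing.allocation.enable": "all",
}

kept, removed = remove_settings(settings, ["xpack.monitoring.*"])
for key in sorted(removed):
    print(f"would remove {key}: {removed[key]!r}")
```

Returning the removed entries separately mirrors the confirmation step shown in the documented example session, where the tool lists the settings it is about to remove before asking whether to proceed.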
--- a/docs/reference/commands/node-tool.asciidoc
+++ b/docs/reference/commands/node-tool.asciidoc
@@ -3,9 +3,9 @@

 The `elasticsearch-node` command enables you to perform certain unsafe
 operations on a node that are only possible while it is shut down. This command
-allows you to adjust the <<modules-node,role>> of a node and may be able to
-recover some data after a disaster or start a node even if it is incompatible
-with the data on disk.
+allows you to adjust the <<modules-node,role>> of a node, unsafely edit cluster
+settings and may be able to recover some data after a disaster or start a node
+even if it is incompatible with the data on disk.

 [float]
 === Synopsis
@@ -20,13 +20,17 @@ bin/elasticsearch-node repurpose|unsafe-bootstrap|detach-cluster|override-versio
 [float]
 === Description

-This tool has four modes:
+This tool has five modes:

 * `elasticsearch-node repurpose` can be used to delete unwanted data from a
  node if it used to be a <<data-node,data node>> or a
  <<master-node,master-eligible node>> but has been repurposed not to have one
  or other of these roles.

+* `elasticsearch-node remove-settings` can be used to remove persistent settings
+   from the cluster state in cases where it contains incompatible settings that
+   prevent the cluster from forming.
+
 * `elasticsearch-node unsafe-bootstrap` can be used to perform _unsafe cluster
  bootstrapping_.  It forces one of the nodes to form a brand-new cluster on
  its own, using its local copy of the cluster metadata.
@@ -76,6 +80,26 @@ The tool provides a summary of the data to be deleted and asks for confirmation
 before making any changes. You can get detailed information about the affected
 indices and shards by passing the verbose (`-v`) option.

+[float]
+==== Removing persistent cluster settings
+
+There may be situations where a node contains persistent cluster
+settings that prevent the cluster from forming. Since the cluster cannot form,
+it is not possible to remove these settings using the
+<<cluster-update-settings>> API.
+
+The `elasticsearch-node remove-settings` tool allows you to forcefully remove
+those persistent settings from the on-disk cluster state. The tool takes as
+parameters a list of settings to remove, and also supports wildcard
+patterns.
+
+The intended use is:
+
+* Stop the node
+* Run `elasticsearch-node remove-settings name-of-setting-to-remove` on the node
+* Repeat for all other master-eligible nodes
+* Start the nodes
+
 [float]
 ==== Recovering data after a disaster

@@ -143,9 +167,9 @@ If there is at least one remaining master-eligible node, but it is not possible
 to restart a majority of them, then the `elasticsearch-node unsafe-bootstrap`
 command will unsafely override the cluster's <<modules-discovery-voting,voting
 configuration>> as if performing another
-<<modules-discovery-bootstrap-cluster,cluster bootstrapping process>>. 
+<<modules-discovery-bootstrap-cluster,cluster bootstrapping process>>.
 The target node can then form a new cluster on its own by using
-the cluster metadata held locally on the target node. 
+the cluster metadata held locally on the target node.

 [WARNING]
 These steps can lead to arbitrary data loss since the target node may not hold the latest cluster
@@ -290,6 +314,9 @@ it can join a different cluster.
 `override-version`:: Overwrites the version number stored in the data path so
 that a node can start despite being incompatible with the on-disk data.

+`remove-settings`:: Forcefully removes the provided persistent cluster settings
+from the on-disk cluster state.
+
 `-E <KeyValuePair>`:: Configures a setting.

 `-h, --help`:: Returns all of the command parameters.
@@ -346,6 +373,40 @@ Confirm [y/N] y
 Node successfully repurposed to no-master and no-data.
 ----

+[float]
+==== Removing persistent cluster settings
+
+If your nodes contain persistent cluster settings that prevent the cluster
+from forming, and which therefore cannot be removed using the <<cluster-update-settings>> API,
+you can run the following commands to remove one or more cluster settings.
+
+[source,txt]
+----
+node$ ./bin/elasticsearch-node remove-settings xpack.monitoring.exporters.my_exporter.host
+
+    WARNING: Elasticsearch MUST be stopped before running this tool.
+
+The following settings will be removed:
+xpack.monitoring.exporters.my_exporter.host: "10.1.2.3"
+
+You should only run this tool if you have incompatible settings in the
+cluster state that prevent the cluster from forming.
+This tool can cause data loss and its use should be your last resort.
+
+Do you want to proceed?
+
+Confirm [y/N] y
+
+Settings were successfully removed from the cluster state
+----
+
+You can also use wildcards to remove multiple settings, for example:
+
+[source,txt]
+----
+node$ ./bin/elasticsearch-node remove-settings xpack.monitoring.*
+----
+
 [float]
 ==== Unsafe cluster bootstrapping