[8.x] [DOCS] Concept cleanup 2 - ES settings (#119373) (#119642)

shainaraskas 2025-01-10 10:31:16 -05:00 committed by GitHub
parent 8a14c1468d
commit ae3db6042a
51 changed files with 959 additions and 856 deletions

View file

@ -27,7 +27,23 @@ include::cluster/shards_allocation.asciidoc[]
include::cluster/disk_allocator.asciidoc[]
include::cluster/allocation_awareness.asciidoc[]
[[shard-allocation-awareness-settings]]
==== Shard allocation awareness settings
You can use <<custom-node-attributes,custom node attributes>> as _awareness attributes_ to enable {es}
to take your physical hardware configuration into account when allocating shards.
If {es} knows which nodes are on the same physical server, in the same rack, or
in the same zone, it can distribute the primary shard and its replica shards to
minimize the risk of losing all shard copies in the event of a failure. <<shard-allocation-awareness,Learn more about shard allocation awareness>>.
`cluster.routing.allocation.awareness.attributes`::
(<<dynamic-cluster-setting,Dynamic>>)
The node attributes that {es} should use as awareness attributes. For example, if you have a `rack_id` attribute that specifies the rack in which each node resides, you can set this setting to `rack_id` to ensure that primary and replica shards are not allocated on the same rack. You can specify multiple attributes as a comma-separated list.
`cluster.routing.allocation.awareness.force.*`::
(<<dynamic-cluster-setting,Dynamic>>)
The shard allocation awareness values that must exist for shards to be reallocated in case of location failure. Learn more about <<forced-awareness,forced awareness>>.
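For example, one possible way to set both values with the cluster update settings API (the `rack_id` attribute and its `rack_one,rack_two` values are illustrative):

[source,console]
------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.awareness.attributes": "rack_id",
    "cluster.routing.allocation.awareness.force.rack_id.values": "rack_one,rack_two"
  }
}
------------------------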
include::cluster/allocation_filtering.asciidoc[]

View file

@ -1,5 +1,5 @@
[[shard-allocation-awareness]]
==== Shard allocation awareness
== Shard allocation awareness
You can use custom node attributes as _awareness attributes_ to enable {es}
to take your physical hardware configuration into account when allocating shards.
@ -7,12 +7,7 @@ If {es} knows which nodes are on the same physical server, in the same rack, or
in the same zone, it can distribute the primary shard and its replica shards to
minimize the risk of losing all shard copies in the event of a failure.
When shard allocation awareness is enabled with the
<<dynamic-cluster-setting,dynamic>>
`cluster.routing.allocation.awareness.attributes` setting, shards are only
allocated to nodes that have values set for the specified awareness attributes.
If you use multiple awareness attributes, {es} considers each attribute
separately when allocating shards.
When shard allocation awareness is enabled with the `cluster.routing.allocation.awareness.attributes` setting, shards are only allocated to nodes that have values set for the specified awareness attributes. If you use multiple awareness attributes, {es} considers each attribute separately when allocating shards.
NOTE: The number of attribute values determines how many shard copies are
allocated in each location. If the number of nodes in each location is
@ -22,11 +17,11 @@ unassigned.
TIP: Learn more about <<high-availability-cluster-design-large-clusters,designing resilient clusters>>.
[[enabling-awareness]]
===== Enabling shard allocation awareness
=== Enabling shard allocation awareness
To enable shard allocation awareness:
. Specify the location of each node with a custom node attribute. For example,
. Specify the location of each node with a <<custom-node-attributes,custom node attribute>>. For example,
if you want Elasticsearch to distribute shards across different racks, you might
use an awareness attribute called `rack_id`.
+
@ -94,7 +89,7 @@ copies of a particular shard from being allocated in the same location, you can
enable forced awareness.
[[forced-awareness]]
===== Forced awareness
=== Forced awareness
By default, if one location fails, {es} spreads its shards across the remaining
locations. This might be undesirable if the cluster does not have sufficient

View file

@ -6,7 +6,7 @@ allocates shards from any index. These cluster wide filters are applied in
conjunction with <<shard-allocation-filtering, per-index allocation filtering>>
and <<shard-allocation-awareness, allocation awareness>>.
Shard allocation filters can be based on custom node attributes or the built-in
Shard allocation filters can be based on <<custom-node-attributes,custom node attributes>> or the built-in
`_name`, `_host_ip`, `_publish_ip`, `_ip`, `_host`, `_id` and `_tier` attributes.
The `cluster.routing.allocation` settings are <<dynamic-cluster-setting,dynamic>>, enabling live indices to
@ -59,9 +59,9 @@ The cluster allocation settings support the following built-in attributes:
NOTE: `_tier` filtering is based on <<modules-node, node>> roles. Only
a subset of roles that are <<data-tiers, data tier>> roles, but the generic
<<data-node, data role>> will match any tier filtering.
a subset of roles are <<data-tiers, data tier>> roles, and the generic
<<data-node-role, data role>> will match any tier filtering.
You can use wildcards when specifying attribute values, for example:
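A minimal sketch (the IP pattern is illustrative; it keeps shards off any node whose IP matches the wildcard):

[source,console]
------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.exclude._ip": "192.168.2.*"
  }
}
------------------------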

View file

@ -41,6 +41,23 @@ on the affected node drops below the high watermark, {es} automatically removes
the write block. Refer to <<fix-watermark-errors,Fix watermark errors>> to
resolve persistent watermark errors.
[NOTE]
.Max headroom settings
===================================================
Max headroom settings apply only when watermark settings are percentages or ratios.
A max headroom value caps the required free disk space before the respective watermark is hit.
This is useful for servers with larger disks, where a percentage or ratio watermark could otherwise translate to an overly large free disk space requirement.
For example, where `cluster.routing.allocation.disk.watermark.flood_stage` is 95% and `cluster.routing.allocation.disk.watermark.flood_stage.max_headroom` is 100GB, this means that:
* For a smaller disk, for example 100GB, the flood watermark is hit at 95%, that is, at 5GB of free space, because 5GB is smaller than the 100GB max headroom value.
* For a larger disk, for example 100TB, the flood watermark is hit at 100GB of free space. That is because the 95% flood watermark alone would require 5TB of free disk space, which the max headroom setting caps at 100GB.
Max headroom settings have default values only if their respective watermark settings are not explicitly set. If you explicitly set a watermark, the corresponding max headroom loses its default value and must be set explicitly if you need one.
===================================================
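For instance, a minimal sketch of explicitly setting the flood stage watermark together with its max headroom (the values are illustrative and must satisfy the ordering checks described below):

[source,console]
------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.flood_stage": "97%",
    "cluster.routing.allocation.disk.watermark.flood_stage.max_headroom": "100GB"
  }
}
------------------------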
[[disk-based-shard-allocation-does-not-balance]]
[TIP]
====
@ -100,18 +117,7 @@ is now `true`. The setting will be removed in a future release.
+
--
(<<dynamic-cluster-setting,Dynamic>>)
Controls the flood stage watermark, which defaults to 95%. {es} enforces a read-only index block (`index.blocks.read_only_allow_delete`) on every index that has one or more shards allocated on the node, and that has at least one disk exceeding the flood stage. This setting is a last resort to prevent nodes from running out of disk space. The index block is automatically released when the disk utilization falls below the high watermark. Similarly to the low and high watermark values, it can alternatively be set to a ratio value, e.g., `0.95`, or an absolute byte value.
An example of resetting the read-only index block on the `my-index-000001` index:
[source,console]
--------------------------------------------------
PUT /my-index-000001/_settings
{
"index.blocks.read_only_allow_delete": null
}
--------------------------------------------------
// TEST[setup:my_index]
Controls the flood stage watermark, which defaults to 95%. {es} enforces a read-only index block (<<index-block-settings,`index.blocks.read_only_allow_delete`>>) on every index that has one or more shards allocated on the node, and that has at least one disk exceeding the flood stage. This setting is a last resort to prevent nodes from running out of disk space. The index block is automatically released when the disk utilization falls below the high watermark. Similarly to the low and high watermark values, it can alternatively be set to a ratio value, e.g., `0.95`, or an absolute byte value.
--
// end::cluster-routing-flood-stage-tag[]
@ -121,10 +127,10 @@ Defaults to 100GB when
`cluster.routing.allocation.disk.watermark.flood_stage` is not explicitly set.
This caps the amount of free space required.
NOTE: You cannot mix the usage of percentage/ratio values and byte values across
NOTE: You can't mix the usage of percentage/ratio values and byte values across
the `cluster.routing.allocation.disk.watermark.low`, `cluster.routing.allocation.disk.watermark.high`,
and `cluster.routing.allocation.disk.watermark.flood_stage` settings. Either all values
are set to percentage/ratio values, or all are set to byte values. This enforcement is
must be set to percentage/ratio values, or all must be set to byte values. This is required
so that {es} can validate that the settings are internally consistent, ensuring that the
low disk threshold is less than the high disk threshold, and the high disk threshold is
less than the flood stage threshold. A similar comparison check is done for the max
@ -150,44 +156,6 @@ set. This caps the amount of free space required on dedicated frozen nodes.
cluster. Defaults to `30s`.
NOTE: Percentage values refer to used disk space, while byte values refer to
free disk space. This can be confusing, since it flips the meaning of high and
free disk space. This can be confusing, because it flips the meaning of high and
low. For example, it makes sense to set the low watermark to 10gb and the high
watermark to 5gb, but not the other way around.
An example of updating the low watermark to at least 100 gigabytes free, a high
watermark of at least 50 gigabytes free, and a flood stage watermark of 10
gigabytes free, and updating the information about the cluster every minute:
[source,console]
--------------------------------------------------
PUT _cluster/settings
{
"persistent": {
"cluster.routing.allocation.disk.watermark.low": "100gb",
"cluster.routing.allocation.disk.watermark.high": "50gb",
"cluster.routing.allocation.disk.watermark.flood_stage": "10gb",
"cluster.info.update.interval": "1m"
}
}
--------------------------------------------------
Concerning the max headroom settings for the watermarks, please note
that these apply only in the case that the watermark settings are percentages/ratios.
The aim of a max headroom value is to cap the required free disk space before hitting
the respective watermark. This is especially useful for servers with larger
disks, where a percentage/ratio watermark could translate to a big free disk space requirement,
and the max headroom can be used to cap the required free disk space amount.
As an example, let us take the default settings for the flood watermark.
It has a 95% default value, and the flood max headroom setting has a default value of 100GB.
This means that:
* For a smaller disk, e.g., of 100GB, the flood watermark will hit at 95%, meaning at 5GB
of free space, since 5GB is smaller than the 100GB max headroom value.
* For a larger disk, e.g., of 100TB, the flood watermark will hit at 100GB of free space.
That is because the 95% flood watermark alone would require 5TB of free disk space, but
that is capped by the max headroom setting to 100GB.
Finally, the max headroom settings have their default values only if their respective watermark
settings are not explicitly set (thus, they have their default percentage values).
If watermarks are explicitly set, then the max headroom settings do not have their default values,
and would need to be explicitly set if they are desired.
watermark to 5gb, but not the other way around.

View file

@ -1,6 +1,9 @@
[[misc-cluster-settings]]
=== Miscellaneous cluster settings
[[cluster-name]]
include::{es-ref-dir}/setup/important-settings/cluster-name.asciidoc[]
[discrete]
[[cluster-read-only]]
==== Metadata

View file

@ -2,7 +2,7 @@
=== Bootstrapping a cluster
Starting an Elasticsearch cluster for the very first time requires the initial
set of <<master-node,master-eligible nodes>> to be explicitly defined on one or
set of <<master-node-role,master-eligible nodes>> to be explicitly defined on one or
more of the master-eligible nodes in the cluster. This is known as _cluster
bootstrapping_. This is only required the first time a cluster starts up.
Freshly-started nodes that are joining a running cluster obtain this

View file

@ -1,5 +1,23 @@
[[cluster-state-overview]]
=== Cluster state
The _cluster state_ is an internal data structure which keeps track of a
variety of information needed by every node, including:
* The identity and attributes of the other nodes in the cluster
* Cluster-wide settings
* Index metadata, including the mapping and settings for each index
* The location and status of every shard copy in the cluster
The elected master node ensures that every node in the cluster has a copy of
the same cluster state. The <<cluster-state,cluster state API>> lets you retrieve a
representation of this internal state for debugging or diagnostic purposes.
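For example, one way to retrieve selected parts of the cluster state (the metrics shown are illustrative):

[source,console]
------------------------
GET _cluster/state/version,master_node,nodes
------------------------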
[[cluster-state-publishing]]
=== Publishing the cluster state
==== Publishing the cluster state
The elected master node is the only node in a cluster that can make changes to
the cluster state. The elected master node processes one batch of cluster state
@ -58,3 +76,16 @@ speed of the storage on each master-eligible node, as well as the reliability
and latency of the network interconnections between all nodes in the cluster.
You must therefore ensure that the storage and networking available to the
nodes in your cluster are good enough to meet your performance goals.
[[dangling-index]]
==== Dangling indices
When a node joins the cluster, if it finds any shards stored in its local
data directory that do not already exist in the cluster state, it will consider
those shards to belong to a "dangling" index. You can list, import or
delete dangling indices using the <<dangling-indices-api,Dangling indices
API>>.
NOTE: The API cannot offer any guarantees as to whether the imported data
truly represents the latest state of the data when the index was still part
of the cluster.
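For example, a minimal sketch of listing dangling indices and then importing one (the `<index-uuid>` placeholder stands for a UUID copied from the list response):

[source,console]
------------------------
GET /_dangling

POST /_dangling/<index-uuid>?accept_data_loss=true
------------------------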

View file

@ -2,7 +2,7 @@
=== Voting configurations
Each {es} cluster has a _voting configuration_, which is the set of
<<master-node,master-eligible nodes>> whose responses are counted when making
<<master-node-role,master-eligible nodes>> whose responses are counted when making
decisions such as electing a new master or committing a new cluster state.
Decisions are made only after a majority (more than half) of the nodes in the
voting configuration respond.
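For example, one way to inspect the current voting configuration is to filter the cluster state:

[source,console]
------------------------
GET /_cluster/state?filter_path=metadata.cluster_coordination.last_committed_config
------------------------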

View file

@ -1,10 +1,11 @@
[[modules-gateway]]
=== Local gateway settings
[[dangling-indices]]
The local gateway stores the cluster state and shard data across full
cluster restarts.
The following _static_ settings, which must be set on every <<master-node,master-eligible node>>,
The following _static_ settings, which must be set on every <<master-node-role,master-eligible node>>,
control how long a freshly elected master should wait before it tries to
recover the <<cluster-state,cluster state>> and the cluster's data.
@ -36,17 +37,4 @@ These settings can be configured in `elasticsearch.yml` as follows:
gateway.expected_data_nodes: 3
gateway.recover_after_time: 600s
gateway.recover_after_data_nodes: 3
--------------------------------------------------
[[dangling-indices]]
==== Dangling indices
When a node joins the cluster, if it finds any shards stored in its local
data directory that do not already exist in the cluster, it will consider
those shards to belong to a "dangling" index. You can list, import or
delete dangling indices using the <<dangling-indices-api,Dangling indices
API>>.
NOTE: The API cannot offer any guarantees as to whether the imported data
truly represents the latest state of the data when the index was still part
of the cluster.
--------------------------------------------------

View file

@ -5,10 +5,6 @@ The field data cache contains <<fielddata-mapping-param, field data>> and <<eage
which are both used to support aggregations on certain field types.
Since these are on-heap data structures, it is important to monitor the cache's use.
[discrete]
[[fielddata-sizing]]
==== Cache size
The entries in the cache are expensive to build, so the default behavior is
to keep the cache loaded in memory. The default cache size is unlimited,
causing the cache to grow until it reaches the limit set by the <<fielddata-circuit-breaker, field data circuit breaker>>. This behavior can be configured.
@ -20,16 +16,12 @@ at the cost of rebuilding the cache as needed.
If the circuit breaker limit is reached, further requests that increase the cache
size will be prevented. In this case you should manually <<indices-clearcache, clear the cache>>.
TIP: You can monitor memory usage for field data as well as the field data circuit
breaker using
the <<cluster-nodes-stats,nodes stats API>> or the <<cat-fielddata,cat fielddata API>>.
`indices.fielddata.cache.size`::
(<<static-cluster-setting,Static>>)
The max size of the field data cache, e.g. `38%` of node heap space, or an
absolute value, e.g. `12GB`. Defaults to unbounded. If you choose to set it,
it should be smaller than the <<fielddata-circuit-breaker>> limit.
[discrete]
[[fielddata-monitoring]]
==== Monitoring field data
You can monitor memory usage for field data as well as the field data circuit
breaker using
the <<cluster-nodes-stats,nodes stats API>> or the <<cat-fielddata,cat fielddata API>>.
it should be smaller than the <<fielddata-circuit-breaker>> limit.
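For example, an illustrative `elasticsearch.yml` snippet (the value is arbitrary and should stay below the circuit breaker limit):

[source,yaml]
----
# Cap the field data cache at 30% of the node's heap (illustrative value)
indices.fielddata.cache.size: 30%
----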

View file

@ -1,4 +1,4 @@
[[shard-request-cache]]
[[shard-request-cache-settings]]
=== Shard request cache settings
When a search request is run against an index or against many indices, each
@ -10,139 +10,16 @@ The shard-level request cache module caches the local results on each shard.
This allows frequently used (and potentially heavy) search requests to return
results almost instantly. The requests cache is a very good fit for the logging
use case, where only the most recent index is being actively updated --
results from older indices will be served directly from the cache.
results from older indices will be served directly from the cache. You can use shard request cache settings to control the size and expiration of the cache.
[IMPORTANT]
===================================
By default, the requests cache will only cache the results of search requests
where `size=0`, so it will not cache `hits`,
but it will cache `hits.total`, <<search-aggregations,aggregations>>, and
<<search-suggesters,suggestions>>.
Most queries that use `now` (see <<date-math>>) cannot be cached.
Scripted queries that use API calls which are non-deterministic, such as
`Math.random()` or `new Date()`, are not cached.
===================================
[discrete]
==== Cache invalidation
The cache is smart -- it keeps the same _near real-time_ promise as uncached
search.
Cached results are invalidated automatically whenever the shard refreshes to
pick up changes to the documents or when you update the mapping. In other
words you will always get the same results from the cache as you would for an
uncached search request.
The longer the refresh interval, the longer that cached entries will remain
valid even if there are changes to the documents. If the cache is full, the
least recently used cache keys will be evicted.
The cache can be expired manually with the <<indices-clearcache,`clear-cache` API>>:
[source,console]
------------------------
POST /my-index-000001,my-index-000002/_cache/clear?request=true
------------------------
// TEST[s/^/PUT my-index-000001\nPUT my-index-000002\n/]
[discrete]
==== Enabling and disabling caching
The cache is enabled by default, but can be disabled when creating a new
index as follows:
[source,console]
-----------------------------
PUT /my-index-000001
{
"settings": {
"index.requests.cache.enable": false
}
}
-----------------------------
It can also be enabled or disabled dynamically on an existing index with the
<<indices-update-settings,`update-settings`>> API:
[source,console]
-----------------------------
PUT /my-index-000001/_settings
{ "index.requests.cache.enable": true }
-----------------------------
// TEST[continued]
[discrete]
==== Enabling and disabling caching per request
The `request_cache` query-string parameter can be used to enable or disable
caching on a *per-request* basis. If set, it overrides the index-level setting:
[source,console]
-----------------------------
GET /my-index-000001/_search?request_cache=true
{
"size": 0,
"aggs": {
"popular_colors": {
"terms": {
"field": "colors"
}
}
}
}
-----------------------------
// TEST[continued]
Requests where `size` is greater than 0 will not be cached even if the request cache is
enabled in the index settings. To cache these requests you will need to use the
query-string parameter detailed here.
[discrete]
==== Cache key
A hash of the whole JSON body is used as the cache key. This means that if the JSON
changes -- for instance if keys are output in a different order -- then the
cache key will not be recognised.
TIP: Most JSON libraries support a _canonical_ mode which ensures that JSON
keys are always emitted in the same order. This canonical mode can be used in
the application to ensure that a request is always serialized in the same way.
To learn more about the shard request cache, see <<shard-request-cache>>.
[discrete]
==== Cache settings
The cache is managed at the node level, and has a default maximum size of `1%`
of the heap. This can be changed in the `config/elasticsearch.yml` file with:
`indices.requests.cache.size`::
(<<static-cluster-setting,Static>>) The maximum size of the cache, as a percentage of the heap. Default: `1%`.
[source,yaml]
--------------------------------
indices.requests.cache.size: 2%
--------------------------------
`indices.requests.cache.expire`::
(<<static-cluster-setting,Static>>) The TTL for cached results. Stale results are automatically invalidated when the index is refreshed, so you shouldn't need to use this setting.
Also, you can use the +indices.requests.cache.expire+ setting to specify a TTL
for cached results, but there should be no reason to do so. Remember that
stale results are automatically invalidated when the index is refreshed. This
setting is provided for completeness' sake only.
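For example, an illustrative `elasticsearch.yml` snippet that sets both values (the expiry is rarely needed, as noted above):

[source,yaml]
----
# Illustrative values: grow the cache to 2% of the heap and expire entries after one hour
indices.requests.cache.size: 2%
indices.requests.cache.expire: 1h
----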
[discrete]
==== Monitoring cache usage
The size of the cache (in bytes) and the number of evictions can be viewed
by index, with the <<indices-stats,`indices-stats`>> API:
[source,console]
------------------------
GET /_stats/request_cache?human
------------------------
or by node with the <<cluster-nodes-stats,`nodes-stats`>> API:
[source,console]
------------------------
GET /_nodes/stats/indices/request_cache?human
------------------------

View file

@ -286,3 +286,22 @@ include::remote-cluster-network.asciidoc[]
include::network/tracers.asciidoc[]
include::network/threading.asciidoc[]
[[readiness-tcp-port]]
==== TCP readiness port
preview::[]
If configured, a node can open a TCP port when the node is in a ready state. A node is deemed
ready when it has successfully joined a cluster. In a single-node configuration, the node is
considered ready when it is able to accept requests.
To enable the readiness TCP port, use the `readiness.port` setting. The readiness service will bind to
all host addresses.
If the node leaves the cluster, or the <<put-shutdown,Shutdown API>> is used to mark the node
for shutdown, the readiness port is immediately closed.
A successful connection to the readiness TCP port signals that the {es} node is ready. When a client
connects to the readiness port, the server simply terminates the socket connection. No data is sent back
to the client. If a client cannot connect to the readiness port, the node is not ready.
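For example, an illustrative `elasticsearch.yml` snippet (the port number is arbitrary):

[source,yaml]
----
# Open an arbitrary TCP port once the node is ready
readiness.port: 9399
----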

View file

@ -1,5 +1,5 @@
[[modules-node]]
=== Nodes
=== Node settings
Any time that you start an instance of {es}, you are starting a _node_. A
collection of connected nodes is called a <<modules-cluster,cluster>>. If you
@ -18,24 +18,33 @@ TIP: The performance of an {es} node is often limited by the performance of the
Review our recommendations for optimizing your storage for <<indexing-use-faster-hardware,indexing>> and
<<search-use-faster-hardware,search>>.
[[node-name-settings]]
==== Node name setting
include::{es-ref-dir}/setup/important-settings/node-name.asciidoc[]
[[node-roles]]
==== Node roles
==== Node role settings
You define a node's roles by setting `node.roles` in `elasticsearch.yml`. If you
set `node.roles`, the node is only assigned the roles you specify. If you don't
set `node.roles`, the node is assigned the following roles:
* `master`
* `data`
* [[master-node]]`master`
* [[data-node]]`data`
* `data_content`
* `data_hot`
* `data_warm`
* `data_cold`
* `data_frozen`
* `ingest`
* `ml`
* [[ml-node]]`ml`
* `remote_cluster_client`
* `transform`
* [[transform-node]]`transform`
The following additional roles are available:
* `voting_only`
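For example, an illustrative `elasticsearch.yml` snippet that limits a node to an explicit set of roles (the combination shown is only an example):

[source,yaml]
----
# This node is master-eligible and holds hot and content tier data, nothing else
node.roles: [ master, data_hot, data_content ]
----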
[IMPORTANT]
====
@ -65,386 +74,7 @@ As the cluster grows and in particular if you have large {ml} jobs or
{ctransforms}, consider separating dedicated master-eligible nodes from
dedicated data nodes, {ml} nodes, and {transform} nodes.
<<master-node,Master-eligible node>>::
A node that has the `master` role, which makes it eligible to be
<<modules-discovery,elected as the _master_ node>>, which controls the cluster.
<<data-node,Data node>>::
A node that has one of several data roles. Data nodes hold data and perform data
related operations such as CRUD, search, and aggregations. A node with a generic `data` role can fill any of the specialized data node roles.
<<node-ingest-node,Ingest node>>::
A node that has the `ingest` role. Ingest nodes are able to apply an
<<ingest,ingest pipeline>> to a document in order to transform and enrich the
document before indexing. With a heavy ingest load, it makes sense to use
dedicated ingest nodes and to exclude the `ingest` role from nodes that have
the `master` or `data` roles.
<<remote-node,Remote-eligible node>>::
A node that has the `remote_cluster_client` role, which makes it eligible to act
as a remote client.
<<ml-node,Machine learning node>>::
A node that has the `ml` role. If you want to use {ml-features}, there must be
at least one {ml} node in your cluster. For more information, see
<<ml-settings>> and {ml-docs}/index.html[Machine learning in the {stack}].
<<transform-node,{transform-cap} node>>::
A node that has the `transform` role. If you want to use {transforms}, there
must be at least one {transform} node in your cluster. For more information, see
<<transform-settings>> and <<transforms>>.
[NOTE]
[[coordinating-node]]
.Coordinating node
===============================================
Requests like search requests or bulk-indexing requests may involve data held
on different data nodes. A search request, for example, is executed in two
phases which are coordinated by the node which receives the client request --
the _coordinating node_.
In the _scatter_ phase, the coordinating node forwards the request to the data
nodes which hold the data. Each data node executes the request locally and
returns its results to the coordinating node. In the _gather_ phase, the
coordinating node reduces each data node's results into a single global
result set.
Every node is implicitly a coordinating node, and this behavior cannot be disabled.
A node that has an explicitly empty list of roles via `node.roles` will act only
as a coordinating node. As a result, such a node needs to have enough
memory and CPU in order to deal with the gather phase.
===============================================
[[master-node]]
==== Master-eligible node
The master node is responsible for lightweight cluster-wide actions such as
creating or deleting an index, tracking which nodes are part of the cluster,
and deciding which shards to allocate to which nodes. It is important for
cluster health to have a stable master node.
Any master-eligible node that is not a <<voting-only-node,voting-only node>> may
be elected to become the master node by the <<modules-discovery,master election
process>>.
IMPORTANT: Master nodes must have a `path.data` directory whose contents
persist across restarts, just like data nodes, because this is where the
cluster metadata is stored. The cluster metadata describes how to read the data
stored on the data nodes, so if it is lost then the data stored on the data
nodes cannot be read.
[[dedicated-master-node]]
===== Dedicated master-eligible node
It is important for the health of the cluster that the elected master node has
the resources it needs to fulfill its responsibilities. If the elected master
node is overloaded with other tasks then the cluster will not operate well. The
most reliable way to avoid overloading the master with other tasks is to
configure all the master-eligible nodes to be _dedicated master-eligible nodes_
which only have the `master` role, allowing them to focus on managing the
cluster. Master-eligible nodes will still also behave as
<<coordinating-node,coordinating nodes>> that route requests from clients to
the other nodes in the cluster, but you should _not_ use dedicated master nodes
for this purpose.
A small or lightly-loaded cluster may operate well if its master-eligible nodes
have other roles and responsibilities, but once your cluster comprises more
than a handful of nodes it usually makes sense to use dedicated master-eligible
nodes.
To create a dedicated master-eligible node, set:
[source,yaml]
-------------------
node.roles: [ master ]
-------------------
[[voting-only-node]]
===== Voting-only master-eligible node
A voting-only master-eligible node is a node that participates in
<<modules-discovery,master elections>> but which will not act as the cluster's
elected master node. In particular, a voting-only node can serve as a tiebreaker
in elections.
It may seem confusing to use the term "master-eligible" to describe a
voting-only node since such a node is not actually eligible to become the master
at all. This terminology is an unfortunate consequence of history:
master-eligible nodes are those nodes that participate in elections and perform
certain tasks during cluster state publications, and voting-only nodes have the
same responsibilities even if they can never become the elected master.
To configure a master-eligible node as a voting-only node, include `master` and
`voting_only` in the list of roles. For example to create a voting-only data
node:
[source,yaml]
-------------------
node.roles: [ data, master, voting_only ]
-------------------
IMPORTANT: Only nodes with the `master` role can be marked as having the
`voting_only` role.
High availability (HA) clusters require at least three master-eligible nodes, at
least two of which are not voting-only nodes. Such a cluster will be able to
elect a master node even if one of the nodes fails.
Voting-only master-eligible nodes may also fill other roles in your cluster.
For instance, a node may be both a data node and a voting-only master-eligible
node. A _dedicated_ voting-only master-eligible node is a voting-only
master-eligible node that fills no other roles in the cluster. To create a
dedicated voting-only master-eligible node, set:
[source,yaml]
-------------------
node.roles: [ master, voting_only ]
-------------------
Since dedicated voting-only nodes never act as the cluster's elected master,
they may require less heap and a less powerful CPU than the true master nodes.
However all master-eligible nodes, including voting-only nodes, are on the
critical path for <<cluster-state-publishing,publishing cluster state
updates>>. Cluster state updates are usually independent of
performance-critical workloads such as indexing or searches, but they are
involved in management activities such as index creation and rollover, mapping
updates, and recovery after a failure. The performance characteristics of these
activities are a function of the speed of the storage on each master-eligible
node, as well as the reliability and latency of the network interconnections
between the elected master node and the other nodes in the cluster. You must
therefore ensure that the storage and networking available to the nodes in your
cluster are good enough to meet your performance goals.
[[data-node]]
==== Data nodes
Data nodes hold the shards that contain the documents you have indexed. Data
nodes handle data related operations like CRUD, search, and aggregations.
These operations are I/O-, memory-, and CPU-intensive. It is important to
monitor these resources and to add more data nodes if they are overloaded.
The main benefit of having dedicated data nodes is the separation of the master
and data roles.
In a multi-tier deployment architecture, you use specialized data roles to
assign data nodes to specific tiers: `data_content`, `data_hot`, `data_warm`,
`data_cold`, or `data_frozen`. A node can belong to multiple tiers.
If you want to include a node in all tiers, or if your cluster does not use multiple tiers, then you can use the generic `data` role.
include::../how-to/shard-limits.asciidoc[]
WARNING: If you assign a node to a specific tier using a specialized data role, then you shouldn't also assign it the generic `data` role. The generic `data` role takes precedence over specialized data roles.
[[generic-data-node]]
===== Generic data node
Generic data nodes are included in all content tiers.
To create a dedicated generic data node, set:
[source,yaml]
----
node.roles: [ data ]
----
[[data-content-node]]
===== Content data node
Content data nodes are part of the content tier.
include::{es-ref-dir}/datatiers.asciidoc[tag=content-tier]
To create a dedicated content node, set:
[source,yaml]
----
node.roles: [ data_content ]
----
[[data-hot-node]]
===== Hot data node
Hot data nodes are part of the hot tier.
include::{es-ref-dir}/datatiers.asciidoc[tag=hot-tier]
To create a dedicated hot node, set:
[source,yaml]
----
node.roles: [ data_hot ]
----
[[data-warm-node]]
===== Warm data node
Warm data nodes are part of the warm tier.
include::{es-ref-dir}/datatiers.asciidoc[tag=warm-tier]
To create a dedicated warm node, set:
[source,yaml]
----
node.roles: [ data_warm ]
----
[[data-cold-node]]
===== Cold data node
Cold data nodes are part of the cold tier.
include::{es-ref-dir}/datatiers.asciidoc[tag=cold-tier]
To create a dedicated cold node, set:
[source,yaml]
----
node.roles: [ data_cold ]
----
[[data-frozen-node]]
===== Frozen data node
Frozen data nodes are part of the frozen tier.
include::{es-ref-dir}/datatiers.asciidoc[tag=frozen-tier]
To create a dedicated frozen node, set:
[source,yaml]
----
node.roles: [ data_frozen ]
----
[[node-ingest-node]]
==== Ingest node
Ingest nodes can execute pre-processing pipelines, composed of one or more
ingest processors. Depending on the type of operations performed by the ingest
processors and the required resources, it may make sense to have dedicated
ingest nodes, that will only perform this specific task.
To create a dedicated ingest node, set:
[source,yaml]
----
node.roles: [ ingest ]
----
[[coordinating-only-node]]
==== Coordinating only node
If you take away the ability to handle master duties, to hold data, and to
pre-process documents, then you are left with a _coordinating_ node that
can only route requests, handle the search reduce phase, and distribute bulk
indexing. Essentially, coordinating only nodes behave as smart load balancers.
Coordinating only nodes can benefit large clusters by offloading the
coordinating node role from data and master-eligible nodes. They join the
cluster and receive the full <<cluster-state,cluster state>>, like every other
node, and they use the cluster state to route requests directly to the
appropriate place(s).
WARNING: Adding too many coordinating only nodes to a cluster can increase the
burden on the entire cluster because the elected master node must await
acknowledgement of cluster state updates from every node! The benefit of
coordinating only nodes should not be overstated -- data nodes can happily
serve the same purpose.
To create a dedicated coordinating node, set:
[source,yaml]
----
node.roles: [ ]
----
[[remote-node]]
==== Remote-eligible node
A remote-eligible node acts as a cross-cluster client and connects to
<<remote-clusters,remote clusters>>. Once connected, you can search
remote clusters using <<modules-cross-cluster-search,{ccs}>>. You can also sync
data between clusters using <<xpack-ccr,{ccr}>>.
[source,yaml]
----
node.roles: [ remote_cluster_client ]
----
[[ml-node]]
==== [xpack]#Machine learning node#
{ml-cap} nodes run jobs and handle {ml} API requests. For more information, see
<<ml-settings>>.
To create a dedicated {ml} node, set:
[source,yaml]
----
node.roles: [ ml, remote_cluster_client ]
----
The `remote_cluster_client` role is optional but strongly recommended.
Otherwise, {ccs} fails when used in {ml} jobs or {dfeeds}. If you use {ccs} in
your {anomaly-jobs}, the `remote_cluster_client` role is also required on all
master-eligible nodes. Otherwise, the {dfeed} cannot start. See <<remote-node>>.
[[transform-node]]
==== [xpack]#{transform-cap} node#
{transform-cap} nodes run {transforms} and handle {transform} API requests. For
more information, see <<transform-settings>>.
To create a dedicated {transform} node, set:
[source,yaml]
----
node.roles: [ transform, remote_cluster_client ]
----
The `remote_cluster_client` role is optional but strongly recommended.
Otherwise, {ccs} fails when used in {transforms}. See <<remote-node>>.
[[change-node-role]]
==== Changing the role of a node
Each data node maintains the following data on disk:
* the shard data for every shard allocated to that node,
* the index metadata corresponding with every shard allocated to that node, and
* the cluster-wide metadata, such as settings and index templates.
Similarly, each master-eligible node maintains the following data on disk:
* the index metadata for every index in the cluster, and
* the cluster-wide metadata, such as settings and index templates.
Each node checks the contents of its data path at startup. If it discovers
unexpected data then it will refuse to start. This is to avoid importing
unwanted <<dangling-indices,dangling indices>> which can lead
to a red cluster health. To be more precise, nodes without the `data` role will
refuse to start if they find any shard data on disk at startup, and nodes
without both the `master` and `data` roles will refuse to start if they have any
index metadata on disk at startup.
It is possible to change the roles of a node by adjusting its
`elasticsearch.yml` file and restarting it. This is known as _repurposing_ a
node. In order to satisfy the checks for unexpected data described above, you
must perform some extra steps to prepare a node for repurposing when starting
the node without the `data` or `master` roles.
* If you want to repurpose a data node by removing the `data` role then you
should first use an <<cluster-shard-allocation-filtering,allocation filter>> to safely
migrate all the shard data onto other nodes in the cluster.
* If you want to repurpose a node to have neither the `data` nor `master` roles
then it is simplest to start a brand-new node with an empty data path and the
desired roles. You may find it safest to use an
<<cluster-shard-allocation-filtering,allocation filter>> to migrate the shard data elsewhere
in the cluster first.
If it is not possible to follow these extra steps then you may be able to use
the <<node-tool-repurpose,`elasticsearch-node repurpose`>> tool to delete any
excess data that prevents a node from starting.
To learn more about the available node roles, see <<node-roles-overview>>.
[discrete]
=== Node data path settings
@ -495,6 +125,25 @@ modify the contents of the data directory. The data directory contains no
executables so a virus scan will only find false positives.
// end::modules-node-data-path-warning-tag[]
[[custom-node-attributes]]
==== Custom node attributes
If needed, you can add custom attributes to a node. These attributes can be used to <<cluster-routing-settings,filter which nodes a shard can be allocated to>>, or to group nodes together for <<shard-allocation-awareness,shard allocation awareness>>.
[TIP]
===============================================
You can also set a node attribute using the `-E` command line argument when you start a node:
[source,sh]
--------------------------------------------------------
./bin/elasticsearch -Enode.attr.rack_id=rack_one
--------------------------------------------------------
===============================================
`node.attr.<attribute-name>`::
(<<dynamic-cluster-setting,Dynamic>>)
A custom attribute that you can assign to a node. For example, you might assign a `rack_id` attribute to each node to ensure that primary and replica shards are not allocated on the same rack. You can specify multiple attributes as a comma-separated list.
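For example, an illustrative `elasticsearch.yml` snippet (the attribute name and value are examples):

[source,yaml]
----
# Tag this node with the rack it lives in
node.attr.rack_id: rack_one
----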
[discrete]
[[other-node-settings]]
=== Other node settings
@ -504,4 +153,4 @@ including:
* <<cluster-name,`cluster.name`>>
* <<node-name,`node.name`>>
* <<modules-network,network settings>>
* <<modules-network,network settings>>

View file

@ -80,7 +80,7 @@ The _gateway nodes_ selection depends on the following criteria:
+
* *version*: Remote nodes must be compatible with the cluster they are
registered to.
* *role*: By default, any non-<<master-node,master-eligible>> node can act as a
* *role*: By default, any non-<<master-node-role,master-eligible>> node can act as a
gateway node. Dedicated master nodes are never selected as gateway nodes.
* *attributes*: You can define the gateway nodes for a cluster by setting
<<cluster-remote-node-attr,`cluster.remote.node.attr.gateway`>> to `true`.

View file

@ -25,7 +25,7 @@ By default, the primary and replica shard copies for an index can be allocated t
You can control how shard copies are allocated using the following settings:
- <<modules-cluster,Cluster-level shard allocation settings>>: Use these settings to control how shard copies are allocated and balanced across the entire cluster. For example, you might want to allocate nodes availability zones, or prevent certain nodes from being used so you can perform maintenance.
- <<modules-cluster,Cluster-level shard allocation settings>>: Use these settings to control how shard copies are allocated and balanced across the entire cluster. For example, you might want to <<shard-allocation-awareness,allocate shards across availability zones>>, or prevent certain nodes from being used so you can perform maintenance.
- <<index-modules-allocation,Index-level shard allocation settings>>: Use these settings to control how the shard copies for a specific index are allocated. For example, you might want to allocate an index to a node in a specific data tier, or to a node with specific attributes.
@ -80,4 +80,4 @@ When a shard copy is relocated, it is created as a new shard copy on the target
You can control how and when shard copies are relocated. For example, you can adjust the rebalancing settings that control when shard copies are relocated to balance the cluster, or the high watermark for disk-based shard allocation that can trigger relocation. These settings are part of the <<modules-cluster,cluster-level shard allocation settings>>.
Shard relocation operations also respect shard allocation and recovery settings.
Shard relocation operations also respect shard allocation and recovery settings.
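For example, one such adjustment is to restrict rebalancing to primary shards only (the value shown is illustrative):

[source,console]
------------------------
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.rebalance.enable": "primaries"
  }
}
------------------------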