[DOCS] Add API example + diagrams to shard allocation awareness docs (#108390)

shainaraskas 2024-05-08 12:52:50 -04:00 committed by GitHub
parent 616e71963e
commit 9d9f23ca96
4 changed files with 38 additions and 17 deletions


@@ -7,14 +7,14 @@ nodes to take over their responsibilities, an {es} cluster can continue
operating normally if some of its nodes are unavailable or disconnected.
There is a limit to how small a resilient cluster can be. All {es} clusters
require:
require the following components to function:
- One <<modules-discovery-quorums,elected master node>> node
- At least one node for each <<modules-node,role>>.
- At least one copy of every <<scalability,shard>>.
- One <<modules-discovery-quorums,elected master node>>
- At least one node for each <<modules-node,role>>
- At least one copy of every <<scalability,shard>>
A resilient cluster requires redundancy for every required cluster component.
This means a resilient cluster must have:
This means a resilient cluster must have the following components:
- At least three master-eligible nodes
- At least two nodes of each role
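
As an illustrative check (one possible approach, not the only one), you can list each node's roles and see which node is currently the elected master:

[source,console]
--------------------------------------------------
GET _cat/nodes?v=true&h=name,node.role,master
--------------------------------------------------

The `master` column shows `*` for the elected master node.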
@@ -375,11 +375,11 @@ The cluster will be resilient to the loss of any zone as long as:
- There are at least two zones containing data nodes.
- Every index that is not a <<searchable-snapshots,searchable snapshot index>>
has at least one replica of each shard, in addition to the primary.
- Shard allocation awareness is configured to avoid concentrating all copies of
a shard within a single zone.
- <<shard-allocation-awareness,Shard allocation awareness>> is configured to
avoid concentrating all copies of a shard within a single zone.
- The cluster has at least three master-eligible nodes. At least two of these
nodes are not voting-only master-eligible nodes, and they are spread evenly
across at least three zones.
nodes are not <<voting-only-node,voting-only master-eligible nodes>>,
and they are spread evenly across at least three zones.
- Clients are configured to send their requests to nodes in more than one zone
or are configured to use a load balancer that balances the requests across an
appropriate set of nodes. The {ess-trial}[Elastic Cloud] service provides such

Binary file not shown (new image, 25 KiB)

Binary file not shown (new image, 42 KiB)


@@ -5,7 +5,7 @@ You can use custom node attributes as _awareness attributes_ to enable {es}
to take your physical hardware configuration into account when allocating shards.
If {es} knows which nodes are on the same physical server, in the same rack, or
in the same zone, it can distribute the primary shard and its replica shards to
minimise the risk of losing all shard copies in the event of a failure.
minimize the risk of losing all shard copies in the event of a failure.
When shard allocation awareness is enabled with the
<<dynamic-cluster-setting,dynamic>>
@@ -19,22 +19,27 @@ allocated in each location. If the number of nodes in each location is
unbalanced and there are a lot of replicas, replica shards might be left
unassigned.
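
If replica shards are left unassigned, one way to investigate is the cluster allocation explain API; the index name and shard number below are placeholders for your own values:

[source,console]
--------------------------------------------------
GET _cluster/allocation/explain
{
  "index": "my-index",
  "shard": 0,
  "primary": false
}
--------------------------------------------------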
TIP: Learn more about <<high-availability-cluster-design-large-clusters,designing resilient clusters>>.
[[enabling-awareness]]
===== Enabling shard allocation awareness
To enable shard allocation awareness:
. Specify the location of each node with a custom node attribute. For example,
if you want Elasticsearch to distribute shards across different racks, you might
set an awareness attribute called `rack_id` in each node's `elasticsearch.yml`
config file.
. Specify the location of each node with a custom node attribute. For example,
if you want Elasticsearch to distribute shards across different racks, you might
use an awareness attribute called `rack_id`.
+
You can set custom attributes in two ways:
- By editing the `elasticsearch.yml` config file:
+
[source,yaml]
--------------------------------------------------------
node.attr.rack_id: rack_one
--------------------------------------------------------
+
You can also set custom attributes when you start a node:
- Using the `-E` command line argument when you start a node:
+
[source,sh]
--------------------------------------------------------
@@ -56,17 +61,33 @@ cluster.routing.allocation.awareness.attributes: rack_id <1>
+
You can also use the
<<cluster-update-settings,cluster-update-settings>> API to set or update
a cluster's awareness attributes.
a cluster's awareness attributes:
+
[source,console]
--------------------------------------------------
PUT /_cluster/settings
{
"persistent" : {
"cluster.routing.allocation.awareness.attributes" : "rack_id"
}
}
--------------------------------------------------
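
One way to confirm the value was applied is to read the cluster settings back; `filter_path` here simply trims the response to the persistent settings:

[source,console]
--------------------------------------------------
GET _cluster/settings?filter_path=persistent
--------------------------------------------------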
With this example configuration, if you start two nodes with
`node.attr.rack_id` set to `rack_one` and create an index with 5 primary
shards and 1 replica of each primary, all primaries and replicas are
allocated across the two nodes.
.All primaries and replicas allocated across two nodes in the same rack
image::images/shard-allocation/shard-allocation-awareness-one-rack.png[All primaries and replicas are allocated across two nodes in the same rack]
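
An index matching this scenario could be created as follows; the index name and explicit settings here are illustrative placeholders:

[source,console]
--------------------------------------------------
PUT /my-index
{
  "settings": {
    "index.number_of_shards": 5,
    "index.number_of_replicas": 1
  }
}
--------------------------------------------------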
If you add two nodes with `node.attr.rack_id` set to `rack_two`,
{es} moves shards to the new nodes, ensuring (if possible)
that no two copies of the same shard are in the same rack.
.Primaries and replicas allocated across four nodes in two racks, with no two copies of the same shard in the same rack
image::images/shard-allocation/shard-allocation-awareness-two-racks.png[Primaries and replicas are allocated across four nodes in two racks with no two copies of the same shard in the same rack]
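
To see where each shard copy landed after this rebalancing, one option is the cat shards API (again using the placeholder index name from the example above):

[source,console]
--------------------------------------------------
GET _cat/shards/my-index?v=true&h=index,shard,prirep,state,node
--------------------------------------------------

The `prirep` column distinguishes primaries (`p`) from replicas (`r`).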
If `rack_two` fails and takes down both its nodes, by default {es}
allocates the lost shard copies to nodes in `rack_one`. To prevent multiple
copies of a particular shard from being allocated in the same location, you can