diff --git a/docs/reference/images/troubleshooting/disk/autoscaling_banner.png b/docs/reference/images/troubleshooting/disk/autoscaling_banner.png new file mode 100644 index 000000000000..cffe32385107 Binary files /dev/null and b/docs/reference/images/troubleshooting/disk/autoscaling_banner.png differ diff --git a/docs/reference/images/troubleshooting/disk/autoscaling_limits_banner.png b/docs/reference/images/troubleshooting/disk/autoscaling_limits_banner.png new file mode 100644 index 000000000000..6eabffe81abb Binary files /dev/null and b/docs/reference/images/troubleshooting/disk/autoscaling_limits_banner.png differ diff --git a/docs/reference/images/troubleshooting/disk/enable_autoscaling.png b/docs/reference/images/troubleshooting/disk/enable_autoscaling.png new file mode 100644 index 000000000000..2a31cfda4b66 Binary files /dev/null and b/docs/reference/images/troubleshooting/disk/enable_autoscaling.png differ diff --git a/docs/reference/images/troubleshooting/disk/increase-disk-capacity-master-node.png b/docs/reference/images/troubleshooting/disk/increase-disk-capacity-master-node.png new file mode 100644 index 000000000000..5ebbd4e93e65 Binary files /dev/null and b/docs/reference/images/troubleshooting/disk/increase-disk-capacity-master-node.png differ diff --git a/docs/reference/images/troubleshooting/disk/increase-disk-capacity-other-node.png b/docs/reference/images/troubleshooting/disk/increase-disk-capacity-other-node.png new file mode 100644 index 000000000000..e6e98bee45e3 Binary files /dev/null and b/docs/reference/images/troubleshooting/disk/increase-disk-capacity-other-node.png differ diff --git a/docs/reference/images/troubleshooting/disk/reached_autoscaling_limits.png b/docs/reference/images/troubleshooting/disk/reached_autoscaling_limits.png new file mode 100644 index 000000000000..28dde50d0ad6 Binary files /dev/null and b/docs/reference/images/troubleshooting/disk/reached_autoscaling_limits.png differ diff --git 
a/docs/reference/images/troubleshooting/disk/reduce_replicas.png b/docs/reference/images/troubleshooting/disk/reduce_replicas.png new file mode 100644 index 000000000000..ae3d49024437 Binary files /dev/null and b/docs/reference/images/troubleshooting/disk/reduce_replicas.png differ diff --git a/docs/reference/settings/health-diagnostic-settings.asciidoc b/docs/reference/settings/health-diagnostic-settings.asciidoc index ec10157406b9..690b39987b74 100644 --- a/docs/reference/settings/health-diagnostic-settings.asciidoc +++ b/docs/reference/settings/health-diagnostic-settings.asciidoc @@ -16,7 +16,7 @@ is not recommended to change any of these from their default values. a master at all, before moving on with other checks. Defaults to `30s` (30 seconds). `master_history.max_age`:: -(<>) The timeframe we record the master history +(<>) The timeframe we record the master history to be used for diagnosing the cluster health. Master node changes older than this time will not be considered when diagnosing the cluster health. Defaults to `30m` (30 minutes). @@ -27,3 +27,11 @@ Defaults to `4`. `health.master_history.no_master_transitions_threshold`:: (<>) The number of transitions to no master witnessed by a node that indicates the cluster is not healthy. Defaults to `4`. + +`health.node.enabled`:: +(<>) Enables the health node, which allows the health API to provide indications about +cluster wide health aspects such as disk space. + +`health.reporting.local.monitor.interval`:: +(<>) Determines the interval in which each node of the cluster monitors aspects that +comprise its local health such as its disk usage. 
diff --git a/docs/reference/tab-widgets/troubleshooting/disk/decrease-data-node-disk-usage-widget.asciidoc b/docs/reference/tab-widgets/troubleshooting/disk/decrease-data-node-disk-usage-widget.asciidoc new file mode 100644 index 000000000000..28ebe81a3906 --- /dev/null +++ b/docs/reference/tab-widgets/troubleshooting/disk/decrease-data-node-disk-usage-widget.asciidoc @@ -0,0 +1,40 @@ +++++ +
+
+ + +
+
+++++ + +include::decrease-data-node-disk-usage.asciidoc[tag=cloud] + +++++ +
+ +
+++++ diff --git a/docs/reference/tab-widgets/troubleshooting/disk/decrease-data-node-disk-usage.asciidoc b/docs/reference/tab-widgets/troubleshooting/disk/decrease-data-node-disk-usage.asciidoc new file mode 100644 index 000000000000..c8fb00cc708e --- /dev/null +++ b/docs/reference/tab-widgets/troubleshooting/disk/decrease-data-node-disk-usage.asciidoc @@ -0,0 +1,140 @@ +// tag::cloud[] +**Use {kib}** + +//tag::kibana-api-ex[] +. Log in to the {ess-console}[{ecloud} console]. ++ + +. On the **Elasticsearch Service** panel, click the name of your deployment. ++ + +NOTE: If the name of your deployment is disabled, your {kib} instances might be +unhealthy, in which case please contact https://support.elastic.co[Elastic Support]. +If your deployment doesn't include {kib}, all you need to do is +{cloud}/ec-access-kibana.html[enable it first]. ++ +. Open your deployment's side navigation menu (placed under the Elastic logo in the upper left corner) +and go to **Stack Management > Index Management**. + +. In the list of all your indices, click the `Replicas` column twice to sort the indices by their number of +replicas, starting with the one that has the most. Go through the indices one by one and pick those that are +least important and have the highest number of replicas. ++ +WARNING: Reducing the replicas of an index can potentially reduce search throughput and data redundancy. ++ +. For each index you chose, click its name, then on the panel that appears click `Edit settings`, reduce the +value of `index.number_of_replicas` to the desired value, and then click `Save`. ++ +[role="screenshot"] +image::images/troubleshooting/disk/reduce_replicas.png[Reducing replicas,align="center"] ++ +. Continue this process until the cluster is healthy again. + +// end::cloud[] + +// tag::self-managed[] +In order to estimate how many replicas need to be removed, you first need to estimate the amount of disk space that +needs to be released. + +. 
First, retrieve the relevant disk thresholds that will indicate how much space should be released. The +relevant thresholds are the <> for all the tiers apart from the frozen +one and the <> for the frozen tier. The following +example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark: ++ +[source,console] +---- +GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* +---- ++ +The response will look like this: ++ +[source,console-result] +---- +{ + "defaults": { + "cluster": { + "routing": { + "allocation": { + "disk": { + "watermark": { + "high": "90%", + "high.max_headroom": "150GB" + } + } + } + } + } + } +} +---- +// TEST[skip:illustration purposes only] ++ +The above means that in order to resolve the disk shortage we need to either drop our disk usage below 90% or have +more than 150GB available. Read more on how this threshold works <>. + +. The next step is to find out the current disk usage; this will indicate how much space should be freed. For simplicity, +our example has one node, but you can apply the same to every node over the relevant threshold. ++ +[source,console] +---- +GET _cat/allocation?v&s=disk.avail&h=node,disk.percent,disk.avail,disk.total,disk.used,disk.indices,shards +---- ++ +The response will look like this: ++ +[source,console-result] +---- +node disk.percent disk.avail disk.total disk.used disk.indices shards +instance-0000000000 91 4.6gb 35gb 31.1gb 29.9gb 111 +---- +// TEST[skip:illustration purposes only] + +. The high watermark configuration indicates that the disk usage needs to drop below 90%. Consider allowing some +padding, so the node will not go over the threshold in the near future. In this example, let's release approximately 7GB. + +. The next step is to list all the indices and choose which replicas to reduce. ++ +NOTE: The following command orders the indices with descending number of replicas and primary store size. 
We do this to +help you choose which replicas to reduce, under the assumption that the more replicas an index has, the smaller the risk of +removing a copy, and the bigger the replica, the more space will be released. This does not take into consideration any +functional requirements, so please treat it as a mere suggestion. ++ +[source,console] +---- +GET _cat/indices?v&s=rep:desc,pri.store.size:desc&h=health,index,pri,rep,store.size,pri.store.size +---- ++ +The response will look like this: ++ +[source,console-result] +---- +health index pri rep store.size pri.store.size +green my_index 2 3 9.9gb 3.3gb +green my_other_index 2 3 1.8gb 470.3mb +green search-products 2 3 278.5kb 69.6kb +green logs-000001 1 0 7.7gb 7.7gb +---- +// TEST[skip:illustration purposes only] ++ +. In the list above we see that if we reduce the replicas of the indices `my_index` and `my_other_index` to 1, we will +release the required disk space. It is not necessary to reduce the replicas of `search-products`, and `logs-000001` does +not have any replicas anyway. Reduce the replicas of one or more indices with the <>: ++ +WARNING: Reducing the replicas of an index can potentially reduce search throughput and data redundancy. ++ +[source,console] +---- +PUT my_index,my_other_index/_settings +{ + "index.number_of_replicas": 1 +} +---- +// TEST[skip:illustration purposes only] +// end::self-managed[] + +IMPORTANT: After reducing the replicas, please check that there are still enough replicas to meet your search +performance and reliability requirements. If not, at your earliest convenience (i) consider using +<> to manage the +retention of your time series data more efficiently, or (ii) reduce the amount of data you have by disabling the `_source` or removing +less important data, or (iii) increase your disk capacity. 
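The estimate in the self-managed steps above can be sketched as a short calculation. This is only a rough sketch: the helper names and the 70% target are assumptions made for illustration, and it assumes each replica copy takes about as much disk as the primaries (`pri.store.size`). The input numbers come from the example responses (`disk.percent` 91, `disk.total` 35gb, and the `my_index` and `my_other_index` rows).

```python
def space_to_free_gb(disk_total_gb: float, used_percent: float, target_percent: float) -> float:
    """GB to release so that usage drops from used_percent to target_percent."""
    excess_percent = max(used_percent - target_percent, 0.0)
    return disk_total_gb * excess_percent / 100.0


def replica_savings_gb(pri_store_size_gb: float, current_replicas: int, new_replicas: int) -> float:
    """Approximate GB released across the cluster by lowering the replica count.

    Assumption: every replica copy is roughly as large as the primaries.
    """
    removed_copies = max(current_replicas - new_replicas, 0)
    return pri_store_size_gb * removed_copies


# Node from the example: 91% used of a 35gb disk; the 70% target is hypothetical padding.
needed_gb = space_to_free_gb(disk_total_gb=35.0, used_percent=91.0, target_percent=70.0)

# my_index (3.3gb primaries) and my_other_index (470.3mb primaries), both going from 3 replicas to 1.
freed_gb = replica_savings_gb(3.3, 3, 1) + replica_savings_gb(470.3 / 1024, 3, 1)

print(f"space to free: {needed_gb:.2f} GB")
print(f"released by reducing replicas: {freed_gb:.2f} GB")
print("enough" if freed_gb >= needed_gb else "not enough")
```

The released space is spread over whichever nodes held the removed replica copies, so on a multi-node cluster it is worth re-running the `_cat/allocation` request afterwards to confirm that the affected node actually dropped below the threshold.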
diff --git a/docs/reference/tab-widgets/troubleshooting/disk/increase-data-node-capacity-widget.asciidoc b/docs/reference/tab-widgets/troubleshooting/disk/increase-data-node-capacity-widget.asciidoc new file mode 100644 index 000000000000..0f6eef5c8f3b --- /dev/null +++ b/docs/reference/tab-widgets/troubleshooting/disk/increase-data-node-capacity-widget.asciidoc @@ -0,0 +1,40 @@ +++++ +
+
+ + +
+
+++++ + +include::increase-data-node-capacity.asciidoc[tag=cloud] + +++++ +
+ +
+++++ diff --git a/docs/reference/tab-widgets/troubleshooting/disk/increase-data-node-capacity.asciidoc b/docs/reference/tab-widgets/troubleshooting/disk/increase-data-node-capacity.asciidoc new file mode 100644 index 000000000000..916856a20d7d --- /dev/null +++ b/docs/reference/tab-widgets/troubleshooting/disk/increase-data-node-capacity.asciidoc @@ -0,0 +1,110 @@ +// tag::cloud[] +In order to increase the disk capacity of the data nodes in your cluster: + +. Log in to the {ess-console}[{ecloud} console]. ++ +. On the **Elasticsearch Service** panel, click the gear under the `Manage deployment` column that corresponds to the +name of your deployment. ++ +. If autoscaling is available but not enabled, please enable it. You can do this by clicking the button +`Enable autoscaling` on a banner like the one below: ++ +[role="screenshot"] +image::images/troubleshooting/disk/autoscaling_banner.png[Autoscaling banner,align="center"] ++ +Or you can go to `Actions > Edit deployment`, check the checkbox `Autoscale` and click `save` at the bottom of the page. ++ +[role="screenshot"] +image::images/troubleshooting/disk/enable_autoscaling.png[Enabling autoscaling,align="center"] + +. If autoscaling has succeeded the cluster should return to `healthy` status. If the cluster is still out of disk, +please check if autoscaling has reached its limits. You will be notified about this by the following banner: ++ +[role="screenshot"] +image::images/troubleshooting/disk/autoscaling_limits_banner.png[Autoscaling banner,align="center"] ++ +or you can go to `Actions > Edit deployment` and look for the label `LIMIT REACHED` as shown below: ++ +[role="screenshot"] +image::images/troubleshooting/disk/reached_autoscaling_limits.png[Autoscaling limits reached,align="center"] ++ +If you are seeing the banner click `Update autoscaling settings` to go to the `Edit` page. Otherwise, you are already +in the `Edit` page, click `Edit settings` to increase the autoscaling limits. 
After you make the change, click `save` +at the bottom of the page. + +// end::cloud[] + +// tag::self-managed[] +In order to increase the data node capacity in your cluster, you will need to calculate the amount of extra disk space +needed. + +. First, retrieve the relevant disk thresholds that will indicate how much space should be available. The +relevant thresholds are the <> for all the tiers apart from the frozen +one and the <> for the frozen tier. The following +example demonstrates disk shortage in the hot tier, so we will only retrieve the high watermark: ++ +[source,console] +---- +GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* +---- ++ +The response will look like this: ++ +[source,console-result] +---- +{ + "defaults": { + "cluster": { + "routing": { + "allocation": { + "disk": { + "watermark": { + "high": "90%", + "high.max_headroom": "150GB" + } + } + } + } + } + } +} +---- +// TEST[skip:illustration purposes only] ++ +The above means that in order to resolve the disk shortage we need to either drop our disk usage below 90% or have +more than 150GB available. Read more on how this threshold works <>. + +. The next step is to find out the current disk usage; this will indicate how much extra space is needed. For simplicity, +our example has one node, but you can apply the same to every node over the relevant threshold. ++ +[source,console] +---- +GET _cat/allocation?v&s=disk.avail&h=node,disk.percent,disk.avail,disk.total,disk.used,disk.indices,shards +---- ++ +The response will look like this: ++ +[source,console-result] +---- +node disk.percent disk.avail disk.total disk.used disk.indices shards +instance-0000000000 91 4.6gb 35gb 31.1gb 29.9gb 111 +---- +// TEST[skip:illustration purposes only] + +. The high watermark configuration indicates that the disk usage needs to drop below 90%. 
To achieve this, you have two +options: +- add an extra data node to the cluster (this requires that you have more than one shard in your cluster), or +- extend the disk space of the current node by approximately 30% to allow this node to drop to about 70%. This will give +this node enough space to not run out of disk soon. + +. In the case of adding another data node, the cluster will not recover immediately. It might take some time to +relocate some shards to the new node. You can check the progress with: ++ +[source,console] +---- +GET /_cat/shards?v&h=state,node&s=state +---- ++ +If the response shows shards in the `RELOCATING` state, shards are still moving. Wait until all shards turn +to `STARTED` or until the disk health indicator turns `green`. +// end::self-managed[] diff --git a/docs/reference/tab-widgets/troubleshooting/disk/increase-master-node-capacity-widget.asciidoc b/docs/reference/tab-widgets/troubleshooting/disk/increase-master-node-capacity-widget.asciidoc new file mode 100644 index 000000000000..eb6b01dbb44b --- /dev/null +++ b/docs/reference/tab-widgets/troubleshooting/disk/increase-master-node-capacity-widget.asciidoc @@ -0,0 +1,40 @@ +++++ +
+
+ + +
+
+++++ + +include::increase-master-node-capacity.asciidoc[tag=cloud] + +++++ +
+ +
+++++ diff --git a/docs/reference/tab-widgets/troubleshooting/disk/increase-master-node-capacity.asciidoc b/docs/reference/tab-widgets/troubleshooting/disk/increase-master-node-capacity.asciidoc new file mode 100644 index 000000000000..9b2419ddf866 --- /dev/null +++ b/docs/reference/tab-widgets/troubleshooting/disk/increase-master-node-capacity.asciidoc @@ -0,0 +1,90 @@ +// tag::cloud[] + +. Log in to the {ess-console}[{ecloud} console]. ++ +. On the **Elasticsearch Service** panel, click the gear under the `Manage deployment` column that corresponds to the +name of your deployment. ++ +. Go to `Actions > Edit deployment` and then go to the `Master instances` section: ++ +[role="screenshot"] +image::images/troubleshooting/disk/increase-disk-capacity-master-node.png[Increase disk capacity of master nodes,align="center"] + +. Choose a capacity configuration larger than the pre-selected one from the drop-down menu and click `save`. Wait for +the plan to be applied and the problem should be resolved. + +// end::cloud[] + +// tag::self-managed[] +In order to increase the disk capacity of a master node, you will need to replace *all* the master nodes with +master nodes of higher disk capacity. + +. First, retrieve the disk threshold that will indicate how much disk space is needed. The relevant threshold is +the <> and can be retrieved via the following command: ++ +[source,console] +---- +GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* +---- ++ +The response will look like this: ++ +[source,console-result] +---- +{ + "defaults": { + "cluster": { + "routing": { + "allocation": { + "disk": { + "watermark": { + "high": "90%", + "high.max_headroom": "150GB" + } + } + } + } + } + } +} +---- +// TEST[skip:illustration purposes only] ++ +The above means that in order to resolve the disk shortage we need to either drop our disk usage below 90% or have +more than 150GB available. Read more on how this threshold works <>. + +. 
The next step is to find out the current disk usage; this will allow you to calculate how much extra space is needed. +In the following example, we show only the master nodes for readability purposes: ++ +[source,console] +---- +GET /_cat/nodes?v&h=name,master,node.role,disk.used_percent,disk.used,disk.avail,disk.total +---- ++ +The response will look like this: ++ +[source,console-result] +---- +name master node.role disk.used_percent disk.used disk.avail disk.total +instance-0000000000 * m 85.31 3.4gb 500mb 4gb +instance-0000000001 - m 50.02 2.1gb 1.9gb 4gb +instance-0000000002 - m 50.02 1.9gb 2.1gb 4gb +---- +// TEST[skip:illustration purposes only] + +. The goal is to drop the disk usage below the relevant threshold, in our example 90%. Consider adding +some padding, so it will not go over the threshold soon. If you have multiple master nodes, you need to ensure that *all* +master nodes will have this capacity. Assuming you have the new nodes ready, follow the next three steps for every +master node. + +. Bring down one of the master nodes. +. Start up one of the new master nodes and wait for it to join the cluster. You can check this via: ++ +[source,console] +---- +GET /_cat/nodes?v&h=name,master,node.role,disk.used_percent,disk.used,disk.avail,disk.total +---- ++ +. Only after you have confirmed that your cluster has the initial number of master nodes, move forward to the next one +until all the initial master nodes have been replaced. +// end::self-managed[] diff --git a/docs/reference/tab-widgets/troubleshooting/disk/increase-other-node-capacity-widget.asciidoc b/docs/reference/tab-widgets/troubleshooting/disk/increase-other-node-capacity-widget.asciidoc new file mode 100644 index 000000000000..2f2a864b45b0 --- /dev/null +++ b/docs/reference/tab-widgets/troubleshooting/disk/increase-other-node-capacity-widget.asciidoc @@ -0,0 +1,40 @@ +++++ +
+
+ + +
+
+++++ + +include::increase-other-node-capacity.asciidoc[tag=cloud] + +++++ +
+ +
+++++ diff --git a/docs/reference/tab-widgets/troubleshooting/disk/increase-other-node-capacity.asciidoc b/docs/reference/tab-widgets/troubleshooting/disk/increase-other-node-capacity.asciidoc new file mode 100644 index 000000000000..8895ed54c7c1 --- /dev/null +++ b/docs/reference/tab-widgets/troubleshooting/disk/increase-other-node-capacity.asciidoc @@ -0,0 +1,95 @@ +// tag::cloud[] + +. Log in to the {ess-console}[{ecloud} console]. ++ +. On the **Elasticsearch Service** panel, click the gear under the `Manage deployment` column that corresponds to the +name of your deployment. ++ +. Go to `Actions > Edit deployment` and then go to the `Coordinating instances` or the `Machine Learning instances` +section depending on the roles listed in the diagnosis: ++ +[role="screenshot"] +image::images/troubleshooting/disk/increase-disk-capacity-other-node.png[Increase disk capacity of other nodes,align="center"] + +. Choose a capacity configuration larger than the pre-selected one from the drop-down menu and click `save`. Wait for +the plan to be applied and the problem should be resolved. + +// end::cloud[] + +// tag::self-managed[] +In order to increase the disk capacity of any other node, you will need to replace the instance that has run out of +space with one of higher disk capacity. + +. First, retrieve the disk threshold that will indicate how much disk space is needed. 
The relevant threshold is +the <> and can be retrieved via the following command: ++ +[source,console] +---- +GET _cluster/settings?include_defaults&filter_path=*.cluster.routing.allocation.disk.watermark.high* +---- ++ +The response will look like this: ++ +[source,console-result] +---- +{ + "defaults": { + "cluster": { + "routing": { + "allocation": { + "disk": { + "watermark": { + "high": "90%", + "high.max_headroom": "150GB" + } + } + } + } + } + } +} +---- +// TEST[skip:illustration purposes only] ++ +The above means that in order to resolve the disk shortage we need to either drop our disk usage below 90% or have +more than 150GB available. Read more on how this threshold works <>. + +. The next step is to find out the current disk usage; this will allow you to calculate how much extra space is needed. +In the following example, we show only a machine learning node for readability purposes: ++ +[source,console] +---- +GET /_cat/nodes?v&h=name,node.role,disk.used_percent,disk.used,disk.avail,disk.total +---- ++ +The response will look like this: ++ +[source,console-result] +---- +name node.role disk.used_percent disk.used disk.avail disk.total +instance-0000000000 l 85.31 3.4gb 500mb 4gb +---- +// TEST[skip:illustration purposes only] + +. The goal is to drop the disk usage below the relevant threshold, in our example 90%. Consider adding +some padding, so it will not go over the threshold soon. Assuming you have the new node ready, add this node to the +cluster. + +. Verify that the new node has joined the cluster: ++ +[source,console] +---- +GET /_cat/nodes?v&h=name,node.role,disk.used_percent,disk.used,disk.avail,disk.total +---- ++ +The response will look like this: ++ +[source,console-result] +---- +name node.role disk.used_percent disk.used disk.avail disk.total +instance-0000000000 l 85.31 3.4gb 500mb 4gb +instance-0000000001 l 41.31 3.4gb 4.5gb 8gb +---- +// TEST[skip:illustration purposes only] +. 
Now you can remove the instance that ran out of disk space. +// end::self-managed[] diff --git a/docs/reference/troubleshooting.asciidoc b/docs/reference/troubleshooting.asciidoc index b5e40408cdeb..773ee07ef579 100644 --- a/docs/reference/troubleshooting.asciidoc +++ b/docs/reference/troubleshooting.asciidoc @@ -31,6 +31,13 @@ fix problems that an {es} deployment might encounter. * <> * <> +[discrete] +[[troubleshooting-capacity]] +=== Capacity +* <> +* <> +* <> + [discrete] [[troubleshooting-snapshot]] === Snapshot and restore @@ -90,6 +97,12 @@ include::troubleshooting/data/increase-cluster-shard-limit.asciidoc[] include::troubleshooting/corruption-issues.asciidoc[] +include::troubleshooting/disk/fix-data-node-out-of-disk.asciidoc[] + +include::troubleshooting/disk/fix-master-node-out-of-disk.asciidoc[] + +include::troubleshooting/disk/fix-other-node-out-of-disk.asciidoc[] + include::troubleshooting/data/start-ilm.asciidoc[] include::troubleshooting/data/start-slm.asciidoc[] diff --git a/docs/reference/troubleshooting/disk/fix-data-node-out-of-disk.asciidoc b/docs/reference/troubleshooting/disk/fix-data-node-out-of-disk.asciidoc new file mode 100644 index 000000000000..b98b78582900 --- /dev/null +++ b/docs/reference/troubleshooting/disk/fix-data-node-out-of-disk.asciidoc @@ -0,0 +1,23 @@ +[[fix-data-node-out-of-disk]] +== Fix data nodes out of disk + +{es} uses data nodes to distribute your data inside the cluster. If one or more of these nodes are running +out of space, {es} takes action to redistribute your data across the nodes so all nodes have enough available +disk space. If {es} cannot free up enough space on a node, then you will need to intervene in one +of two ways: + +. <> +. 
<> + +[[increase-capacity-data-node]] +=== Increase the disk capacity of data nodes +include::{es-repo-dir}/tab-widgets/troubleshooting/disk/increase-data-node-capacity-widget.asciidoc[] + +[[decrease-disk-usage-data-node]] +=== Decrease the disk usage of data nodes +In order to decrease the disk usage in your cluster without losing any data, you can try reducing the number of replicas of your indices. + +NOTE: Reducing the replicas of an index can potentially reduce search throughput and data redundancy. However, it +can quickly give the cluster breathing room until a more permanent solution is in place. + +include::{es-repo-dir}/tab-widgets/troubleshooting/disk/decrease-data-node-disk-usage-widget.asciidoc[] diff --git a/docs/reference/troubleshooting/disk/fix-master-node-out-of-disk.asciidoc b/docs/reference/troubleshooting/disk/fix-master-node-out-of-disk.asciidoc new file mode 100644 index 000000000000..6a32ab8aff37 --- /dev/null +++ b/docs/reference/troubleshooting/disk/fix-master-node-out-of-disk.asciidoc @@ -0,0 +1,8 @@ +[[fix-master-node-out-of-disk]] +== Fix master nodes out of disk + +{es} uses master nodes to coordinate the cluster. If the master or any master-eligible nodes are running +out of space, you need to ensure that they have enough disk space to function. If the <> +reports that your master node is out of space, you need to increase the disk capacity of your master nodes. 
+ +include::{es-repo-dir}/tab-widgets/troubleshooting/disk/increase-master-node-capacity-widget.asciidoc[] diff --git a/docs/reference/troubleshooting/disk/fix-other-node-out-of-disk.asciidoc b/docs/reference/troubleshooting/disk/fix-other-node-out-of-disk.asciidoc new file mode 100644 index 000000000000..d53e5337fad8 --- /dev/null +++ b/docs/reference/troubleshooting/disk/fix-other-node-out-of-disk.asciidoc @@ -0,0 +1,9 @@ +[[fix-other-node-out-of-disk]] +== Fix other role nodes out of disk + +{es} can use dedicated nodes to execute other functions apart from storing data or coordinating the cluster, +for example machine learning. If one or more of these nodes are running out of space, you need to ensure that they have +enough disk space to function. If the <> reports that a node that is not a master and does not +contain data is out of space, you need to increase the disk capacity of this node. + +include::{es-repo-dir}/tab-widgets/troubleshooting/disk/increase-other-node-capacity-widget.asciidoc[]
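The sizing step in the self-managed instructions (drop the disk usage below the relevant threshold, plus some padding) can be sketched as a small calculation. This is an illustrative sketch only: the function name `required_disk_total_gb` and the 70% target are assumptions and not part of {es}; the input values are taken from the master node example output (3.4gb used out of 4gb).

```python
def required_disk_total_gb(disk_used_gb: float, target_percent: float) -> float:
    """Total disk size (GB) at which disk_used_gb equals target_percent of the disk."""
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in the range (0, 100]")
    return disk_used_gb * 100.0 / target_percent


# Values from the example `_cat/nodes` response for instance-0000000000.
current_total_gb = 4.0
used_gb = 3.4

# Hypothetical target: 70% usage, which leaves padding under the 90% high watermark.
new_total_gb = required_disk_total_gb(used_gb, target_percent=70.0)
extra_gb = new_total_gb - current_total_gb

print(f"replacement node needs at least {new_total_gb:.2f} GB of disk")
print(f"that is {extra_gb:.2f} GB more than the current node")
```

The same calculation applies to data nodes and to other dedicated nodes; only the inputs change. A lower target percentage gives more padding at the cost of a larger disk.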