[[fix-common-cluster-issues]]
== Fix common cluster issues

This guide describes how to fix common problems with {es} clusters.

[discrete]
[[circuit-breaker-errors]]
=== Circuit breaker errors

{es} uses <<circuit-breaker,circuit breakers>> to prevent nodes from running out
of JVM heap memory. If {es} estimates an operation would exceed a
circuit breaker, it stops the operation and returns an error.

By default, the <<parent-circuit-breaker,parent circuit breaker>> triggers at
95% JVM memory usage. To prevent errors, we recommend taking steps to reduce
memory pressure if usage consistently exceeds 85%.

[discrete]
[[diagnose-circuit-breaker-errors]]
==== Diagnose circuit breaker errors

**Error messages**

If a request triggers a circuit breaker, {es} returns an error.

[source,js]
----
{
  "error": {
    "type": "circuit_breaking_exception",
    "reason": "[parent] Data too large, data for [<http_request>] would be [123848638/118.1mb], which is larger than the limit of [123273216/117.5mb], real usage: [120182112/114.6mb], new bytes reserved: [3666526/3.4mb]",
    "bytes_wanted": 123848638,
    "bytes_limit": 123273216,
    "durability": "TRANSIENT"
  },
  "status": 429
}
----
// NOTCONSOLE

{es} also writes circuit breaker errors to <<logging,`elasticsearch.log`>>. This
is helpful when automated processes, such as allocation, trigger a circuit
breaker.

[source,txt]
----
Caused by: org.elasticsearch.common.breaker.CircuitBreakingException: [parent] Data too large, data for [<transport_request>] would be [num/numGB], which is larger than the limit of [num/numGB], usages [request=0/0b, fielddata=num/numKB, in_flight_requests=num/numGB, accounting=num/numGB]
----

**Check JVM memory usage**

If you've enabled Stack Monitoring, you can view JVM memory usage in {kib}. In
the main menu, click **Stack Monitoring**. On the Stack Monitoring **Overview**
page, click **Nodes**. The **JVM Heap** column lists the current memory usage
for each node.

You can also use the <<cat-nodes,cat nodes API>> to get the current
`heap.percent` for each node.

[source,console]
----
GET _cat/nodes?v=true&h=name,node*,heap*
----

To get the JVM memory usage for each circuit breaker, use the
<<cluster-nodes-stats,node stats API>>.

[source,console]
----
GET _nodes/stats/breaker
----

[discrete]
[[prevent-circuit-breaker-errors]]
==== Prevent circuit breaker errors

**Reduce JVM memory pressure**

High JVM memory pressure often causes circuit breaker errors. See
<<high-jvm-memory-pressure>>.

**Avoid using fielddata on `text` fields**

For high-cardinality `text` fields, fielddata can use a large amount of JVM
memory. To avoid this, {es} disables fielddata on `text` fields by default. If
you've enabled fielddata and triggered the <<fielddata-circuit-breaker,fielddata
circuit breaker>>, consider disabling it and using a `keyword` field instead.
See <<fielddata>>.
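
For example, instead of enabling fielddata on a `text` field, you can map the
same value as a `keyword` multi-field and aggregate on that field. The
following sketch assumes a hypothetical `my-index` index with a `message`
field.

[source,console]
----
PUT my-index
{
  "mappings": {
    "properties": {
      "message": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
----

Aggregations and sorts can then use `message.keyword` instead of loading
fielddata for `message`.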

**Clear the fielddata cache**

If you've triggered the fielddata circuit breaker and can't disable fielddata,
use the <<indices-clearcache,clear cache API>> to clear the fielddata cache.
This may disrupt any in-flight searches that use fielddata.

[source,console]
----
POST _all/_cache/clear?fielddata=true
----
// TEST[s/^/PUT my-index\n/]

[discrete]
[[high-jvm-memory-pressure]]
=== High JVM memory pressure

High JVM memory usage can degrade cluster performance and trigger
<<circuit-breaker-errors,circuit breaker errors>>. To prevent this, we recommend
taking steps to reduce memory pressure if a node's JVM memory usage consistently
exceeds 85%.

[discrete]
[[diagnose-high-jvm-memory-pressure]]
==== Diagnose high JVM memory pressure

**Check JVM memory pressure**

include::{es-repo-dir}/tab-widgets/code.asciidoc[]
include::{es-repo-dir}/tab-widgets/jvm-memory-pressure-widget.asciidoc[]

**Check garbage collection logs**

As memory usage increases, garbage collection becomes more frequent and takes
longer. You can track the frequency and length of garbage collection events in
<<logging,`elasticsearch.log`>>. For example, the following event states {es}
spent more than 50% (21 seconds) of the last 40 seconds performing garbage
collection.

[source,log]
----
[timestamp_short_interval_from_last][INFO ][o.e.m.j.JvmGcMonitorService] [node_id] [gc][number] overhead, spent [21s] collecting in the last [40s]
----

[discrete]
[[reduce-jvm-memory-pressure]]
==== Reduce JVM memory pressure

**Reduce your shard count**

Every shard uses memory. In most cases, a small set of large shards uses fewer
resources than many small shards. For tips on reducing your shard count, see
<<size-your-shards>>.
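
To see how many shards the cluster currently holds, one option is the
<<cluster-stats,cluster stats API>>:

[source,console]
----
GET _cluster/stats?filter_path=indices.shards.total
----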

**Avoid expensive searches**

Expensive searches can use large amounts of memory. To better track expensive
searches on your cluster, enable <<index-modules-slowlog,slow logs>>.
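
For example, the following sketch sets `warn` thresholds for the search slow
log on a hypothetical `my-index` index. The values are illustrative, not
recommendations.

[source,console]
----
PUT my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.fetch.warn": "1s"
}
----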

Expensive searches may have a large <<paginate-search-results,`size` argument>>,
use aggregations with a large number of buckets, or include
<<query-dsl-allow-expensive-queries,expensive queries>>. To prevent expensive
searches, consider the following setting changes:

* Lower the `size` limit using the
<<index-max-result-window,`index.max_result_window`>> index setting.

* Decrease the maximum number of allowed aggregation buckets using the
<<search-settings-max-buckets,`search.max_buckets`>> cluster setting.

* Disable expensive queries using the
<<query-dsl-allow-expensive-queries,`search.allow_expensive_queries`>> cluster
setting.

[source,console]
----
PUT _all/_settings
{
  "index.max_result_window": 5000
}

PUT _cluster/settings
{
  "persistent": {
    "search.max_buckets": 20000,
    "search.allow_expensive_queries": false
  }
}
----
// TEST[s/^/PUT my-index\n/]

**Prevent mapping explosions**

Defining too many fields or nesting fields too deeply can lead to
<<mapping-limit-settings,mapping explosions>> that use large amounts of memory.
To prevent mapping explosions, use the <<mapping-settings-limit,mapping limit
settings>> to limit the number of field mappings.
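
For example, the `index.mapping.total_fields.limit` index setting caps the
number of fields in an index. The value below is only illustrative; the
appropriate limit depends on your mappings.

[source,console]
----
PUT my-index/_settings
{
  "index.mapping.total_fields.limit": 500
}
----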

**Spread out bulk requests**

While more efficient than individual requests, large <<docs-bulk,bulk indexing>>
or <<search-multi-search,multi-search>> requests can still create high JVM
memory pressure. If possible, submit smaller requests and allow more time
between them.
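
For example, rather than sending one very large `_bulk` request, you can split
the same documents across several smaller requests like the sketch below. The
index name and documents are hypothetical.

[source,console]
----
POST _bulk
{ "index": { "_index": "my-index" } }
{ "message": "first document in this batch" }
{ "index": { "_index": "my-index" } }
{ "message": "second document in this batch" }
----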

**Upgrade node memory**

Heavy indexing and search loads can cause high JVM memory pressure. To better
handle heavy workloads, upgrade your nodes to increase their memory capacity.

[discrete]
[[red-yellow-cluster-status]]
=== Red or yellow cluster status

A red or yellow cluster status indicates one or more shards are missing or
unallocated. These unassigned shards increase your risk of data loss and can
degrade cluster performance.

[discrete]
[[diagnose-cluster-status]]
==== Diagnose your cluster status

**Check your cluster status**

Use the <<cluster-health,cluster health API>>.

[source,console]
----
GET _cluster/health?filter_path=status,*_shards
----

A healthy cluster has a green `status` and zero `unassigned_shards`. A yellow
status means only replicas are unassigned. A red status means one or
more primary shards are unassigned.

**View unassigned shards**

To view unassigned shards, use the <<cat-shards,cat shards API>>.

[source,console]
----
GET _cat/shards?v=true&h=index,shard,prirep,state,node,unassigned.reason&s=state
----

Unassigned shards have a `state` of `UNASSIGNED`. The `prirep` value is `p` for
primary shards and `r` for replicas. The `unassigned.reason` describes why the
shard remains unassigned.

To get a more in-depth explanation of an unassigned shard's allocation status,
use the <<cluster-allocation-explain,cluster allocation explanation API>>. You
can often use details in the response to resolve the issue.

[source,console]
----
GET _cluster/allocation/explain?filter_path=index,node_allocation_decisions.node_name,node_allocation_decisions.deciders.*
{
  "index": "my-index",
  "shard": 0,
  "primary": false,
  "current_node": "my-node"
}
----
// TEST[s/^/PUT my-index\n/]
// TEST[s/"primary": false,/"primary": false/]
// TEST[s/"current_node": "my-node"//]

[discrete]
[[fix-red-yellow-cluster-status]]
==== Fix a red or yellow cluster status

A shard can become unassigned for several reasons. The following tips outline the
most common causes and their solutions.

**Re-enable shard allocation**

You typically disable allocation during a <<restart-cluster,restart>> or other
cluster maintenance. If you forgot to re-enable allocation afterward, {es} will
be unable to assign shards. To re-enable allocation, reset the
`cluster.routing.allocation.enable` cluster setting.

[source,console]
----
PUT _cluster/settings
{
  "persistent" : {
    "cluster.routing.allocation.enable" : null
  }
}
----

**Recover lost nodes**

Shards often become unassigned when a data node leaves the cluster. This can
occur for several reasons, ranging from connectivity issues to hardware failure.
After you resolve the issue and recover the node, it will rejoin the cluster.
{es} will then automatically allocate any unassigned shards.

To avoid wasting resources on temporary issues, {es} <<delayed-allocation,delays
allocation>> by one minute by default. If you've recovered a node and don't want
to wait for the delay period, you can call the <<cluster-reroute,cluster reroute
API>> with no arguments to start the allocation process. The process runs
asynchronously in the background.

[source,console]
----
POST _cluster/reroute
----

**Fix allocation settings**

Misconfigured allocation settings can result in an unassigned primary shard.
These settings include:

* <<shard-allocation-filtering,Shard allocation>> index settings
* <<cluster-shard-allocation-filtering,Allocation filtering>> cluster settings
* <<shard-allocation-awareness,Allocation awareness>> cluster settings

To review your allocation settings, use the <<indices-get-settings,get index
settings>> and <<cluster-get-settings,get cluster settings>> APIs.

[source,console]
----
GET my-index/_settings?flat_settings=true&include_defaults=true

GET _cluster/settings?flat_settings=true&include_defaults=true
----
// TEST[s/^/PUT my-index\n/]

You can change the settings using the <<indices-update-settings,update index
settings>> and <<cluster-update-settings,update cluster settings>> APIs.
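
For example, if a leftover <<shard-allocation-filtering,allocation filter>> is
blocking assignment, you can remove it by setting it to `null`. The `_name`
filter below is just one example of such a setting.

[source,console]
----
PUT my-index/_settings
{
  "index.routing.allocation.require._name": null
}
----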

**Allocate or reduce replicas**

To protect against hardware failure, {es} will not assign a replica to the same
node as its primary shard. If no other data nodes are available to host the
replica, it remains unassigned. To fix this, you can:

* Add a data node to the same tier to host the replica.

* Change the `index.number_of_replicas` index setting to reduce the number of
replicas for each primary shard. We recommend keeping at least one replica per
primary.

[source,console]
----
PUT _all/_settings
{
  "index.number_of_replicas": 1
}
----
// TEST[s/^/PUT my-index\n/]

**Free up or increase disk space**

{es} uses a <<disk-based-shard-allocation,low disk watermark>> to ensure data
nodes have enough disk space for incoming shards. By default, {es} does not
allocate shards to nodes using more than 85% of disk space.

To check the current disk space of your nodes, use the <<cat-allocation,cat
allocation API>>.

[source,console]
----
GET _cat/allocation?v=true&h=node,shards,disk.*
----

If your nodes are running low on disk space, you have a few options:

* Upgrade your nodes to increase disk space.

* Delete unneeded indices to free up space. If you use {ilm-init}, you can
update your lifecycle policy to use <<ilm-searchable-snapshot,searchable
snapshots>> or add a delete phase. If you no longer need to search the data, you
can use a <<snapshot-restore,snapshot>> to store it off-cluster.

* If you no longer write to an index, use the <<indices-forcemerge,force merge
API>> or {ilm-init}'s <<ilm-forcemerge,force merge action>> to merge its
segments into larger ones.
+
[source,console]
----
POST my-index/_forcemerge
----
// TEST[s/^/PUT my-index\n/]

* If an index is read-only, use the <<indices-shrink-index,shrink index API>> or
{ilm-init}'s <<ilm-shrink,shrink action>> to reduce its primary shard count.
+
[source,console]
----
POST my-index/_shrink/my-shrunken-index
----
// TEST[s/^/PUT my-index\n{"settings":{"index.number_of_shards":2,"blocks.write":true}}\n/]

* If your node has a large disk capacity, you can increase the low disk
watermark or set it to an explicit byte value.
+
[source,console]
----
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.low": "30gb"
  }
}
----
// TEST[s/"30gb"/null/]

**Reduce JVM memory pressure**

Shard allocation requires JVM heap memory. High JVM memory pressure can trigger
<<circuit-breaker,circuit breakers>> that stop allocation and leave shards
unassigned. See <<high-jvm-memory-pressure>>.

**Recover data for a lost primary shard**

If a node containing a primary shard is lost, {es} can typically replace it
using a replica on another node. If you can't recover the node and replicas
don't exist or are irrecoverable, you'll need to re-add the missing data from a
<<snapshot-restore,snapshot>> or the original data source.

WARNING: Only use this option if node recovery is no longer possible. This
process allocates an empty primary shard. If the node later rejoins the cluster,
{es} will overwrite its primary shard with data from this newer empty shard,
resulting in data loss.

Use the <<cluster-reroute,cluster reroute API>> to manually allocate the
unassigned primary shard to another data node in the same tier. Set
`accept_data_loss` to `true`.

[source,console]
----
POST _cluster/reroute
{
  "commands": [
    {
      "allocate_empty_primary": {
        "index": "my-index",
        "shard": 0,
        "node": "my-node",
        "accept_data_loss": "true"
      }
    }
  ]
}
----
// TEST[s/^/PUT my-index\n/]
// TEST[catch:bad_request]

If you backed up the missing index data to a snapshot, use the
<<restore-snapshot-api,restore snapshot API>> to restore the individual index.
Alternatively, you can index the missing data from the original data source.
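
A minimal sketch of such a restore, assuming a registered repository named
`my_repository` and a snapshot named `my_snapshot` that contains the index. An
existing index must be closed or deleted before you can restore over it.

[source,console]
----
POST _snapshot/my_repository/my_snapshot/_restore
{
  "indices": "my-index"
}
----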