mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-04-25 07:37:19 -04:00
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible. Relates to #79309, #31619
[role="xpack"]
[[index-lifecycle-error-handling]]
== Troubleshooting {ilm} errors

When {ilm-init} executes a lifecycle policy, errors can occur
while performing the necessary index operations for a step.
When this happens, {ilm-init} moves the index to an `ERROR` step.
If {ilm-init} cannot resolve the error automatically, execution is halted
until you resolve the underlying issues with the policy, index, or cluster.

For example, you might have a `shrink-index` policy that shrinks an index to four shards once it
is at least five days old:

[source,console]
--------------------------------------------------
PUT _ilm/policy/shrink-index
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "5d",
        "actions": {
          "shrink": {
            "number_of_shards": 4
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST

Nothing prevents you from applying the `shrink-index` policy to a new
index that has only two shards:

[source,console]
--------------------------------------------------
PUT /my-index-000001
{
  "settings": {
    "index.number_of_shards": 2,
    "index.lifecycle.name": "shrink-index"
  }
}
--------------------------------------------------
// TEST[continued]

After five days, {ilm-init} attempts to shrink `my-index-000001` from two shards to four shards.
Because the shrink action cannot _increase_ the number of shards, this operation fails
and {ilm-init} moves `my-index-000001` to the `ERROR` step.

You can use the <<ilm-explain-lifecycle,{ilm-init} Explain API>> to get information about
what went wrong:

[source,console]
--------------------------------------------------
GET /my-index-000001/_ilm/explain
--------------------------------------------------
// TEST[continued]

The API returns the following information:

[source,console-result]
--------------------------------------------------
{
  "indices" : {
    "my-index-000001" : {
      "index" : "my-index-000001",
      "managed" : true,
      "policy" : "shrink-index", <1>
      "lifecycle_date_millis" : 1541717265865,
      "age": "5.1d", <2>
      "phase" : "warm", <3>
      "phase_time_millis" : 1541717272601,
      "action" : "shrink", <4>
      "action_time_millis" : 1541717272601,
      "step" : "ERROR", <5>
      "step_time_millis" : 1541717272688,
      "failed_step" : "shrink", <6>
      "step_info" : {
        "type" : "illegal_argument_exception", <7>
        "reason" : "the number of target shards [4] must be less that the number of source shards [2]"
      },
      "phase_execution" : {
        "policy" : "shrink-index",
        "phase_definition" : { <8>
          "min_age" : "5d",
          "actions" : {
            "shrink" : {
              "number_of_shards" : 4
            }
          }
        },
        "version" : 1,
        "modified_date_in_millis" : 1541717264230
      }
    }
  }
}
--------------------------------------------------
// TESTRESPONSE[skip:no way to know if we will get this response immediately]

<1> The policy being used to manage the index: `shrink-index`
<2> The index age: 5.1 days
<3> The phase the index is currently in: `warm`
<4> The current action: `shrink`
<5> The step the index is currently in: `ERROR`
<6> The step that failed to execute: `shrink`
<7> The type of error and a description of that error.
<8> The definition of the current phase from the `shrink-index` policy

To resolve this, you could update the policy to shrink the index to a single shard after five days:

[source,console]
--------------------------------------------------
PUT _ilm/policy/shrink-index
{
  "policy": {
    "phases": {
      "warm": {
        "min_age": "5d",
        "actions": {
          "shrink": {
            "number_of_shards": 1
          }
        }
      }
    }
  }
}
--------------------------------------------------
// TEST[continued]

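
To confirm that the updated policy was stored, you can retrieve it.
This is a minimal sketch that assumes only the `shrink-index` policy from the preceding request.
Replacing an existing policy increments its version, so the `version` in the response should be higher than it was before the update.

[source,console]
--------------------------------------------------
GET _ilm/policy/shrink-index
--------------------------------------------------
// TEST[skip:illustrative example; the response depends on earlier updates to the policy]
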
[discrete]
=== Retrying failed lifecycle policy steps

Once you fix the problem that put an index in the `ERROR` step,
you might need to explicitly tell {ilm-init} to retry the step:

[source,console]
--------------------------------------------------
POST /my-index-000001/_ilm/retry
--------------------------------------------------
// TEST[skip:we can't be sure the index is ready to be retried at this point]

{ilm-init} subsequently attempts to re-run the step that failed.
You can use the <<ilm-explain-lifecycle,{ilm-init} Explain API>> to monitor the progress.

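
When several indices are affected, it can help to list only the ones in an error state.
As a sketch, the explain API accepts an `only_errors` query parameter.
The `my-index-*` pattern is a hypothetical example; substitute a pattern that matches your own indices.

[source,console]
--------------------------------------------------
GET /my-index-*/_ilm/explain?only_errors=true
--------------------------------------------------
// TEST[skip:illustrative example with a hypothetical index pattern]
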
[discrete]
=== Common {ilm-init} errors

Here's how to resolve the most common errors reported in the `ERROR` step.

TIP: Problems with rollover aliases are a common cause of errors.
Consider using <<data-streams, data streams>> instead of managing rollover with aliases.

[discrete]
==== Rollover alias [x] can point to multiple indices, found duplicated alias [x] in index template [z]

The target rollover alias is specified in an index template's `index.lifecycle.rollover_alias` setting.
You need to explicitly configure this alias _one time_ when you
<<ilm-gs-alias-bootstrap, bootstrap the initial index>>.
The rollover action then manages setting and updating the alias to
<<rollover-index-api-desc, roll over>> to each subsequent index.

Do not explicitly configure this same alias in the aliases section of an index template.

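
As a rough sketch of the one-time bootstrap, the following request creates the initial index and designates it as the write index for the alias.
The alias name `my-alias` is a hypothetical placeholder; it should match the value of `index.lifecycle.rollover_alias` in your index template.

[source,console]
--------------------------------------------------
PUT /my-index-000001
{
  "aliases": {
    "my-alias": {
      "is_write_index": true
    }
  }
}
--------------------------------------------------
// TEST[skip:illustrative example; my-alias is a hypothetical alias name]
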
[discrete]
==== index.lifecycle.rollover_alias [x] does not point to index [y]

Either the index is using the wrong alias or the alias does not exist.

Check the `index.lifecycle.rollover_alias` <<indices-get-settings, index setting>>.
To see what aliases are configured, use <<cat-alias, _cat/aliases>>.

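
For example, assuming the affected index is `my-index-000001` (substitute your own index name), the two checks might look like this:

[source,console]
--------------------------------------------------
GET /my-index-000001/_settings/index.lifecycle.rollover_alias

GET _cat/aliases?v=true
--------------------------------------------------
// TEST[skip:illustrative example; the index name is an assumption]

The first request shows the alias the index expects, and the second lists the aliases that exist and the indices they point to.
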
[discrete]
==== Setting [index.lifecycle.rollover_alias] for index [y] is empty or not defined

The `index.lifecycle.rollover_alias` setting must be configured for the rollover action to work.

Update the index settings to set `index.lifecycle.rollover_alias`.

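
A minimal sketch of that update, assuming a hypothetical alias named `my-alias` and the index `my-index-000001`:

[source,console]
--------------------------------------------------
PUT /my-index-000001/_settings
{
  "index.lifecycle.rollover_alias": "my-alias"
}
--------------------------------------------------
// TEST[skip:illustrative example; my-alias is a hypothetical alias name]
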
[discrete]
==== Alias [x] has more than one write index [y,z]

Only one index can be designated as the write index for a particular alias.

Use the <<indices-aliases, aliases>> API to set `is_write_index:false` for all but one index.

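
As an illustration, suppose a hypothetical alias `my-alias` points to both `my-index-000001` and `my-index-000002`, and only the newer index should receive writes.
A single aliases request can update both entries:

[source,console]
--------------------------------------------------
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "my-index-000001",
        "alias": "my-alias",
        "is_write_index": false
      }
    },
    {
      "add": {
        "index": "my-index-000002",
        "alias": "my-alias",
        "is_write_index": true
      }
    }
  ]
}
--------------------------------------------------
// TEST[skip:illustrative example; the alias and second index are hypothetical]
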
[discrete]
==== index name [x] does not match pattern ^.*-\d+

The index name must match the regex pattern `^.*-\d+` for the rollover action to work.
The most common problem is that the index name does not contain trailing digits.
For example, `my-index` does not match the pattern requirement.

Append a numeric value to the index name, for example `my-index-000001`.

[discrete]
==== CircuitBreakingException: [x] data too large, data for [y]

This indicates that the cluster is hitting resource limits.

Before continuing to set up {ilm-init}, you'll need to take steps to alleviate the resource issues.
For more information, see <<circuit-breaker-errors>>.

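
One way to see which circuit breaker is close to its limit is to request breaker statistics from the nodes.
This is a sketch rather than a complete diagnosis: for each breaker, compare `estimated_size_in_bytes` with `limit_size_in_bytes`, and check the `tripped` count.

[source,console]
--------------------------------------------------
GET _nodes/stats/breaker
--------------------------------------------------
// TEST[skip:illustrative example; the response depends on cluster state]
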
[discrete]
==== High disk watermark [x] exceeded on [y]

This indicates that the cluster is running out of disk space.
This can happen when you don't have {ilm} set up to roll over from hot to warm nodes.

Consider adding nodes, upgrading your hardware, or deleting unneeded indices.

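
To see how disk usage is distributed before deciding what to remove, you can check the per-node allocation summary.
In this sketch, `my-old-index` is a hypothetical name standing in for an index you no longer need; deleting an index is permanent.

[source,console]
--------------------------------------------------
GET _cat/allocation?v=true

DELETE /my-old-index
--------------------------------------------------
// TEST[skip:illustrative example; my-old-index is a hypothetical index name]
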