elasticsearch/docs/reference/shutdown/apis/shutdown-put.asciidoc
Lee Hinman 6e875d0fa9
Add node REPLACE shutdown implementation (#76247)
2021-10-07 12:07:46 -04:00


[[put-shutdown]]
=== Put shutdown API
NOTE: {cloud-only}
Prepares a node to be shut down.
[[put-shutdown-api-request]]
==== {api-request-title}
`PUT _nodes/<node-id>/shutdown`
[[put-shutdown-api-prereqs]]
==== {api-prereq-title}
* If the {es} {security-features} are enabled, you must have the `manage`
<<privileges-list-cluster,cluster privilege>> to use this API.
* If the <<operator-privileges,{operator-feature}>> is enabled, you must be an operator
to use this API.
[[put-shutdown-api-desc]]
==== {api-description-title}
Migrates ongoing tasks and index shards to other nodes as needed
to prepare a node to be restarted or shut down and removed from the cluster.
This ensures that {es} can be stopped safely with minimal disruption to the cluster.
You must specify the type of shutdown: `restart`, `remove`, or `replace`.
If a node is already being prepared for shutdown,
you can use this API to change the shutdown type.
IMPORTANT: This API does *NOT* terminate the {es} process.
Monitor the <<get-shutdown,node shutdown status>> to determine
when it is safe to stop {es}.
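
As a minimal sketch of that check, the following request retrieves the shutdown
status registered for a single node. The node ID is an illustrative placeholder,
and the response is omitted here.

[source,console]
--------------------------------------------------
GET /_nodes/USpTGYaBSIKbgSUJR2Z9lg/shutdown
--------------------------------------------------
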
[[put-shutdown-api-path-params]]
==== {api-path-parms-title}
`<node-id>`::
(Required, string)
The ID of the node you want to prepare for shutdown.
If you specify a node that is offline,
it will be prepared for shutdown when it rejoins the cluster.
IMPORTANT: This parameter is *NOT* validated against the cluster's active nodes.
This enables you to register a node for shutdown while it is offline.
No error is thrown if you specify an invalid node ID.
[[put-shutdown-api-params]]
==== {api-query-parms-title}
include::{es-repo-dir}/rest-api/common-parms.asciidoc[tag=timeoutparms]
[role="child_attributes"]
[[put-shutdown-api-request-body]]
==== {api-request-body-title}
`type`::
(Required, string)
Valid values are `restart`, `remove`, or `replace`.
Use `restart` when you need to temporarily shut down a node to perform an upgrade,
make configuration changes, or perform other maintenance.
Because the node is expected to rejoin the cluster, data is not migrated off of the node.
Use `remove` when you need to permanently remove a node from the cluster.
The node is not marked ready for shutdown until data is migrated off of the node.
Use `replace` to do a 1:1 replacement of a node with another node. Certain allocation decisions
(such as disk watermarks) are ignored so that the target node can fully replace the source node.
During a replace-type shutdown, rollover and index creation may result in unassigned
shards, and shrink may fail until the replacement is complete.
`reason`::
(Required, string)
A human-readable reason that the node is being shut down.
This field provides information for other cluster operators;
it does not affect the shutdown process.
`allocation_delay`::
(Optional, string)
Only valid if `type` is `restart`. Controls how long {es} will wait for the node to restart and join the cluster before reassigning its shards to other nodes. This works the same as
<<delayed-allocation,delaying allocation>> with the `index.unassigned.node_left.delayed_timeout` setting. If you specify both a restart allocation delay and an index-level allocation delay, the longer of the two is used.
`target_node_name`::
(Optional, string)
Only valid if `type` is `replace`. Specifies the name of the node that is replacing the node being
shut down. Shards from the node being shut down can only be allocated to the target node, and
no other data is allocated to the target node. During relocation of data, certain allocation
rules are ignored, such as disk watermarks or user attribute filtering rules.
[[put-shutdown-api-example]]
==== {api-examples-title}
Register a node for shutdown:
[source,console]
--------------------------------------------------
PUT /_nodes/USpTGYaBSIKbgSUJR2Z9lg/shutdown
{
  "type": "restart", <1>
  "reason": "Demonstrating how the node shutdown API works",
  "allocation_delay": "20m"
}
--------------------------------------------------
////
[source,console-result]
--------------------------------------------------
{
  "acknowledged": true
}
--------------------------------------------------
////
<1> Prepares the node to be restarted.
Use `remove` for nodes that will be permanently removed from the cluster.
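
A `replace` shutdown can be registered in the same way.
The following request is a sketch only: the node ID and the
target node name `replacement-node` are illustrative placeholders,
not values taken from a real cluster.

[source,console]
--------------------------------------------------
PUT /_nodes/USpTGYaBSIKbgSUJR2Z9lg/shutdown
{
  "type": "replace",
  "reason": "Replacing the node with a new one",
  "target_node_name": "replacement-node"
}
--------------------------------------------------

Because `target_node_name` is only valid when `type` is `replace`,
and `allocation_delay` is only valid when `type` is `restart`,
this request sets the former and omits the latter.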