The docs for forced awareness indicate that no replicas will be assigned
until all zones are available, which is definitely undesirable and also
not the actual behaviour. This commit fixes the wording to match what
really happens.
Closes#104777
* Adds a prerequisites section covering remote cluster config, node roles, and security.
* Moves existing content about remote cluster config to the prereqs.
* Updates the remote cluster docs to include information about eligible gateway nodes and tagging for gateway nodes.
Closes https://github.com/elastic/elasticsearch/issues/72001
(cherry picked from commit 7142b47e69)
This PR changes uses of transient cluster settings to
persistent cluster settings.
The PR also deprecates the transient settings usage.
Relates to #49540
Orchestrated environments should not allow users to override
`cluster.routing.allocation.disk.threshold_enabled`, so making this
operator only.
Closes#77846
Co-authored-by: David Turner <david.turner@elastic.co>
* [DOCS] Update remote cluster docs
* Add files, rename files, write new stuff
* Plethora of changes
* Add test and update snippets
* Redirects, moved files, and test updates
* Moved file to x-pack for tests
* Remove older CCS page and add redirects
* Cleanup, link updates, and some rewrites
* Update image
* Incorporating user feedback and rewriting much of the remote clusters page
* More changes from review feedback
* Numerous updates, including request examples for CCS and Kibana
* More changes from review feedback
* Minor clarifications on security for remote clusters
* Incorporate review feedback
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Some review feedback and some editorial changes
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
# Conflicts:
# docs/reference/modules/network.asciidoc
# docs/reference/modules/remote-clusters.asciidoc
# x-pack/docs/en/security/ccs-clients-integrations/cross-cluster.asciidoc
# x-pack/docs/en/security/ccs-clients-integrations/index.asciidoc
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Dedicated frozen nodes can survive less headroom than other data nodes.
This commits introduces a separate flood stage threshold for frozen as
well as an accompanying max_headroom setting that caps the amount of
free space necessary on frozen.
Relates #71844
Frozen indices (partial searchable snapshots) require less heap per
shard and the limit can therefore be raised for those. We pick 3000
frozen shards per frozen data node, since we think 2000 is reasonable
to use in production.
Relates #71042 and #34021
Includes #71781 and #71777
Today if an index is set to `auto_expand_replicas: N-all` then we will
try and create a shard copy on every node that matches the applicable
allocation filters. This conflits with shard allocation awareness and
the same-host allocation decider if there is an uneven distribution of
nodes across zones or hosts, since these deciders prevent shard copies
from being allocated unevenly and may therefore leave some unassigned
shards.
The point of these two deciders is to improve resilience given a limited
number of shard copies but there is no need for this behaviour when the
number of shard copies is not limited, so this commit supresses them in
that case.
Closes#54151Closes#2869
Today we document the settings used to control rebalancing and
disk-based shard allocation but there isn't really any discussion around
what these processes do so it's hard to know what, if any, adjustments
to make.
This commit adds some words to help folk understand this area better.
Clarifies differences between the
`cluster.routing.allocation.total_shards_per_node` and
`cluster.max_shards_per_node` cluster settings.
Closes#51839
Co-authored-by: Gordon Brown <arcsech@gmail.com>
This adds general overview documentation for data tiers,
the data tiers specific node roles, and their application in
ILM.
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: debadair <debadair@elastic.co>
(cherry picked from commit d588cab747)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: debadair <debadair@elastic.co>
* updated shard limit doc
As the documentation was not so clear. I have updated saying this limit includes open indices with unassigned primaries and replicas count towards the limit.
* [DOCS] Incorporated edits.
Co-authored-by: Deb Adair <debadair@elastic.co>
Co-authored-by: gadekishore <50092970+gadekishore@users.noreply.github.com>
Part of #48366. Add documentation for the dangling indices
API added in #58176.
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Adam Locke <adam.locke@elastic.co>
* Adding ESS icons to supported ES settings.
* Adding new file for supported ESS settings.
* Adding supported ESS settings for HTTP and disk-based shard allocation.
* Adding more supported settings for ESS.
* Adding descriptions for each Cloud section, plus additional settings.
* Adding new warehouse file for Cloud, plus additional settings.
* Adding node settings for Cloud.
* Adding audit settings for Cloud.
* Resolving merge conflict.
* Adding SAML settings (part 1).
* Adding SAML realm encryption and signing settings.
* Adding SAML SSL settings.
* Adding Kerberos realm settings.
* Adding OpenID Connect Realm settings.
* Adding OpenID Connect SSL settings.
* Resolving leftover Git merge markers.
* Removing Cloud settings page and link to it.
* Add link to mapping source
* Update docs/reference/docs/reindex.asciidoc
* Incorporate edit of HTTP settings
* Remove "cloud" from tag and ID
* Remove "cloud" from tag and update description
* Remove "cloud" from tag and ID
* Change "whitelists" to "specifies"
* Remove "cloud" from end tag
* Removing cloud from IDs and tags.
* Changing link reference to fix build issue.
* Adding index management page for missing settings.
* Removing warehouse file for Cloud and moving settings elsewhere.
* Clarifying true/false usage of http.detailed_errors.enabled.
* Changing underscore to dash in link to fix ci build.
The read-only-allow-delete block is not really under the user's control
since Elasticsearch adds/removes it automatically. This commit removes
support for it from the new API for adding blocks to indices that was
introduced in #58094.
The disk decider had special handling for the single data node case,
allowing any allocation (skipping watermark checks) for such clusters.
This special handling can now be avoided via a setting.
Today the docs say that the low watermark has no effect on any shards that have
never been allocated, but this is confusing. Here "shard" means "replication
group" not "shard copy" but this conflicts with the "never been allocated"
qualifier since one allocates shard copies and not replication groups.
This commit removes the misleading words. A newly-created replication group
remains newly-created until one of its copies is assigned, which might be quite
some time later, but it seems better to leave this implicit.
Setting `cluster.routing.allocation.disk.include_relocations` to `false` is a
bad idea since it will lead to the kinds of overshoot that were otherwise fixed
in #46079. This commit deprecates this setting so it can be removed in the next
major release.
This is a follow up of #19191 for 7.x.
This change adds a system property called "es.routing.search_ignore_awareness_attributes" that when set to true will
effectively ignore allocation awareness attributes when routing search and get requests. This is now the default in 8.x so this
commit adds a way to opt-in to this new behavior in a minor version of 7.x.
Relates #45735
Adds to the `index.blocks.read_only_allow_delete` docs the information that
this block may be added or removed automatically, and rewords the
breaking-changes docs to mention the blocks explicitly and to recommend using a
different block.
Relates #42559
If a node exceeds the flood-stage disk watermark then we add a block to all of
its indices to prevent further writes as a last-ditch attempt to prevent the
node completely exhausting its disk space. However today this block remains in
place until manually removed, and this block is a source of confusion for users
who current have ample disk space and did not even realise they nearly ran out
at some point in the past.
This commit changes our behaviour to automatically remove this block when a
node drops below the high watermark again. The expectation is that the high
watermark is some distance below the flood-stage watermark and therefore the
disk space problem is truly resolved.
Fixes#39334
Previously persistent task assignment was checked in the
following situations:
- Persistent tasks are changed
- A node joins or leaves the cluster
- The routing table is changed
- Custom metadata in the cluster state is changed
- A new master node is elected
However, there could be situations when a persistent
task that could not be assigned to a node could become
assignable due to some other change, such as memory
usage on the nodes.
This change adds a timed recheck of persistent task
assignment to account for such situations. The timer
is suspended while checks triggered by cluster state
changes are in-flight to avoid adding burden to an
already busy cluster.
Closes#35792
The new limit on the number of open shards in a cluster may be
interpreted by users as a sizing recommendation, but it is not. This
clarifies in the documentation that this is a safety limit, not a
recommendation.
This removes the option to run a cluster without enforcing the
cluster-wide shard limit, making strict enforcement the default and only
behavior. The limit can still be adjusted as desired using the cluster
settings API.
In a future major version, we will be introducing a soft limit on the
number of shards in a cluster based on the number of nodes in the
cluster. This limit will be configurable, and checked on operations
which create or open shards and issue a warning if the operation would
take the cluster over the limit.
There is an option to enable strict enforcement of the limit, which
turns the warnings into errors. In a future release, the option will be
removed and strict enforcement will be the default (and only) behavior.
As user-defined cluster metadata is accessible to anyone with access to
get the cluster settings, stored in the logs, and likely to be tracked
by monitoring solutions, it is useful to clarify in the documentation
that it should not be used to store secret information.
Adds a place for users to store cluster-wide data they wish to associate
with the cluster via the Cluster Settings API. This is strictly for
user-defined data, Elasticsearch makes no other other use of these
settings.
Control max size and count of warning headers
Add a static persistent cluster level setting
"http.max_warning_header_count" to control the maximum number of
warning headers in client HTTP responses.
Defaults to unbounded.
Add a static persistent cluster level setting
"http.max_warning_header_size" to control the maximum total size of
warning headers in client HTTP responses.
Defaults to unbounded.
With every warning header that exceeds these limits,
a message will be logged in the main ES log,
and any more warning headers for this response will be
ignored.
This commit adds a new setting `cluster.persistent_tasks.allocation.enable`
that can be used to enable or disable the allocation of persistent tasks.
The setting accepts the values `all` (default) or `none`. When set to
none, the persistent tasks that are created (or that must be reassigned)
won't be assigned to a node but will reside in the cluster state with
a no "executor node" and a reason describing why it is not assigned:
```
"assignment" : {
"executor_node" : null,
"explanation" : "persistent task [foo/bar] cannot be assigned [no
persistent task assignments are allowed due to cluster settings]"
}
```
Update allocation awareness docs
Today, the docs imply that if multiple attributes are specified the the
whole combination of values is considered as a single entity when
performing allocation. In fact, each attribute is considered separately. This
change fixes this discrepancy.
It also replaces the use of the term "awareness zone" with "zone or domain", and
reformats some paragraphs to the right width.
Fixes#29105
Cluster settings shouldn't leak into the next test.
I played with failing the test if it left over any settings but that
felt like it added more ceremony then it was worth. The advantage is
that any test that intentionally wants to leave settings in place after
the test would fail and require looking at but, so far as I can tell, we
don't have any such tests.
Clear the disk watermark after the snippet showing users how to set it.
Without this our tests will fail if the disks have less than 10GB free.
Closes#28325