Today if an index is set to `auto_expand_replicas: N-all` then we will
try and create a shard copy on every node that matches the applicable
allocation filters. This conflits with shard allocation awareness and
the same-host allocation decider if there is an uneven distribution of
nodes across zones or hosts, since these deciders prevent shard copies
from being allocated unevenly and may therefore leave some unassigned
shards.
The point of these two deciders is to improve resilience given a limited
number of shard copies but there is no need for this behaviour when the
number of shard copies is not limited, so this commit supresses them in
that case.
Closes#54151Closes#2869
This commit adds the `data_frozen` node role as part of the formalization of data tiers. It also
adds the `"frozen"` phase to ILM, currently allowing the same actions as the existing cold phase.
The frozen phase is intended to be used for data even less frequently searched than the cold phase,
and will eventually be loosely tied to data using partial searchable snapshots (as oppposed to full
searchable snapshots in the cold phase).
Relates to #60848
This commit sets the recovery rate for dedicated cold nodes. The goal is
here is enhance performance of recovery in a dedicated cold tier, where
we expect such nodes to be predominantly using searchable snapshots to
back the indices located on them. This commit follows a simple approach
where we increase the recovery rate as a function of the node size, for
nodes that appear to be dedicated cold nodes.
This commit removes the following deprecated settings in v8:
- `gateway.expected_nodes`
- `gateway.expected_master_nodes`
- `gateway.recover_after_nodes`
- `gateway.recover_after_master_nodes`
Co-authored-by: ShawnLi1014 <shawnli1014@gmail.com>
* [DOCS] Minor rewording for HTTP settings.
* Revert "[DOCS] Minor rewording for HTTP settings."
This reverts commit 9a831adca6.
* Adds advanced wording to HTTP & transport settings.
Today's network config docs are split into "Network", "HTTP" and
"Transport" pages, with unclear relationships between them. We often
encounter users with weird configs that indicate they don't really
understand how these settings all relate. In fact these pages are all
very interrelated, and the HTTP and Transport pages are almost all only
for advanced users. This commit brings these docs into a single page and
rewords some things to try and guide users away from the advanced
settings unless their configuration needs all the extra complexity.
It also adds a section entitled "Binding and publishing" which clarifies
the meanings of the `bind_host` and `publish_host` parameters. This is
also a common source of confusion amongst users.
It also clarifies that many of these settings accept a list of
addresses, and warns that this may not be what you want. Closes#67956.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Today the discovery phase has a short 1-second timeout for handshaking
with a remote node after connecting, which allows it to quickly move on
and retry in the case of connecting to something that doesn't respond
straight away (e.g. it isn't an Elasticsearch node).
This short timeout was necessary when the component was first developed
because each connection attempt would block a thread. Since #42636 the
connection attempt is now nonblocking so we can apply a more relaxed
timeout.
If transport security is enabled then our handshake timeout applies to
the TLS handshake followed by the Elasticsearch handshake. If the TLS
handshake alone takes over a second then the whole handshake times out
with a `ConnectTransportException`, but this does not tell us which of
the two individual handshakes took so long.
TLS handshakes have their own 10-second timeout, which if reached yields
a `SslHandshakeTimeoutException` that allows us to distinguish a problem
at the TLS level from one at the Elasticsearch level. Therefore this
commit extends the discovery probe timeouts.
limit the depth of nested bool queries
Introduce a new node level setting `indices.query.bool.max_nested_depth`
that controls the depth of nested bool queries.
Throw an error if a nested depth of a bool query exceeds the maximum
allowed nested depth.
Closes#55303
This makes sure that we only serve a hit from the request cache if it
was build using the same mapping and that the same mapping is used for
the entire "query phase" of the search.
Closes#62033
Today we document the use of `_[networkInterface]_` to specify the
addresses of a network interface but do not spell out which parts of
this syntax should be taken literally and which are part of the
placeholder for the interface name. If you get it wrong then the
exception message is confusing too since it uses the results of
`NetworkInterface#toString()` which contains much more than just the
name of the interface.
This commit clarifies the docs and the exception message.
Closes#65978.
Today we document the settings used to control rebalancing and
disk-based shard allocation but there isn't really any discussion around
what these processes do so it's hard to know what, if any, adjustments
to make.
This commit adds some words to help folk understand this area better.
* Clarify that field data cache includes global ordinals
* Describe that the cache should be cleared once the limit is reached
* Clarify that the `_id` field does not supported aggregations anymore
* Fold the `fielddata` mapping parameter page into the `text field docs
* Improve cross-linking
Clarifies differences between the
`cluster.routing.allocation.total_shards_per_node` and
`cluster.max_shards_per_node` cluster settings.
Closes#51839
Co-authored-by: Gordon Brown <arcsech@gmail.com>
I'm not sure if this setting was left here deliberately? or by accident?
With all other node role definition has changed syntax from `node.xxx` to `node.roles: [ ]`, the ingest one is the only one left behind.
This adds general overview documentation for data tiers,
the data tiers specific node roles, and their application in
ILM.
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: debadair <debadair@elastic.co>
This commit adds a dedicated threadpool for system index write
operations. The dedicated resources for system index writes serves as
a means to ensure that user activity does not block important system
operations from occurring such as the management of users and roles.
* updated shard limit doc
As the documentation was not so clear. I have updated saying this limit includes open indices with unassigned primaries and replicas count towards the limit.
* [DOCS] Incorporated edits.
Co-authored-by: Deb Adair <debadair@elastic.co>
No-op changes to:
* Move `Search your data` source files into the same directory
* Rename `Search your data` source files based on page ID
* Remove unneeded includes
* Remove the `Request` dir
* First crack at rewriting the CCR introduction.
* Emphasizing Kibana in configuring CCR (part one).
* Many more edits, plus new files.
* Fixing test case.
* Removing overview page and consolidating that information in the main page.
* Adding redirects for moved and deleted pages.
* Removing, consolidating, and adding redirects.
* Fixing duplicate ID in redirects and removing outdated reference.
* Adding test case and steps for recreating a follower index.
* Adding steps for managing CCR tasks in Kibana.
* Adding tasks for managing auto-follow patterns.
* Fixing glossary link.
* Fixing glossary link, again.
* Updating the upgrade information and other stuff.
* Apply suggestions from code review
* Incorporating review feedback.
* Adding more edits.
* Fixing link reference.
* Adding use cases for #59812.
* Incorporating feedback from reviewers.
* Apply suggestions from code review
* Incorporating more review comments.
* Condensing some of the steps for accessing Kibana.
* Incorporating small changes from reviewers.
Co-authored-by: debadair <debadair@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Followup to #60216, fixing the formatting of
`transport.tcp.reuse_address` and clarifying some wording around the
distinction between the transport and HTTP layers.