* Update search-settings documentation to reflect that the `indices.query.bool.max_clause_count` setting has been deprecated
* Fix indentation
* Replace Elasticsearch with {es}
* Add deprecation entry to release notes
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
Currently the documentation on network threading suggests that we still
use a model where we have individual workers dedicated to server
sockets. That is no longer true and server sockets are assigned to
normal workers. This commit updates the documentation.
* Update threadpool.asciidoc
Starting from 8.0, the value of the `node.processors` setting is bounded by the number of available
processors. See https://github.com/elastic/elasticsearch/pull/44894
* Update docs/reference/modules/threadpool.asciidoc
Co-authored-by: Adam Locke <adam.locke@elastic.co>
When parsing queries on the coordinating node, there is currently no way to share state between the different parsing methods (`fromXContent`). The only query that supports a parse context is bool query, which uses the context to track the nested depth of queries, added with #66204. This nested depth tracking mechanism is not fully accurate, as it tracks bool queries only, while many more query types can hold other queries and hence potentially cause a stack overflow when deeply nested.
This change removes the parsing context that's specific to bool query, introduced with #66204, in favour of generalizing the nested depth tracking to all query types.
The generic tracking is introduced by wrapping the parser and overriding the method that parses named objects through the xcontent registry. Another way would have been to require a context argument when parsing queries, which would mean adding a context argument to all the QueryBuilder#fromXContent static methods. That would be a breaking change for plugins that provide custom queries, hence I went for trying out a different approach.
One aspect that this change requires and introduces is the distinction between parsing a top level query (which will wrap the parser, or it would create the context if we had one), as opposed to parsing an inner query, which goes ahead with the given parser and context. We already have this distinction, as we have two different static methods in `AbstractQueryBuilder`, but in practice only bool query makes the distinction, being the only context-aware query.
In addition to generalizing the tracking of nested depth when parsing queries, we should be able to adopt this same strategy to track query usage as part of #90176.
Given that the depth check is now more restrictive, as it counts all compound queries and not only bool, we have decided to raise the default limit to `30` to ensure that users are not going to hit the limit due to this change.
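The generalized depth tracking described above can be sketched roughly as follows. This is an illustrative Python model, not the Java implementation in this change: `DepthTrackingParser`, `parse_top_level_query`, `INNER_QUERY_FIELDS` and the exact inclusive/exclusive behaviour of the limit are all assumptions made for the example. The top-level entry point creates the wrapper, and inner queries reuse it, so every compound query type contributes to the depth count.

```python
MAX_NESTED_DEPTH = 30  # default limit mentioned in the commit message

# Hypothetical registry: which fields of a compound query hold inner queries.
INNER_QUERY_FIELDS = {
    "bool": ("must", "should", "filter", "must_not"),
    "constant_score": ("filter",),
    "dis_max": ("queries",),
}


class TooDeeplyNestedError(ValueError):
    pass


class DepthTrackingParser:
    """Hypothetical wrapper that counts depth for every query it parses."""

    def __init__(self, max_depth=MAX_NESTED_DEPTH):
        self.max_depth = max_depth
        self.depth = 0
        self.deepest = 0

    def parse_inner_query(self, query):
        # Each query is modelled as a single-key dict: {name: body}.
        self.depth += 1
        try:
            if self.depth > self.max_depth:
                raise TooDeeplyNestedError(
                    f"query nested deeper than {self.max_depth} levels"
                )
            self.deepest = max(self.deepest, self.depth)
            (name, body), = query.items()
            for field in INNER_QUERY_FIELDS.get(name, ()):
                children = body.get(field, [])
                if isinstance(children, dict):
                    children = [children]
                for child in children:
                    self.parse_inner_query(child)
            return name
        finally:
            # Always unwind, so sibling queries are measured correctly.
            self.depth -= 1


def parse_top_level_query(query, max_depth=MAX_NESTED_DEPTH):
    # Only the top-level entry point creates the tracking wrapper.
    parser = DepthTrackingParser(max_depth)
    parser.parse_inner_query(query)
    return parser.deepest
```

Because the counter lives in the wrapper rather than in a bool-specific context, a `constant_score` nested inside a `dis_max` is counted the same way as nested bool queries.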
Adds to the docs a note that the `100mb` default for
`http.max_content_length` is the recommended maximum, along with
suggestions for what to do when hitting this limit.
Introduce max headroom settings for the low, high, and flood disk watermark stages, similar to the existing max headroom setting for the flood stage of the frozen tier. Introduce the new max headrooms in HealthMetadata and in ReactiveStorageDeciderService. Add multiple tests in DiskThresholdDeciderUnitTests, DiskThresholdDeciderTests and DiskThresholdMonitorTests. Also add addition, subtraction, and min operations to ByteSizeValue.
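As a rough illustration of what a max headroom does, the sketch below caps the free space required by a percentage watermark at an absolute value. The function name, its parameters and the exact formula are assumptions made for this example, not the code added by this change.

```python
GB = 1024 ** 3


def required_free_bytes(total_bytes, watermark_percent, max_headroom_bytes=None):
    """Free space a disk must keep to stay below a percentage watermark.

    Hypothetical helper: a watermark of 85 means the disk may be at most
    85% full, so 15% must stay free, unless a max headroom caps that amount.
    """
    free_by_percent = total_bytes * (100 - watermark_percent) // 100
    if max_headroom_bytes is None:
        return free_by_percent
    # On large disks the percentage wastes a lot of space; the headroom caps it.
    return min(free_by_percent, max_headroom_bytes)


# Small disk: the percentage still decides (15% of 100 GB).
small = required_free_bytes(100 * GB, 85, max_headroom_bytes=200 * GB)
# Large disk: 15% of 10 TB would be 1536 GB, so the 200 GB headroom wins.
large = required_free_bytes(10 * 1024 * GB, 85, max_headroom_bytes=200 * GB)
```

The `min` here is the same kind of operation the commit adds to ByteSizeValue, alongside addition and subtraction.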
This commit adds support for a floating-point `node.processors` setting.
This is useful when the nodes run in an environment where the CPU
time assigned to the ES node process is limited (e.g. using cgroups).
With this change, the system can size the thread pools
accordingly; in this case it rounds the provided setting up
to the nearest integer.
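The rounding behaviour can be illustrated with a short sketch. The function name is hypothetical and the validation is an assumption for the example; only the round-up step comes from the description above.

```python
import math


def processors_for_thread_pools(node_processors):
    """Round a fractional node.processors value up to a whole processor count.

    Hypothetical helper, not the Elasticsearch implementation.
    """
    if node_processors <= 0:
        raise ValueError("node.processors must be positive")
    return math.ceil(node_processors)


# A container limited to 1.5 CPUs sizes its thread pools as if it had 2.
pools_for_one_and_a_half = processors_for_thread_pools(1.5)
```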
The docs for `transport.ping_schedule` note that the transport client
defaults to a 5s ping schedule, but this is no longer relevant. This
commit drops this from the docs, and also moves the docs for this
setting further down the page to reflect its relative unimportance.
Today we say that voting-only nodes require a "low-latency" network.
This term has a specific meaning in some operating environments which is
different from our intended meaning. To avoid this confusion this commit
removes the absolute term "low-latency" in favour of describing the
requirements relative to the user's own performance goals.
Clean up network setting docs
- Add types for all params
- Remove mention of JDKs before 11
- Clarify some wording
Co-authored-by: Stef Nestor <steffanie.nestor@gmail.com>
This change ensures that existing read_only_allow_delete blocks, which
are placed on indices when the flood_stage watermark threshold is
exceeded, are removed when disk threshold monitoring is disabled.
This is done by changing how InternalClusterInfoService behaves when
disabled. With this change, it will keep calling the registered
listeners periodically, but with an empty ClusterInfo.
Closes #86383
Our current default for the http.max_header_size setting is 8kb. This
is lower than the current default for Kibana (16kb in 8.x), and the ESS
proxy (1mb based on the Go http library default). To align with the
current convention of other Elastic components, this PR increases the
ES header size setting default to 16kb.
Closes #88501
* Convert disk watermarks to RelativeByteSizeValues
Similar to the existing watermark setting for the frozen tier.
Pre-requisite for PR 88639 that plans to introduce max headroom
settings for the disk watermarks, similar to the frozen tier max
headroom setting.
* Add changelog
* Revert 20gb to 20GB
* Make formatNoTrailingZerosPercent non static
* ByteSizeValue.MINUS_ONE
* Remove getMinimumTotalSizeForBelowWatermark
* Remove comment
* Fix minor stuff
* Make parsing of RelativeByteSizeValue faster
Mimics older definitelyNotPercentage function
* Remove Locale from Strings.format
* More MINUS_ONE
* Adding discovery troubleshooting link
* Add tags to pull in discovery troubleshooting content
* Move discovery troubleshooting to separate page and add redirects
Co-authored-by: Adam Locke <adam.locke@elastic.co>
In #85074 we added docs on discovery troubleshooting that really only
talked about troubleshooting master elections. There's also the case
where the master is elected fine but some other node can't join it. This
commit adds troubleshooting docs about that too.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Fixes a few scalability issues around join validation:
- compresses the cluster state sent over the wire
- shares the serialized cluster state across multiple nodes
- forks the decompression/deserialization work off the transport thread
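The first two points can be sketched as follows, using JSON and zlib as stand-ins for the real wire format. The class and function names, the version-keyed cache and the serialization format are assumptions made for this example; forking the work off the transport thread is not shown.

```python
import json
import zlib


class JoinValidationPayloadCache:
    """Hypothetical cache: serialize and compress a cluster state once,
    then share the same bytes with every joining node."""

    def __init__(self):
        self._version = None
        self._payload = None
        self.serializations = 0  # counts how often we actually serialized

    def get(self, cluster_state):
        version = cluster_state["version"]
        if version != self._version:
            raw = json.dumps(cluster_state, sort_keys=True).encode("utf-8")
            self._payload = zlib.compress(raw)
            self._version = version
            self.serializations += 1
        return self._payload


def read_join_validation_payload(payload):
    # On the joining node: decompress, then deserialize.
    return json.loads(zlib.decompress(payload).decode("utf-8"))
```

With this shape, the cost of serialization is paid once per cluster state version rather than once per joining node.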
Relates #77466
Closes #83204
Ensures that every page of the docs that mentions
`cluster.initial_master_nodes` also mentions that this setting must be
removed after bootstrapping completes.
Today it's no longer true that by default nodes will auto-discover other
nodes on the same host and bootstrap them all into a cluster. This
commit fixes the docs on auto-bootstrapping to recognise this.
Today we don't really say anything about the requirements for the data
path in terms of correctness, and we specifically say to avoid NFS for
performance reasons. This isn't wholly accurate: some NFS
implementations work just fine. This commit documents a more balanced
position on local vs remote storage.
This moves the bulk of the upgrade information into the consolidated upgrade guide, but leaves the primary upgrade topic in place as a cross reference.
Relates to: https://github.com/elastic/stack-docs/pull/1970
Co-authored-by: gchaps <33642766+gchaps@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
(cherry picked from commit f6473d71f9)
Co-authored-by: debadair <debadair@elastic.co>
This commit updates the Operator-only functionality doc to
mention the operator only settings introduced in #82819.
It also adds an integration test for those operator only
settings that would have caught #83359.
As of 8.0, the compatibility window for cross-cluster search (CCS) to an earlier release will be one minor release. This updates the CCS docs and adds a related 8.0 breaking change.
Closes https://github.com/elastic/elasticsearch/issues/80782
* Adds a prerequisites section covering remote cluster config, node roles, and security.
* Moves existing content about remote cluster config to the prereqs.
* Updates the remote cluster docs to include information about eligible gateway nodes and tagging for gateway nodes.
Closes https://github.com/elastic/elasticsearch/issues/72001
Updates the remote clusters version compatibility table to include 7.17 and 8.x versions.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today the same-shard allocation decider falls back to checking the
hostname if the node has no host address. In practice nodes will always
have an address so the fallback is dead code. This commit removes that
dead code.
Relates #80702 which will add the ability to distinguish nodes by
hostname regardless of whether they have an address or not, and #80767
which optimizes this area of code - this refactoring should make the
optimization simpler.
Today we increase the verbosity of discovery failures after 5 minutes
without a master. Unfortunately 5 minutes is a common orchestration
timeout, so if discovery is broken then we see nodes being shut down
just before they start to emit useful logs. This commit reduces the
default timeout to 3 minutes to address that.
We have a few leftover mentions of `zen` discovery, mostly for
historical/BwC reasons, which this commit removes.
Prior to this commit the default value for `discovery.type` was `zen`
but this was not written down anywhere or officially supported: the two
options were to set it to `single-node` or to omit it entirely. This
commit changes the default to `multi-node` and documents this.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Today we have a short note in one place in the docs saying not to touch
the contents of the data path. This commit expands the warning to
describe more precisely what is forbidden, and to give some more detail
of the consequences, and also duplicates the warning to the other
location that documents the `path.data` setting.
Deprecate the script context cache in favor of the general cache.
Users should use the following settings:
* `script.max_compilations_rate` to set the max compilation rate
  for user scripts such as filter scripts. Certain script contexts
  that submit scripts outside of the control of the user are
  exempted from this rate limit. Examples include runtime fields,
  ingest and watcher.
* `script.cache.max_size` to set the max size of the cache.
* `script.cache.expire` to set the expiration time for entries in
  the cache.
What's deprecated?
* `script.max_compilations_rate: use-context`. This special
  setting value was used to turn on the script context-specific caches.
* `script.context.$CONTEXT.cache_max_size`, use `script.cache.max_size`
  instead.
* `script.context.$CONTEXT.cache_expire`, use `script.cache.expire`
  instead.
* `script.context.$CONTEXT.max_compilations_rate`, use
  `script.max_compilations_rate` instead.
The default cache size was increased from `100` to `3000`, which
was approximately the max cache size when using context-specific caches.
The default compilation rate limit was increased from `75/5m` to
`150/5m` to account for increasing uses of scripts.
System script contexts can now opt-out of compilation rate limiting
using a flag rather than a sentinel rate limit value.
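For anyone auditing existing configuration, the mapping from deprecated context-specific settings to their general replacements can be expressed as a small lookup. This helper is a hypothetical illustration built only from the setting names listed above; it is not part of Elasticsearch.

```python
import re

# Matches script.context.$CONTEXT.<suffix> for the three deprecated suffixes.
_CONTEXT_SETTING = re.compile(
    r"^script\.context\.[^.]+\.(cache_max_size|cache_expire|max_compilations_rate)$"
)

_REPLACEMENTS = {
    "cache_max_size": "script.cache.max_size",
    "cache_expire": "script.cache.expire",
    "max_compilations_rate": "script.max_compilations_rate",
}


def replacement_for(setting_name):
    """Return the general setting that replaces a deprecated context-specific
    one, or None if the name is not a deprecated context setting."""
    match = _CONTEXT_SETTING.match(setting_name)
    if match is None:
        return None
    return _REPLACEMENTS[match.group(1)]
```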
7.16: Script: Deprecate script context cache #79508
Refs: #62899
7.16: Script: Opt-out system contexts from script compilation rate limit #79459
Refs: #62899
Today we limit the max number of concurrent snapshot file restores
per recovery. This works well when the default
node_concurrent_recoveries is used (which is 2). When this limit is
increased, it is possible to exhaust the underlying repository
connection pool, affecting other workloads.
This commit adds a new setting,
`indices.recovery.max_concurrent_snapshot_file_downloads_per_node`, that
limits the maximum number of snapshot file downloads per node
during recoveries. When a recovery starts on the target node, it tries
to acquire a permit; if the permit is granted, the recovery may download
snapshot files. The outcome is communicated to the source node in the
StartRecoveryRequest. This is a rather conservative approach, since a
recovery that gets a permit might not recover any snapshot file, while a
concurrent recovery that didn't get a permit could have taken advantage
of recovering from a snapshot.
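The permit mechanism can be sketched with a non-blocking semaphore. The class, the function and the `can_download_snapshot_files` field name are hypothetical and chosen for this example; the real StartRecoveryRequest may carry the information differently.

```python
import threading


class SnapshotDownloadPermits:
    """Hypothetical per-node limiter for recoveries that use snapshot files."""

    def __init__(self, max_concurrent_downloads):
        self._permits = threading.BoundedSemaphore(max_concurrent_downloads)

    def try_acquire(self):
        # Never block: a recovery without a permit simply proceeds
        # without downloading snapshot files.
        return self._permits.acquire(blocking=False)

    def release(self):
        self._permits.release()


def build_start_recovery_request(permits):
    # The target node tells the source whether it holds a permit.
    return {"can_download_snapshot_files": permits.try_acquire()}
```

A recovery that obtained a permit is expected to call `release()` when it finishes, so that later recoveries on the same node can use snapshot files.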
Closes #79044
Changes can-match from a shard-level to a node-level action, which helps avoid an explosion of shard-level can-match
subrequests in clusters with many shards, which can cause stability issues. Also introduces a new search_coordination
thread pool to handle the sending and handling of node-level can-match requests.
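The core idea, one request per node instead of one per shard, amounts to grouping the target shards by the node that holds them. The sketch below uses made-up shard and node identifiers and a hypothetical function name.

```python
from collections import defaultdict


def group_shards_by_node(shard_to_node):
    """Group shard identifiers by node so that each node receives a single
    can-match request covering all of its shards."""
    shards_by_node = defaultdict(list)
    for shard, node in shard_to_node.items():
        shards_by_node[node].append(shard)
    return dict(shards_by_node)


requests = group_shards_by_node(
    {"logs[0]": "node-a", "logs[1]": "node-b", "logs[2]": "node-a"}
)
```

The number of subrequests then scales with the number of nodes rather than with the number of shards.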
This PR changes uses of transient cluster settings to
persistent cluster settings.
The PR also deprecates the use of transient settings.
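For callers that still send transient settings, the migration amounts to moving those entries under `persistent` in the cluster settings update body. The helper below is a hypothetical sketch; letting the transient value win on conflict mirrors the precedence of transient over persistent settings.

```python
def to_persistent_settings(body):
    """Rewrite a cluster settings update body so that it only uses
    persistent settings. Hypothetical helper for illustration."""
    persistent = dict(body.get("persistent", {}))
    # Transient values take precedence, so they overwrite persistent ones.
    persistent.update(body.get("transient", {}))
    return {"persistent": persistent}


migrated = to_persistent_settings(
    {"transient": {"cluster.routing.allocation.enable": "none"}}
)
```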
Relates to #49540