This commit adds support for floating point node.processors setting.
This is useful when the nodes run in an environment where the CPU
time assigned to the ES node process is limited (i.e. using cgroups).
With this change, the system would be able to size the thread pools
accordingly, in this case it would round up the provided setting
to the closest integer.
The docs for `transport.ping_schedule` note that the transport client
defaults to a 5s ping schedule, but this is no longer relevant. This
commit drops this from the docs, and also moves the docs for this
setting further down the page to reflect its relative unimportance.
Today we say that voting-only nodes require a "low-latency" network.
This term has a specific meaning in some operating environments which is
different from our intended meaning. To avoid this confusion this commit
removes the absolute term "low-latency" in favour of describing the
requirements relative to the user's own performance goals.
Clean up network setting docs
- Add types for all params
- Remove mention of JDKs before 11
- Clarify some wording
Co-authored-by: Stef Nestor <steffanie.nestor@gmail.com>
This change ensures that existing read_only_allow_delete blocks that
are placed on indices when the flood_stage watermark threshold is
exceeded, are removed when the disk threshold monitoring is disabled.
This is done by changing how InternalClusterInfoService behaves when
disabled. With this change, it will keep calling the registered
listeners periodically, but with an empty ClusterInfo.
Closes#86383
Our current default for the http.max_header_size setting is 8kb. This
is lower than the current default for Kibana (16kb in 8.x), and the ESS
proxy (1mb based on the Go http library default). To align with the
current convention of other Elastic components, this PR increases the
ES header size setting default to 16kb.
Closes#88501
* Convert disk watermarks to RelativeByteSizeValues
Similar to the existing watermark setting for the frozen tier.
Pre-requisite for PR 88639 that plans to introduce max headroom
settings for the disk watermarks, similar to the frozen tier max
headroom setting.
* Add changelog
* Revert 20gb to 20GB
* Make formatNoTrailingZerosPercent non static
* ByteSizeValue.MINUS_ONE
* Remove getMinimumTotalSizeForBelowWatermark
* Remove comment
* Fix minor stuff
* Make parsing of RelativeByteSizeValue faster
Mimicks older definitelyNotPercentage function
* Remove Locale from Strings.format
* More MINUS_ONE
* Adding discovery troubleshooting link
* Add tags to pull in discovery troubleshooting content
* Move discovery troubleshooting to separate page and add redirects
Co-authored-by: Adam Locke <adam.locke@elastic.co>
In #85074 we added docs on discovery troubleshooting that really only
talked about troubleshooting master elections. There's also the case
where the master is elected fine but some other node can't join it. This
commit adds troubleshooting docs about that too.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Fixes a few scalability issues around join validation:
- compresses the cluster state sent over the wire
- shares the serialized cluster state across multiple nodes
- forks the decompression/deserialization work off the transport thread
Relates #77466Closes#83204
Ensures that on every page of the docs that mentions
`cluster.initial_master_nodes` also mentions that this setting must be
removed after bootstrapping completes.
Today it's no longer true that by default nodes will auto-discover other
nodes on the same host and bootstrap them all into a cluster. This
commit fixes the docs on auto-bootstrapping to recognise this.
Today we don't really say anything about the requirements for the data
path in terms of correctness, and we specifically say to avoid NFS for
performance reasons. This isn't wholly accurate: some NFS
implementations work just fine. This commit documents a more balanced
position on local vs remote storage.
This moves the bulk of the upgrade information into the consolidated upgrade guide, but leaves the primary upgrade topic in place as a cross reference.
Relates to: https://github.com/elastic/stack-docs/pull/1970
Co-authored-by: gchaps <33642766+gchaps@users.noreply.github.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
(cherry picked from commit f6473d71f9)
Co-authored-by: debadair <debadair@elastic.co>
This commit updates the Operator-only functionality doc to
mention the operator only settings introduced in #82819.
It also adds an integration test for those operator only
settings that would have caught #83359.
As of 8.0, the compatibility window for cross-cluster search (CCS) to an earlier release will be one minor release. This updates the CCS docs and adds a related 8.0 breaking change.
Closes https://github.com/elastic/elasticsearch/issues/80782
* Adds a prerequisites section covering remote cluster config, node roles, and security.
* Moves existing content about remote cluster config to the prereqs.
* Updates the remote cluster docs to include information about eligible gateway nodes and tagging for gateway nodes.
Closes https://github.com/elastic/elasticsearch/issues/72001
Updates the remote clusters version compatibility table to include 7.17 and 8.x versions.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today the same-shard allocation decider falls back to checking the
hostname if the node has no host address. In practice nodes will always
have an address so the fallback is dead code. This commit removes that
dead code.
Relates #80702 which will add the ability to distinguish nodes by
hostname regardless of whether they have an address or not, and #80767
which optimizes this area of code - this refactoring should make the
optimization simpler.
Today we increase the verbosity of discovery failures after 5 minutes
without a master. Unfortunately 5 minutes is a common orchestration
timeout, so if discovery is broken then we see nodes being shut down
just before they start to emit useful logs. This commit reduces the
default timeout to 3 minutes to address that.
We have a few leftover mentions of `zen` discovery, mostly for
historical/BwC reasons, which this commit removes.
Prior to this commit the default value for `discovery.type` was `zen`
but this was not written down anywhere or officially supported: the two
options were to set it to `single-node` or to omit it entirely. This
commit changes the default to `multi-node` and documents this.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Today we have a short note in one place in the docs saying not to touch
the contents of the data path. This commit expands the warning to
describe more precisely what is forbidden, and to give some more detail
of the consequences, and also duplicates the warning to the other
location that documents the `path.data` setting.
Deprecate the script context cache in favor of the general cache.
Users should use the following settings:
`script.max_compilations_rate` to set the max compilation rate
for user scripts such as filter scripts. Certain script contexts
that submit scripts outside of the control of the user are
exempted from this rate limit. Examples include runtime fields,
ingest and watcher.
`script.cache.max_size` to set the max size of the cache.
`script.cache.expire` to set the expiration time for entries in
the cache.
Whats deprecated?
`script.max_compilations_rate: use-context`. This special
setting value was used to turn on the script context-specific caches.
`script.context.$CONTEXT.cache_max_size`, use `script.cache.max_size`
instead.
`script.context.$CONTEXT.cache_expire`, use `script.cache.expire`
instead.
`script.context.$CONTEXT.max_compilations_rate`, use
`script.max_compilations_rate` instead.
The default cache size was increased from `100` to `3000`, which
was approximately the max cache size when using context-specific caches.
The default compilation rate limit was increased from `75/5m` to
`150/5m` to account for increasing uses of scripts.
System script contexts can now opt-out of compilation rate limiting
using a flag rather than a sentinel rate limit value.
7.16: Script: Deprecate script context cache #79508
Refs: #62899
7.16: Script: Opt-out system contexts from script compilation rate limit #79459
Refs: #62899
Today we limit the max number of concurrent snapshot file restores
per recovery. This works well when the default
node_concurrent_recoveries is used (which is 2). When this limit is
increased, it is possible to exhaust the underlying repository
connection pool, affecting other workloads.
This commit adds a new setting
`indices.recovery.max_concurrent_snapshot_file_downloads_per_node` that
allows to limit the max number of snapshot file downloads per node
during recoveries. When a recovery starts in the target node it tries
to acquire a permit that allows it to download snapshot files when it is
granted. This is communicated to the source node in the
StartRecoveryRequest. This is a rather conservative approach since it is
possible that a recovery that gets a permit to use snapshot files
doesn't recover any snapshot file while there's a concurrent recovery
that doesn't get a permit could take advantage of recovering from a
snapshot.
Closes#79044
Changes can-match from a shard-level to a node-level action, which helps avoid an explosion of shard-level can-match
subrequests in clusters with many shards, that can cause stability issues. Also introduces a new search_coordination
thread pool to handle the sending and handling of node-level can-match requests.
This PR changes uses of transient cluster settings to
persistent cluster settings.
The PR also deprecates the transient settings usage.
Relates to #49540
* A typo error
a space between 'E' and 'cluster...'
* Update example, fix headings, change notes
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Co-authored-by: Marwane Chahoud <marwane.chahoud@gmail.com>
* [DOCS] Fix default value for closed indices
#57953 introduced changes that added ESS icons to many Elasticsearch settings. As part of those changes, the default value for `cluster.indices.close.enable` was indicated as `false`, when it should be `true`. This PR updates the default value to `true`.
Closes#78877
* Update description
* Update note to remove outdated claims
The documentation indicates that `stack.templates.enabled` can be used in Elasticsearch Service, but it is not part of the settings allowlist in ESS. This PR makes the documentation match the state of the allowlist.
* Improve docs for pre-release version compatibility
Follow-up to #78317 clarifying a couple of points:
- a pre-release build can restore snapshots from released builds
- compatibility applies if at least one of the local or remote cluster
is a released build
* Remote cluster build date nit
The reference manual includes docs on version compatibility in various
places, but it's not clear that these docs only apply to released
versions and that the rules for pre-release versions are stricter than
folks expect. This commit adds some words to the docs for unreleased
versions which explains this subtlety.
* [DOCS] Update remote cluster docs
* Add files, rename files, write new stuff
* Plethora of changes
* Add test and update snippets
* Redirects, moved files, and test updates
* Moved file to x-pack for tests
* Remove older CCS page and add redirects
* Cleanup, link updates, and some rewrites
* Update image
* Incorporating user feedback and rewriting much of the remote clusters page
* More changes from review feedback
* Numerous updates, including request examples for CCS and Kibana
* More changes from review feedback
* Minor clarifications on security for remote clusters
* Incorporate review feedback
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Some review feedback and some editorial changes
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
We currently use the plaintext body of a shard request as the key to the
request cache. This has the disadvantage that very large requests can
quickly fill up the cache due to the size of their keys. With this commit,
we instead use a sha-256 hash of the shard request as the cache key,
which will use a constant (and much smaller) number of bytes.