844 commits

Author SHA1 Message Date
William Brafford
e72947e131
Add threadpool for critical operations on system indices (#72625) (#73731)
* Add new thread pool for critical operations
* Split critical thread pool into read and write
* Add POJO to hold thread pool names
* Add tests for critical thread pools
* Add thread pools to data streams
* Update settings for security plugin
* Retrieve ExecutorSelector from SystemIndices where possible
* Use a singleton ExecutorSelector
2021-06-03 14:16:02 -04:00
James Rodewig
69dc48a688
[DOCS] Note circuit breakers reject requests with 429 HTTP status code (#69864) (#73674)
We mention Elasticsearch returns 429 if the circuit breaker trips in https://www.elastic.co/blog/improving-node-resiliency-with-the-real-memory-circuit-breaker, but there is no mention in the docs.

This adds an xref to circuit breaker errors section.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>
2021-06-02 10:41:47 -04:00
David Turner
a63ff44665 Increase PeerFinder verbosity on persistent failure (#73128)
If a node is partitioned away from the rest of the cluster then the
`ClusterFormationFailureHelper` periodically reports that it cannot
discover the expected collection of nodes, but does not indicate why. To
prove it's a connectivity problem, users must today restart the node
with `DEBUG` logging on `org.elasticsearch.discovery.PeerFinder` to see
further details.

With this commit we log messages at `WARN` level if the node remains
disconnected for longer than a configurable timeout, which defaults to 5
minutes.

Relates #72968
2021-05-17 10:55:09 +01:00
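The escalation described in the PeerFinder commit above can be pictured with a small sketch. This is not the `PeerFinder` code: the function name and constant are hypothetical, and only the 5-minute default and the `DEBUG`/`WARN` levels come from the commit message.

```python
from datetime import timedelta

# Default taken from the commit message; the constant name is hypothetical.
WARNING_TIMEOUT = timedelta(minutes=5)


def discovery_log_level(disconnected_for: timedelta,
                        warning_timeout: timedelta = WARNING_TIMEOUT) -> str:
    """Return the log level used for connection-failure details.

    Details stay at DEBUG until the node has been disconnected for longer
    than the timeout, after which they are promoted to WARN.
    """
    return "WARN" if disconnected_for > warning_timeout else "DEBUG"
```

A node that has been partitioned for one minute would still log details at `DEBUG`, while one partitioned for six minutes would log them at `WARN` without a restart.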
James Rodewig
47294c264b
[DOCS] Update 'shared_cache' references for searchable snapshots (#72775) (#72777) 2021-05-05 18:00:28 -04:00
Lisa Cawley
95f9fcedfb
[DOCS] Clarify remote_cluster_client is required to run ML (#72569) (#72729)
Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>
2021-05-05 08:29:22 -07:00
David Turner
cdc533355d Trivial typo: bindiing -> binding 2021-04-27 12:21:26 +01:00
Henning Andersen
d7e7c5be96
Add separate flood stage limit for frozen (#71855) (#71941)
Dedicated frozen nodes can survive less headroom than other data nodes.
This commit introduces a separate flood stage threshold for frozen as
well as an accompanying max_headroom setting that caps the amount of
free space necessary on frozen.

Relates #71844
2021-04-20 16:57:10 +02:00
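As a rough illustration of how a percentage threshold and a `max_headroom` cap might interact in the frozen flood stage commit above, here is a minimal sketch. The function name and the numbers (a 95% flood stage, a 20 GB cap) are assumptions chosen for the example, not values taken from the commit.

```python
def required_free_bytes(total_bytes: int, flood_stage_percent: int,
                        max_headroom_bytes: int) -> int:
    """Free space a frozen node must keep to stay below the flood stage.

    The percentage-based requirement is capped by max_headroom_bytes, so
    very large disks do not have to keep a huge amount of space free.
    """
    percent_based = total_bytes * (100 - flood_stage_percent) // 100
    return min(percent_based, max_headroom_bytes)


GB = 10**9
# Illustrative values only.
small_disk = required_free_bytes(100 * GB, 95, 20 * GB)   # percentage wins
large_disk = required_free_bytes(4000 * GB, 95, 20 * GB)  # cap wins
```

On the small disk the percentage applies (5 GB free), whereas on the large disk the cap limits the requirement to 20 GB instead of 200 GB.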
Henning Andersen
12314f6a13
Introduce separate shard limit for frozen shards (#71392) (#71760)
Frozen indices (partial searchable snapshots) require less heap per
shard and the limit can therefore be raised for those. We pick 3000
frozen shards per frozen data node, since we think 2000 is reasonable
to use in production.

Relates #71042 and #34021

Includes #71781 and #71777
2021-04-17 17:41:50 +02:00
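The frozen shard limit in the commit above could be checked along these lines. This sketch assumes the limit is enforced cluster-wide as the per-node limit multiplied by the number of frozen data nodes; the function names are hypothetical and only the figure of 3000 comes from the commit message.

```python
def frozen_shard_capacity(frozen_nodes: int, limit_per_node: int = 3000) -> int:
    """Total frozen shards allowed, assuming a per-node limit scaled by node count."""
    return frozen_nodes * limit_per_node


def can_add_frozen_shards(current_shards: int, new_shards: int,
                          frozen_nodes: int, limit_per_node: int = 3000) -> bool:
    """Return True if adding new_shards keeps the cluster within the limit."""
    capacity = frozen_shard_capacity(frozen_nodes, limit_per_node)
    return current_shards + new_shards <= capacity
```

With two frozen nodes, for example, this model would allow up to 6000 frozen shards in total.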
James Rodewig
8d0149e053 [DOCS] Remove unneeded articles for Elasticsearch Service and Elastic Agent 2021-04-02 16:02:47 -04:00
James Rodewig
c757f9e4e7
[DOCS] Fix double spaces (#71082) (#71120) 2021-03-31 11:43:34 -04:00
Adam Locke
c677bd0fc0
[DOCS] [7.x] Overhaul TLS security docs (#68946) (#70880)
* Removing security overview and condensing.

* Adding new security file.

* Minor changes.

* Removing link to pass build.

* Adding minimal security page.

* Adding minimal security page.

* Changes to intro.

* Add basic and basic + http configurations.

* Lots of changes, removed files, and redirects.

* Moving some AD and LDAP sections, plus more redirects.

* Redirects for SAML.

* Updating snippet languages and redirects.

* Adding another SAML redirect.

* Hopefully fixing the ci/2 error.

* Fixing another broken link for SAML.

* Adding what's next sections and some cleanup.

* Removes both security tutorials from the TOC.

* Adding redirect for removed tutorial.

* Add graphic for Elastic Security layers.

* Incorporating reviewer feedback.

* Update x-pack/docs/en/security/securing-communications/security-basic-setup.asciidoc

Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>

* Update x-pack/docs/en/security/securing-communications/security-minimal-setup.asciidoc

Co-authored-by: Yang Wang <ywangd@gmail.com>

* Update x-pack/docs/en/security/securing-communications/security-basic-setup.asciidoc

Co-authored-by: Yang Wang <ywangd@gmail.com>

* Update x-pack/docs/en/security/index.asciidoc

Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>

* Update x-pack/docs/en/security/securing-communications/security-basic-setup-https.asciidoc

Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>

* Apply suggestions from code review

Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>

* Additional changes from review feedback.

* Incorporating reviewer feedback.

* Incorporating more reviewer feedback.

* Clarify that TLS is for authenticating nodes

Co-authored-by: Tim Vernum <tim@adjective.org>

* Clarify security between nodes

Co-authored-by: Tim Vernum <tim@adjective.org>

* Clarify that TLS is between nodes

Co-authored-by: Tim Vernum <tim@adjective.org>

* Update title for configuring Kibana with a password

Co-authored-by: Tim Vernum <tim@adjective.org>

* Move section for enabling passwords between Kibana and ES to minimal security.

* Add section for transport description, plus incorporate more reviewer feedback.

* Moving operator privileges lower in the navigation.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
2021-03-25 14:07:27 -04:00
James Rodewig
7f1ae62862
[DOCS] Add info about allowed profile names (#70440) (#70816)
Co-authored-by: Robin Clarke <robin.clarke@elastic.co>
2021-03-24 10:10:26 -04:00
David Turner
e15175fa1a Note recovery settings affect searchable snapshots (#70771)
Adds a short note that `max_restore_bytes_per_sec` and
`indices.recovery.max_bytes_per_sec` also affect the recovery of a
searchable snapshot index.
2021-03-24 09:23:04 +00:00
Henning Andersen
00d651389f
[DOCS] Frozen tier dedicated (#70542) (#70598)
The frozen tier is now dedicated for searchable snapshots mounted with
the `shared_cache` option. This commit adjusts docs accordingly.
2021-03-19 14:31:45 +01:00
David Turner
4db81532f0 Clarify persistence on master-eligible nodes (#70556)
We document that master nodes should have a persistent data path but
it's a bit hard to understand that this is what the docs are saying and
we don't really say why it's important. This commit clarifies this
paragraph.

Relates 49d0f3406c
2021-03-18 14:51:28 +00:00
David Turner
1877a7e895 Recommend no requests to dedicated masters (#70491)
Today the docs on node roles say that you shouldn't use dedicated
masters for heavy requests such as indexing and searching, but as per
the "designing for resilience" docs this guidance applies to all client
requests. This commit generalises the node roles docs slightly to
clarify this.

Relates #70435
2021-03-18 12:30:10 +00:00
James Rodewig
302341a526
[DOCS] Replace put with create or update in API names (#70330) (#70421)
Co-authored-by: debadair <debadair@elastic.co>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2021-03-15 17:16:13 -04:00
Jason Tedor
be5e775f01
Clarify remote_cluster_client role (#70186)
This commit addresses two aspects of the description in the docs of
configuring a local node to be a remote cluster client. First, the
documentation was referring to the legacy setting for configuring a
remote cluster client. Secondly, we clarify that additional features,
not only cross-cluster search, have requirements around the usage of the
remote_cluster_client role.

Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-03-11 20:28:52 -05:00
James Rodewig
e52a5ed1b9
[DOCS] Update ingest pipeline xrefs (#70178) (#70227) 2021-03-10 08:42:49 -05:00
Julie Tibshirani
46515e8295 Fix formatting for fixed_auto_queue_size deprecation note. 2021-03-08 15:05:14 -08:00
James Rodewig
f99cb3d2b6
[DOCS] Fix typo (#69838) (#69881)
Co-authored-by: Mike Barretta <mike.barretta@elastic.co>
2021-03-03 09:37:45 -05:00
James Rodewig
22981854ec
[DOCS] Fix typo (#69654) (#69707)
Co-authored-by: José Arthur Benetasso Villanova <jose.arthur@gmail.com>
2021-03-01 09:56:09 -05:00
James Rodewig
132c4f8c70
[DOCS] Remove outdated default distro refs (#69465) (#69470) 2021-02-23 13:13:09 -05:00
James Rodewig
5d709f4f47
[DOCS] Reword node roles docs (#69301) (#69459) 2021-02-23 12:02:53 -05:00
David Turner
1c1c3bb780 Skip zone/host awareness with auto-expand replicas (#69334)
Today if an index is set to `auto_expand_replicas: N-all` then we will
try and create a shard copy on every node that matches the applicable
allocation filters. This conflicts with shard allocation awareness and
the same-host allocation decider if there is an uneven distribution of
nodes across zones or hosts, since these deciders prevent shard copies
from being allocated unevenly and may therefore leave some unassigned
shards.

The point of these two deciders is to improve resilience given a limited
number of shard copies but there is no need for this behaviour when the
number of shard copies is not limited, so this commit suppresses them in
that case.

Closes #54151
Closes #2869
2021-02-22 16:54:39 +00:00
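The conflict described in the auto-expand replicas commit above can be reproduced with a deliberately simplified model. It assumes that awareness caps each zone at `ceil(copies / zones)` copies and that a node holds at most one copy of a shard; the function and zone names are hypothetical and this is not the actual decider logic.

```python
import math
from typing import Dict


def assignable_copies(nodes_per_zone: Dict[str, int], copies: int,
                      awareness: bool) -> int:
    """Count how many copies of one shard can be assigned.

    Without awareness the only constraint is one copy per node. With
    awareness each zone is also capped at ceil(copies / number of zones).
    """
    total_nodes = sum(nodes_per_zone.values())
    if not awareness:
        return min(copies, total_nodes)
    per_zone_cap = math.ceil(copies / len(nodes_per_zone))
    placed = sum(min(nodes, per_zone_cap) for nodes in nodes_per_zone.values())
    return min(copies, placed)


# Uneven cluster: three nodes in one zone, one node in the other.
zones = {"zone-a": 3, "zone-b": 1}
copies = sum(zones.values())  # auto-expand to all nodes: one copy per node
```

In this model, `assignable_copies(zones, copies, awareness=True)` places only 3 of the 4 copies, leaving one unassigned, while skipping the awareness check places all 4.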
James Rodewig
b55249507e
[DOCS] Fix typos for duplicate words (#69125) (#69132) 2021-02-17 11:16:58 -05:00
James Rodewig
f0c7ec5f31
[DOCS] Add data_frozen role to node docs (#68713) (#68715) 2021-02-08 17:59:07 -05:00
Lee Hinman
8aba3093b0
[7.x] Add the frozen tier node role and ILM phase (#68605) (#68612)
This commit adds the `data_frozen` node role as part of the formalization of data tiers. It also
adds the `"frozen"` phase to ILM, currently allowing the same actions as the existing cold phase.

The frozen phase is intended to be used for data even less frequently searched than the cold phase,
and will eventually be loosely tied to data using partial searchable snapshots (as opposed to full
searchable snapshots in the cold phase).

Relates to #60848
2021-02-05 16:07:16 -07:00
Jason Tedor
35e2de9ddc
Set recovery rate for dedicated cold nodes (#68480)
This commit sets the recovery rate for dedicated cold nodes. The goal
here is to enhance the performance of recovery in a dedicated cold tier, where
we expect such nodes to be predominantly using searchable snapshots to
back the indices located on them. This commit follows a simple approach
where we increase the recovery rate as a function of the node size, for
nodes that appear to be dedicated cold nodes.
2021-02-04 10:53:36 -05:00
James Rodewig
d8fb188389
[DOCS] Document the stack.templates.enabled setting (#68328) (#68366) 2021-02-02 09:19:04 -05:00
Adam Locke
a6bdf05733
[DOCS] Minor rewording for HTTP settings (#68295) (#68319)
* [DOCS] Minor rewording for HTTP settings.

* Revert "[DOCS] Minor rewording for HTTP settings."

This reverts commit 9a831adca6.

* Adds advanced wording to HTTP & transport settings.
2021-02-01 12:59:29 -05:00
David Turner
b2861bc804 Expand and consolidate networking docs (#68051)
Today's network config docs are split into "Network", "HTTP" and
"Transport" pages, with unclear relationships between them. We often
encounter users with weird configs that indicate they don't really
understand how these settings all relate. In fact these pages are all
very interrelated, and the HTTP and Transport pages are almost all only
for advanced users. This commit brings these docs into a single page and
rewords some things to try and guide users away from the advanced
settings unless their configuration needs all the extra complexity.

It also adds a section entitled "Binding and publishing" which clarifies
the meanings of the `bind_host` and `publish_host` parameters. This is
also a common source of confusion amongst users.

It also clarifies that many of these settings accept a list of
addresses, and warns that this may not be what you want. Closes #67956.

Co-authored-by: Adam Locke <adam.locke@elastic.co>
2021-02-01 13:37:29 +00:00
David Turner
aa60b4943c Extend default probe connect/handshake timeouts (#68059)
Today the discovery phase has a short 1-second timeout for handshaking
with a remote node after connecting, which allows it to quickly move on
and retry in the case of connecting to something that doesn't respond
straight away (e.g. it isn't an Elasticsearch node).

This short timeout was necessary when the component was first developed
because each connection attempt would block a thread. Since #42636 the
connection attempt is now nonblocking so we can apply a more relaxed
timeout.

If transport security is enabled then our handshake timeout applies to
the TLS handshake followed by the Elasticsearch handshake. If the TLS
handshake alone takes over a second then the whole handshake times out
with a `ConnectTransportException`, but this does not tell us which of
the two individual handshakes took so long.

TLS handshakes have their own 10-second timeout, which if reached yields
a `SslHandshakeTimeoutException` that allows us to distinguish a problem
at the TLS level from one at the Elasticsearch level. Therefore this
commit extends the discovery probe timeouts.
2021-01-27 16:42:10 +00:00
Lisa Cawley
33ae925d16
[DOCS] Clarifies default ML and transform node settings (#67671) (#67735) 2021-01-19 15:17:31 -08:00
Yang Cheng
9169e5ae33
Limit the depth of nested bool queries

Introduce a new node level setting indices.query.bool.max_nested_depth
that controls the depth of nested bool queries.
Throw an error if a nested depth of a bool query exceeds the maximum
allowed nested depth.

Backport #66204
closes #55303
2021-01-13 06:11:49 -05:00
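The depth check introduced by the commit above can be sketched over a query expressed as a plain dictionary. This is an illustration, not the Elasticsearch implementation: the function names are hypothetical, the limit is passed in rather than read from `indices.query.bool.max_nested_depth`, and only `bool` queries nested directly inside `bool` clauses are counted.

```python
BOOL_CLAUSES = ("must", "should", "filter", "must_not")


def bool_depth(query: dict) -> int:
    """Return the nesting depth of bool queries in a query DSL dict."""
    bool_body = query.get("bool")
    if bool_body is None:
        return 0  # not a bool query, e.g. a term or match query
    deepest_child = 0
    for clause in BOOL_CLAUSES:
        children = bool_body.get(clause, [])
        if isinstance(children, dict):
            children = [children]  # a clause may hold a single query
        for child in children:
            deepest_child = max(deepest_child, bool_depth(child))
    return 1 + deepest_child


def check_bool_depth(query: dict, max_nested_depth: int) -> None:
    """Raise ValueError if the query nests bool queries too deeply."""
    depth = bool_depth(query)
    if depth > max_nested_depth:
        raise ValueError(
            f"bool query nesting depth {depth} exceeds the maximum "
            f"allowed depth {max_nested_depth}"
        )


nested = {"bool": {"must": [{"bool": {"filter": {"term": {"status": "open"}}}}]}}
```

Here `bool_depth(nested)` is 2, so `check_bool_depth(nested, 1)` raises an error while `check_bool_depth(nested, 2)` passes.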
Nik Everett
2ebfe7d3b3
Bust the request cache when the mapping changes (backport of #66295) (#66795)
This makes sure that we only serve a hit from the request cache if it
was built using the same mapping and that the same mapping is used for
the entire "query phase" of the search.

Closes #62033
2020-12-23 15:46:45 -05:00
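One way to picture the request cache behaviour in the commit above is a cache whose key includes an identifier for the mapping, so that any mapping change produces a miss. The class, the string request, and the integer `mapping_version` below are hypothetical simplifications; the real cache key is built differently.

```python
from typing import Callable, Dict, Tuple


class RequestCache:
    """Toy cache keyed by (request, mapping version)."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, int], str] = {}

    def get_or_compute(self, request: str, mapping_version: int,
                       compute: Callable[[], str]) -> str:
        """Return a cached result only if it was built with the same mapping."""
        key = (request, mapping_version)
        if key not in self._entries:
            self._entries[key] = compute()
        return self._entries[key]
```

Repeating the same request with the same mapping version returns the cached result, while repeating it after the version changes recomputes it. A real cache would also evict the stale entries, which this sketch does not attempt.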
Lisa Cawley
5f8dd50daa
[DOCS] Clarify use of CCS on ML nodes (#66616) (#66763) 2020-12-22 10:37:22 -08:00
James Rodewig
e7d497be98
[DOCS] Remove duplicate word (#66320) (#66447)
Co-authored-by: Gao Ruifeng <gaoruifeng@users.noreply.github.com>
2020-12-16 10:49:58 -05:00
James Rodewig
e4bf2afd58
[DOCS] Fix search.max_buckets default (#66311) (#66312) 2020-12-15 08:17:50 -05:00
David Turner
1ce8e5d848 Clarify network interface setting (#66013)
Today we document the use of `_[networkInterface]_` to specify the
addresses of a network interface but do not spell out which parts of
this syntax should be taken literally and which are part of the
placeholder for the interface name. If you get it wrong then the
exception message is confusing too since it uses the results of
`NetworkInterface#toString()` which contains much more than just the
name of the interface.

This commit clarifies the docs and the exception message.

Closes #65978.
2020-12-09 08:42:12 +00:00
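To make the literal and placeholder parts of `_[networkInterface]_` from the commit above concrete: the underscores are literal and the bracketed part stands for the interface name, so an interface called `en0` is written `_en0_`. The parser below is a hypothetical sketch, not Elasticsearch code. It assumes that the special values `_local_`, `_site_` and `_global_` are not interface names and that an optional `:ipv4` or `:ipv6` suffix may follow the name.

```python
from typing import Optional

# Special values assumed to be reserved rather than interface names.
SPECIAL_NAMES = {"local", "site", "global"}


def interface_name(value: str) -> Optional[str]:
    """Extract the interface name from a '_name_' style setting value.

    Returns None for plain hostnames, IP addresses and special values.
    """
    if len(value) < 3 or not (value.startswith("_") and value.endswith("_")):
        return None  # plain hostname or IP address
    name = value[1:-1].split(":", 1)[0]  # drop an optional :ipv4/:ipv6 suffix
    return None if name in SPECIAL_NAMES else name
```

For example, `interface_name("_en0_")` yields `"en0"`, while `interface_name("192.168.1.10")` and `interface_name("_local_")` both yield `None`.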
James Rodewig
823b3f4c19
[DOCS] Fix wording for HTTP settings (#65964) (#65967) 2020-12-07 12:33:00 -05:00
David Turner
a978c8c669 Expand docs on disk-based shard allocation (#65668)
Today we document the settings used to control rebalancing and
disk-based shard allocation but there isn't really any discussion around
what these processes do so it's hard to know what, if any, adjustments
to make.

This commit adds some words to help folk understand this area better.
2020-12-07 14:51:49 +00:00
Henning Andersen
e8e0405936 Searchable snapshot terminology (#65549)
We chose to use searchable snapshot index over snapshot-backed index, so
changed terminology towards this in a couple places.
2020-11-30 17:15:16 +01:00
Wylie Conlon
4d9f5b1867 Clarify field data cache behavior in docs (#64375)
* Clarify that field data cache includes global ordinals
* Describe that the cache should be cleared once the limit is reached
* Clarify that the `_id` field does not support aggregations anymore
* Fold the `fielddata` mapping parameter page into the `text` field docs
* Improve cross-linking
2020-11-20 13:56:02 -08:00
James Rodewig
4626ec3696
[DOCS] Clarify diff between shards per node settings (#64875) (#65070)
Clarifies differences between the
`cluster.routing.allocation.total_shards_per_node` and
`cluster.max_shards_per_node` cluster settings.

Closes #51839

Co-authored-by: Gordon Brown <arcsech@gmail.com>
2020-11-16 09:17:54 -05:00
Leaf-Lin
0bc141af5a remove node.ingest setting in the documentation (#64456)
I'm not sure whether this setting was left here deliberately or by accident.
With all other node role definitions having changed syntax from `node.xxx` to `node.roles: [ ]`, the ingest one is the only one left behind.
2020-11-09 12:20:03 -07:00
Adam Locke
1fa232eb2a
[DOCS] [7.x] Combining important config settings into a single page (#63849) (#63884)
* [DOCS] Combining important config settings into a single page (#63849)

* Combining important config settings into a single page.

* Updating ids for two pages causing link errors and implementing redirects.

* Updating links to use IDs instead of xrefs.
2020-10-19 12:59:31 -04:00
Andrei Dan
22f843352e
DOCS: general overview of data tiers and roles (#63086) (#63421)
This adds general overview documentation for data tiers,
the data tiers specific node roles, and their application in
ILM.

(cherry picked from commit d588cab747)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: debadair <debadair@elastic.co>
2020-10-07 17:29:12 +01:00
Howard
a914d8bc90
[DOCS] Remove duplicate disk.threshold_enabled setting (#62925) 2020-09-29 09:13:21 -04:00
Jay Modi
cb1dc5260f
Dedicated threadpool for system index writes (#62792)
This commit adds a dedicated threadpool for system index write
operations. The dedicated resources for system index writes serves as
a means to ensure that user activity does not block important system
operations from occurring such as the management of users and roles.

Backport of #61655
2020-09-22 15:31:38 -06:00
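The isolation idea behind the commit above can be sketched with two separate executors, so that a backlog of user writes cannot occupy the threads reserved for system index writes. The pool sizes, index prefixes and function name are illustrative assumptions, not Elasticsearch's actual thread pool configuration.

```python
from concurrent.futures import ThreadPoolExecutor

# Separate pools: user traffic cannot exhaust the system pool's threads.
user_writes = ThreadPoolExecutor(max_workers=4, thread_name_prefix="write")
system_writes = ThreadPoolExecutor(max_workers=2, thread_name_prefix="system_write")

# Hypothetical prefixes used to recognise system indices in this sketch.
SYSTEM_INDEX_PREFIXES = (".security", ".tasks")


def executor_for(index_name: str) -> ThreadPoolExecutor:
    """Pick the pool that should run a write to the given index."""
    if index_name.startswith(SYSTEM_INDEX_PREFIXES):
        return system_writes
    return user_writes
```

A write to a user index would be submitted with `executor_for("logs-2020").submit(...)`, and a write to a system index would go through the dedicated pool in the same way.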