Commit graph

824 commits

Author SHA1 Message Date
David Turner
bb3ea99850
Skip zone/host awareness with auto-expand replicas (#69334)
Today if an index is set to `auto_expand_replicas: N-all` then we will
try and create a shard copy on every node that matches the applicable
allocation filters. This conflicts with shard allocation awareness and
the same-host allocation decider if there is an uneven distribution of
nodes across zones or hosts, since these deciders prevent shard copies
from being allocated unevenly and may therefore leave some unassigned
shards.

The point of these two deciders is to improve resilience given a limited
number of shard copies but there is no need for this behaviour when the
number of shard copies is not limited, so this commit suppresses them in
that case.

Closes #54151
Closes #2869
2021-02-22 16:53:58 +00:00
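
The conflict described in #69334 above can be illustrated with a simplified model. This is only a sketch: the function name `assignable_copies` is hypothetical, and the per-zone cap of `ceil(copies / zones)` is an assumption about how awareness balances copies, not a transcription of the actual decider.

```python
import math


def assignable_copies(nodes_per_zone, copies, awareness_enabled):
    """Count how many shard copies can be placed, at most one per node.

    Assumption: with awareness enabled, each zone may hold at most
    ceil(copies / number_of_zones) copies.
    """
    total_nodes = sum(nodes_per_zone.values())
    if not awareness_enabled:
        return min(copies, total_nodes)
    cap = math.ceil(copies / len(nodes_per_zone))
    placed = sum(min(nodes, cap) for nodes in nodes_per_zone.values())
    return min(copies, placed)


# Uneven cluster: three nodes in zone "a", one node in zone "b".
# With `auto_expand_replicas: 0-all` there is one copy per node, so 4 copies.
zones = {"a": 3, "b": 1}
with_awareness = assignable_copies(zones, copies=4, awareness_enabled=True)
without_awareness = assignable_copies(zones, copies=4, awareness_enabled=False)
```

In this model one copy stays unassigned while awareness is applied, and all four copies are placed once it is skipped, which is the behaviour the commit aims for.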
James Rodewig
9b88ae92e6
[DOCS] Fix typos for duplicate words (#69125) 2021-02-17 10:34:20 -05:00
James Rodewig
5f3542a28e
[DOCS] Add data_frozen role to node docs (#68713) 2021-02-08 17:43:47 -05:00
Lee Hinman
3f9f007545
Add the frozen tier node role and ILM phase (#68605)
This commit adds the `data_frozen` node role as part of the formalization of data tiers. It also
adds the `"frozen"` phase to ILM, currently allowing the same actions as the existing cold phase.

The frozen phase is intended to be used for data even less frequently searched than the cold phase,
and will eventually be loosely tied to data using partial searchable snapshots (as opposed to full
searchable snapshots in the cold phase).

Relates to #60848
2021-02-05 14:38:13 -07:00
Jason Tedor
6e94e67ae9
Set recovery rate for dedicated cold nodes (#68480)
This commit sets the recovery rate for dedicated cold nodes. The goal
here is to enhance performance of recovery in a dedicated cold tier, where
we expect such nodes to be predominantly using searchable snapshots to
back the indices located on them. This commit follows a simple approach
where we increase the recovery rate as a function of the node size, for
nodes that appear to be dedicated cold nodes.
2021-02-04 10:36:07 -05:00
Tianlun Li
b0d185bb0d
Remove deprecated gateway settings (#53845)
This commit removes the following deprecated settings in v8:

- `gateway.expected_nodes`
- `gateway.expected_master_nodes`
- `gateway.recover_after_nodes`
- `gateway.recover_after_master_nodes`

Co-authored-by: ShawnLi1014 <shawnli1014@gmail.com>
2021-02-03 14:10:45 +00:00
James Rodewig
4a2a97a058
[DOCS] Document the stack.templates.enabled setting (#68328) 2021-02-02 08:35:21 -05:00
Adam Locke
c7855c2657
[DOCS] Minor rewording for HTTP settings (#68295)
* [DOCS] Minor rewording for HTTP settings.

* Revert "[DOCS] Minor rewording for HTTP settings."

This reverts commit 9a831adca6.

* Adds advanced wording to HTTP & transport settings.
2021-02-01 12:41:42 -05:00
David Turner
2adeb4a666
Expand and consolidate networking docs (#68051)
Today's network config docs are split into "Network", "HTTP" and
"Transport" pages, with unclear relationships between them. We often
encounter users with weird configs that indicate they don't really
understand how these settings all relate. In fact these pages are all
very interrelated, and the HTTP and Transport pages are almost all only
for advanced users. This commit brings these docs into a single page and
rewords some things to try and guide users away from the advanced
settings unless their configuration needs all the extra complexity.

It also adds a section entitled "Binding and publishing" which clarifies
the meanings of the `bind_host` and `publish_host` parameters. This is
also a common source of confusion amongst users.

It also clarifies that many of these settings accept a list of
addresses, and warns that this may not be what you want. Closes #67956.

Co-authored-by: Adam Locke <adam.locke@elastic.co>
2021-02-01 13:06:20 +00:00
David Turner
9c100cdeae
Extend default probe connect/handshake timeouts (#68059)
Today the discovery phase has a short 1-second timeout for handshaking
with a remote node after connecting, which allows it to quickly move on
and retry in the case of connecting to something that doesn't respond
straight away (e.g. it isn't an Elasticsearch node).

This short timeout was necessary when the component was first developed
because each connection attempt would block a thread. Since #42636 the
connection attempt is now nonblocking so we can apply a more relaxed
timeout.

If transport security is enabled then our handshake timeout applies to
the TLS handshake followed by the Elasticsearch handshake. If the TLS
handshake alone takes over a second then the whole handshake times out
with a `ConnectTransportException`, but this does not tell us which of
the two individual handshakes took so long.

TLS handshakes have their own 10-second timeout, which if reached yields
a `SslHandshakeTimeoutException` that allows us to distinguish a problem
at the TLS level from one at the Elasticsearch level. Therefore this
commit extends the discovery probe timeouts.
2021-01-27 16:41:44 +00:00
Lisa Cawley
4d1abd1494
[DOCS] Clarifies default ML and transform node settings (#67671) 2021-01-19 14:19:37 -08:00
Yang Cheng
168d98b7dd
limit the depth of nested bool queries (#66204)
limit the depth of nested bool queries 

Introduce a new node level setting `indices.query.bool.max_nested_depth`
that controls the depth of nested bool queries.
Throw an error if a nested depth of a bool query exceeds the maximum
allowed nested depth.

Closes #55303
2021-01-12 09:36:09 -05:00
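
The kind of check described in #66204 above might look roughly like the following sketch. It works on a query expressed as plain Python dicts and lists; the helper names `bool_depth` and `check_bool_depth` and the limit used in the example are hypothetical, and counting every `bool` key is a simplification (a field that happens to be named `bool` would also be counted).

```python
def bool_depth(query, depth=0):
    """Return the deepest nesting level of `bool` clauses in a query."""
    if isinstance(query, list):
        return max((bool_depth(item, depth) for item in query), default=depth)
    if not isinstance(query, dict):
        return depth
    deepest = depth
    for key, value in query.items():
        # Entering a `bool` clause increases the nesting level by one.
        next_depth = depth + 1 if key == "bool" else depth
        deepest = max(deepest, bool_depth(value, next_depth))
    return deepest


def check_bool_depth(query, max_nested_depth):
    """Raise an error if the query nests `bool` clauses too deeply."""
    depth = bool_depth(query)
    if depth > max_nested_depth:
        raise ValueError(
            f"bool query nested {depth} levels deep, "
            f"maximum allowed is {max_nested_depth}"
        )
    return depth


query = {"bool": {"must": [{"bool": {"should": [{"term": {"a": 1}}]}}]}}
depth = check_bool_depth(query, max_nested_depth=2)
```

Here `max_nested_depth` stands in for the value of `indices.query.bool.max_nested_depth`; lowering it to 1 in the example would make `check_bool_depth` raise.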
Nik Everett
3e3152406a
Bust the request cache when the mapping changes (#66295)
This makes sure that we only serve a hit from the request cache if it
was built using the same mapping and that the same mapping is used for
the entire "query phase" of the search.

Closes #62033
2020-12-23 13:19:02 -05:00
Lisa Cawley
6b463a7b7a
[DOCS] Clarify use of CCS on ML nodes (#66616)
Co-authored-by: David Roberts <dave.roberts@elastic.co>
2020-12-22 10:11:09 -08:00
James Rodewig
b5d2d30599
[DOCS] Remove duplicate word (#66320) (#66446)
Co-authored-by: Gao Ruifeng <gaoruifeng@users.noreply.github.com>
2020-12-16 10:49:46 -05:00
James Rodewig
77dc63b2de
[DOCS] Fix search.max_buckets default (#66311) 2020-12-14 21:55:27 -05:00
David Turner
f6f4260024
Clarify network interface setting (#66013)
Today we document the use of `_[networkInterface]_` to specify the
addresses of a network interface but do not spell out which parts of
this syntax should be taken literally and which are part of the
placeholder for the interface name. If you get it wrong then the
exception message is confusing too since it uses the results of
`NetworkInterface#toString()` which contains much more than just the
name of the interface.

This commit clarifies the docs and the exception message.

Closes #65978.
2020-12-09 08:41:34 +00:00
James Rodewig
e3f6adf2d1
[DOCS] Fix wording for HTTP settings (#65964) 2020-12-07 12:18:55 -05:00
David Turner
aa4ab0bc26
Expand docs on disk-based shard allocation (#65668)
Today we document the settings used to control rebalancing and
disk-based shard allocation but there isn't really any discussion around
what these processes do so it's hard to know what, if any, adjustments
to make.

This commit adds some words to help folk understand this area better.
2020-12-07 14:51:26 +00:00
Henning Andersen
8fa1eea6f6
Searchable snapshot terminology (#65549)
We chose the term "searchable snapshot index" over "snapshot-backed index", so
this commit changes the terminology accordingly in a couple of places.
2020-11-30 17:14:47 +01:00
Wylie Conlon
10ee0f2878
Clarify field data cache behavior in docs (#64375)
* Clarify that field data cache includes global ordinals
* Describe that the cache should be cleared once the limit is reached
* Clarify that the `_id` field no longer supports aggregations
* Fold the `fielddata` mapping parameter page into the `text` field docs
* Improve cross-linking
2020-11-20 13:53:23 -08:00
James Rodewig
f95a52f280
[DOCS] Clarify diff between shards per node settings (#64875)
Clarifies differences between the
`cluster.routing.allocation.total_shards_per_node` and
`cluster.max_shards_per_node` cluster settings.

Closes #51839

Co-authored-by: Gordon Brown <arcsech@gmail.com>
2020-11-16 08:33:04 -05:00
Leaf-Lin
2bf3e36144
Remove node.ingest setting in the documentation (#64456)
I'm not sure whether this setting was left here deliberately or by accident.
All other node role definitions have changed syntax from `node.xxx` to `node.roles: [ ]`; the ingest one is the only one left behind.
2020-11-09 12:21:43 -07:00
Adam Locke
789ee2d73e
[DOCS] Combining important config settings into a single page (#63849)
* Combining important config settings into a single page.

* Updating ids for two pages causing link errors and implementing redirects.
2020-10-19 10:02:22 -04:00
Andrei Dan
d588cab747
DOCS: general overview of data tiers and roles (#63086)
This adds general overview documentation for data tiers, 
the data tiers specific node roles, and their application in
ILM.

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: debadair <debadair@elastic.co>
2020-10-07 17:06:54 +01:00
Howard
e50799bc7e
[DOCS] Remove duplicate disk.threshold_enabled setting. (#62924) 2020-09-29 08:58:46 -04:00
Jay Modi
242083a36e
Dedicated threadpool for system index writes (#61655)
This commit adds a dedicated threadpool for system index write
operations. The dedicated resources for system index writes serves as
a means to ensure that user activity does not block important system
operations from occurring such as the management of users and roles.
2020-09-22 12:14:45 -06:00
Adam Locke
71b24db8f2
[DOCS] Add remote node as a node role (#62730)
* Adding remote node as a node role.

* Incorporating reviewer feedback.
2020-09-22 11:39:58 -04:00
James Rodewig
f8c013d0fb
[DOCS] Clarify http.max_content_length def (#62562) 2020-09-17 12:49:18 -04:00
Adam Locke
8375631451
[DOCS] Clarifying remote clusters based on feedback from Support (#62335)
* Clarifying remote clusters based on feedback from Support.

* Apply suggestions from code review

* Making additional editorial changes.
2020-09-15 11:43:35 -04:00
Varun Sharma
22b632a2ca
[DOCS] Fix node roles typo (#62307) 2020-09-14 10:08:44 -04:00
James Rodewig
dcf0c3062f
[DOCS] Document dynamic discovery settings (#61420) 2020-09-04 10:56:17 -04:00
James Rodewig
bbcd8078ce
[DOCS] Document dynamic index mgmt and buffer settings (#61753) 2020-09-04 10:19:42 -04:00
James Rodewig
a70c00a62c
[DOCS] Document dynamic cluster settings (#61760)
Co-authored-by: Adam Locke <adam.locke@elastic.co>
2020-09-01 15:48:45 -04:00
James Rodewig
617652b969
[DOCS] Document dynamic cluster-lvl shard alloc settings (#61338) 2020-08-31 11:04:11 -04:00
James Rodewig
d077a4f5a1
[DOCS] Document static field cache settings (#61424) 2020-08-26 17:10:08 -04:00
James Rodewig
3b94247bc7
[DOCS] Document static HTTP settings (#61429) 2020-08-25 11:10:20 -04:00
gadekishore
fc50e17753
updated shard limit doc (#56496)
* updated shard limit doc

The documentation was not clear, so I have updated it to say that open indices with unassigned primaries and replicas count towards the limit.

* [DOCS] Incorporated edits.

Co-authored-by: Deb Adair <debadair@elastic.co>
2020-08-24 16:41:04 -07:00
James Rodewig
fdc4e83050
[DOCS] Combine Search your data files (#61477)
No-op changes to:

* Move `Search your data` source files into the same directory
* Rename `Search your data` source files based on page ID
* Remove unneeded includes
* Remove the `Request` dir
2020-08-24 11:22:56 -04:00
James Rodewig
8359232c45
[DOCS] Document dynamic circuit breaker settings (#61334) 2020-08-19 10:58:04 -04:00
István Zoltán Szabó
d089709be9
[DOCS] Clarifies node.roles settings (#61266) 2020-08-18 15:56:41 +02:00
Adam Locke
610a47c792
[DOCS] Update CCR docs to focus on Kibana (#60555)
* First crack at rewriting the CCR introduction.

* Emphasizing Kibana in configuring CCR (part one).

* Many more edits, plus new files.

* Fixing test case.

* Removing overview page and consolidating that information in the main page.

* Adding redirects for moved and deleted pages.

* Removing, consolidating, and adding redirects.

* Fixing duplicate ID in redirects and removing outdated reference.

* Adding test case and steps for recreating a follower index.

* Adding steps for managing CCR tasks in Kibana.

* Adding tasks for managing auto-follow patterns.

* Fixing glossary link.

* Fixing glossary link, again.

* Updating the upgrade information and other stuff.

* Apply suggestions from code review

* Incorporating review feedback.

* Adding more edits.

* Fixing link reference.

* Adding use cases for #59812.

* Incorporating feedback from reviewers.

* Apply suggestions from code review

* Incorporating more review comments.

* Condensing some of the steps for accessing Kibana.

* Incorporating small changes from reviewers.

Co-authored-by: debadair <debadair@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-08-17 15:36:54 -04:00
István Zoltán Szabó
dd5c5e0c58
[DOCS] Adds clarification to node roles (#61206) 2020-08-17 15:51:25 +02:00
James Rodewig
a94e5cb7c4
[DOCS] Replace Wikipedia links with attribute (#61171) 2020-08-17 09:44:24 -04:00
David Turner
ef12a9a218
Minor network docs fixes (#60905)
Followup to #60216, fixing the formatting of
`transport.tcp.reuse_address` and clarifying some wording around the
distinction between the transport and HTTP layers.
2020-08-13 13:08:02 +01:00
Yannick Welsch
0b517ddca6
Provide option to allow writes when master is down (#60605)
Elasticsearch currently blocks writes by default when a master is unavailable. The cluster.no_master_block setting allows
a user to change this behavior to also block reads when a master is unavailable. This PR introduces a new mode that still
allows writes when a master is offline. Writes will continue to work as long as they need neither routing table changes
nor dynamic mapping updates, since both of those require the master for consistency.

Eventually we should switch the default of cluster.no_master_block to this new mode.
2020-08-12 16:37:32 +02:00
Jay Modi
8c51fc7e2d
System index reads in separate threadpool (#57936)
This commit introduces a new thread pool, `system_read`, which is
intended for use by system indices for all read operations (get and
search). The `system_read` pool is a fixed thread pool with a maximum
number of threads equal to the lesser of half of the available processors
or 5. Given the combination of both get and search operations in this
thread pool, the queue size has been set to 2000. The motivation for
this change is to allow system read operations to be serviced in spite
of the number of user searches.

In order to avoid a significant performance hit due to pattern matching
on all search requests, a new metadata flag is added to mark indices
as system or non-system. Previously created system indices will have
the flag added to their metadata upon upgrade to a version with this
capability.

Additionally, this change also introduces a new class, `SystemIndices`,
which encapsulates logic around system indices. Currently, the class
provides a method to check if an index is a system index and a method
to find a matching index descriptor given the name of an index.

Relates #50251
Relates #37867
2020-08-10 12:38:54 -06:00
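
The sizing rule stated in #57936 above ("the lesser of half of the available processors or 5") can be written out as a small sketch. The function name is hypothetical, and both the rounding down and the minimum of one thread are assumptions; the commit message does not say how odd or very small processor counts are handled.

```python
SYSTEM_READ_QUEUE_SIZE = 2000  # queue size stated in the commit message


def system_read_pool_size(available_processors):
    """Maximum threads for the `system_read` pool.

    Assumptions: half the processors is rounded down, and the pool
    never has fewer than one thread.
    """
    half = available_processors // 2
    return max(1, min(half, 5))


sizes = {n: system_read_pool_size(n) for n in (1, 4, 8, 16, 64)}
```

Under these assumptions the cap of 5 is reached from 10 processors upwards, so larger machines do not get a larger `system_read` pool.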
David Turner
19eb922d9f
Remove join timeout (#60873)
There is no point in timing out a join attempt any more. Timing out and
retrying with the same master is pointless, and an in-flight join
attempt to one master no longer blocks attempts to join other masters.
This commit removes this unnecessary setting.

Relates #60872 in which this setting was deprecated.
2020-08-10 13:57:54 +01:00
James Rodewig
ba88f0bd6a
[DOCS] Move inner hits content to separate page (#60840)
Moves inner hits content from the deprecated 'Request Body Search'
chapter to a separate page.
2020-08-06 13:47:06 -04:00
James Rodewig
ae01606785
[DOCS] Replace twitter dataset in docs (#60604) 2020-08-03 12:49:56 -04:00