* Adds a prerequisites section covering remote cluster config, node roles, and security.
* Moves existing content about remote cluster config to the prereqs.
* Updates the remote cluster docs to include information about eligible gateway nodes and tagging for gateway nodes.
Closes https://github.com/elastic/elasticsearch/issues/72001
Updates the remote clusters version compatibility table to include 7.17 and 8.x versions.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Today the same-shard allocation decider falls back to checking the
hostname if the node has no host address. In practice nodes will always
have an address so the fallback is dead code. This commit removes that
dead code.
Relates #80702, which will add the ability to distinguish nodes by
hostname regardless of whether they have an address, and #80767,
which optimizes this area of code; this refactoring should make the
optimization simpler.
Today we increase the verbosity of discovery failures after 5 minutes
without a master. Unfortunately 5 minutes is a common orchestration
timeout, so if discovery is broken then we see nodes being shut down
just before they start to emit useful logs. This commit reduces the
default timeout to 3 minutes to address that.
We have a few leftover mentions of `zen` discovery, mostly for
historical/BwC reasons, which this commit removes.
Prior to this commit the default value for `discovery.type` was `zen`
but this was not written down anywhere or officially supported: the two
options were to set it to `single-node` or to omit it entirely. This
commit changes the default to `multi-node` and documents this.
Co-authored-by: Adam Locke <adam.locke@elastic.co>
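To illustrate, a minimal `elasticsearch.yml` sketch of the now-documented default and the single-node alternative:

```yaml
# The default, now documented: discover other nodes and form or join a cluster.
discovery.type: multi-node

# Alternative: skip discovery and cluster bootstrapping entirely.
# discovery.type: single-node
```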
Today we have a short note in one place in the docs saying not to touch
the contents of the data path. This commit expands the warning to
describe more precisely what is forbidden, and to give some more detail
of the consequences, and also duplicates the warning to the other
location that documents the `path.data` setting.
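For reference, a minimal `elasticsearch.yml` sketch (the path itself is just an example):

```yaml
# Elasticsearch owns everything under this directory: do not add, move,
# modify, or delete any of its contents by hand.
path.data: /var/lib/elasticsearch
```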
Deprecate the script context cache in favor of the general cache.
Users should use the following settings instead (see the sketch below):
* `script.max_compilations_rate` to set the max compilation rate for
  user scripts such as filter scripts. Certain script contexts that
  submit scripts outside of the user's control are exempted from this
  rate limit; examples include runtime fields, ingest, and Watcher.
* `script.cache.max_size` to set the max size of the cache.
* `script.cache.expire` to set the expiration time for entries in the
  cache.
What's deprecated?
* `script.max_compilations_rate: use-context`. This special setting
  value was used to turn on the script context-specific caches.
* `script.context.$CONTEXT.cache_max_size`; use `script.cache.max_size`
  instead.
* `script.context.$CONTEXT.cache_expire`; use `script.cache.expire`
  instead.
* `script.context.$CONTEXT.max_compilations_rate`; use
  `script.max_compilations_rate` instead.
The default cache size was increased from `100` to `3000`, which
was approximately the max cache size when using context-specific caches.
The default compilation rate limit was increased from `75/5m` to
`150/5m` to account for increasing uses of scripts.
System script contexts can now opt out of compilation rate limiting
using a flag rather than a sentinel rate limit value.
7.16: Script: Deprecate script context cache #79508
Refs: #62899
7.16: Script: Opt-out system contexts from script compilation rate limit #79459
Refs: #62899
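A minimal `elasticsearch.yml` sketch of the general-cache settings described above; the rate and size shown are the new defaults, while the expiry value is purely illustrative:

```yaml
script.max_compilations_rate: 150/5m  # new default; applies to user-submitted scripts only
script.cache.max_size: 3000           # new default cache size
script.cache.expire: 10m              # illustrative; idle entries are evicted after this
```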
Today we limit the max number of concurrent snapshot file restores
per recovery. This works well with the default
`node_concurrent_recoveries` (which is 2), but when that limit is
increased it is possible to exhaust the underlying repository's
connection pool, affecting other workloads.
This commit adds a new setting,
`indices.recovery.max_concurrent_snapshot_file_downloads_per_node`, that
limits the max number of snapshot file downloads per node during
recoveries. When a recovery starts on the target node it tries to
acquire a permit; only if one is granted may it download snapshot
files, and this is communicated to the source node in the
StartRecoveryRequest. This is a rather conservative approach: a
recovery that gets a permit may end up not recovering any snapshot
files, while a concurrent recovery that was denied a permit could have
taken advantage of recovering from a snapshot.
Closes #79044
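A sketch of adjusting the new setting via the cluster settings API (the value shown is illustrative, not the documented default):

```console
PUT _cluster/settings
{
  "persistent": {
    "indices.recovery.max_concurrent_snapshot_file_downloads_per_node": 25
  }
}
```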
Changes can-match from a shard-level to a node-level action. This avoids
an explosion of shard-level can-match subrequests in clusters with many
shards, which can cause stability issues. Also introduces a new
`search_coordination` thread pool to handle the sending and handling of
node-level can-match requests.
This PR changes uses of transient cluster settings to
persistent cluster settings.
It also deprecates the use of transient settings.
Relates to #49540
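A sketch of the migration this encourages, using `cluster.routing.allocation.enable` purely as an example: copy the value into the persistent section and clear the transient one in a single update:

```console
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.enable": "all"
  },
  "transient": {
    "cluster.routing.allocation.enable": null
  }
}
```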
* Fix a typo: a stray space between 'E' and 'cluster...'
* Update example, fix headings, change notes
Co-authored-by: Adam Locke <adam.locke@elastic.co>
Co-authored-by: Marwane Chahoud <marwane.chahoud@gmail.com>
* [DOCS] Fix default value for closed indices
#57953 introduced changes that added ESS icons to many Elasticsearch settings. As part of those changes, the default value for `cluster.indices.close.enable` was indicated as `false`, when it should be `true`. This PR updates the default value to `true`.
Closes #78877
* Update description
* Update note to remove outdated claims
The documentation indicates that `stack.templates.enabled` can be used in Elasticsearch Service, but it is not part of the settings allowlist in ESS. This PR makes the documentation match the state of the allowlist.
* Improve docs for pre-release version compatibility
Follow-up to #78317 clarifying a couple of points:
- a pre-release build can restore snapshots from released builds
- compatibility applies if at least one of the local and remote clusters
is a released build
* Remote cluster build date nit
The reference manual includes docs on version compatibility in various
places, but it's not clear that these docs only apply to released
versions and that the rules for pre-release versions are stricter than
folks expect. This commit adds some words to the docs for unreleased
versions which explains this subtlety.
* [DOCS] Update remote cluster docs
* Add files, rename files, write new stuff
* Plethora of changes
* Add test and update snippets
* Redirects, moved files, and test updates
* Moved file to x-pack for tests
* Remove older CCS page and add redirects
* Cleanup, link updates, and some rewrites
* Update image
* Incorporating user feedback and rewriting much of the remote clusters page
* More changes from review feedback
* Numerous updates, including request examples for CCS and Kibana
* More changes from review feedback
* Minor clarifications on security for remote clusters
* Incorporate review feedback
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Some review feedback and some editorial changes
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Yang Wang <ywangd@gmail.com>
We currently use the plaintext body of a shard request as the key to the
request cache. This has the disadvantage that very large requests can
quickly fill up the cache due to the size of their keys. With this commit,
we instead use a SHA-256 hash of the shard request as the cache key,
which will use a constant (and much smaller) number of bytes.
Today we expire the client stats for HTTP channels 5 minutes after they
close. It's possible to open a very large number of HTTP channels in 5
minutes, possibly inadvertently, and the stats for those channels can be
overwhelming.
This commit introduces a limit on the number of channels tracked by each
node which applies in addition to the age limit, and makes these limits
configurable via static settings. It drops the pruning of old stats when
starting to track a new channel and instead uses a queue to expire the
oldest stats when each channel closes if necessary to respect the count
limit; it only performs age-based expiry when retrieving the stats,
since the count limit now bounds the memory needed. Finally, it adds
some missing synchronization and makes sure that we expose only
immutable objects to the stats subsystem.
This is related to #73497. Currently, we only use the configured
`transport.compression_scheme` setting when compressing a request or a
response, and the `cluster.remote.*.compression_scheme` setting is
ignored. This commit fixes this behavior by respecting the
per-cluster setting. Additionally, it resolves confusion around inbound
and outbound connections by always responding with the same scheme that
was received. This allows remote connections to have different schemes
than local connections.
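A sketch of the two levels in `elasticsearch.yml`; the remote cluster alias `cluster_two` is hypothetical:

```yaml
transport.compression_scheme: lz4                                  # default for local connections
cluster.remote.cluster_two.transport.compression_scheme: deflate  # per-cluster override, now respected
```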
This commit adds peer recoveries from snapshots. It allows establishing a replica by downloading file data from a snapshot rather than transferring the data from the primary.
Enabling this feature is done on the repository definition. Repositories having the setting `use_for_peer_recovery=true` will be consulted to find a good snapshot when recovering a shard.
Relates #73496
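A sketch of flagging a repository for this purpose; the repository name and location are placeholders:

```console
PUT _snapshot/my_repository
{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_repository",
    "use_for_peer_recovery": true
  }
}
```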
In 7.15, we intend for the `indexing_data` compression level and the
compression scheme `lz4` to no longer be experimental. This commit
updates the documentation to reflect this. Additionally, it adds
missing docs for the `cluster.remote.*.transport.compression_scheme`
setting.
Relates to #73497.
The special values `_global_`, `_site_`, `0.0.0.0` and so on may resolve
to multiple addresses, of which one is chosen to be the publish address.
This commit generalises the warning about reachability as applied to
DNS-resolved hostnames to also apply to these special values.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
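For example, in `elasticsearch.yml`:

```yaml
# 0.0.0.0 binds to every interface; one resolved address is then chosen as
# the publish address, so the reachability warning applies here just as it
# does to DNS-resolved hostnames.
network.host: 0.0.0.0
```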
This commit adds a new set of classes that compute a peer
recovery plan based on source files, target files, and the available
snapshots. When possible, the plan maximizes the number of files
reused from a snapshot. It uses repositories with the
`use_for_peer_recovery` setting set to `true`.
It also adds a new recovery setting, `indices.recovery.use_snapshots`.
Relates #73496
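A sketch of toggling the new setting, assuming it is dynamic like the other `indices.recovery.*` settings:

```console
PUT _cluster/settings
{
  "persistent": {
    "indices.recovery.use_snapshots": true
  }
}
```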
In the upcoming Lucene 9 release, `indices.query.bool.max_clause_count` is
going to apply to the entire query tree rather than per `bool` query. In order
to avoid breaks, the limit has been bumped from 1024 to 4096.
The semantics will effectively change when we upgrade to Lucene 9, this PR
is only about agreeing on a migration strategy and documenting this change.
To avoid further breaks, I am leaning towards keeping the current setting name
even though it contains `bool`. I believe that it still makes sense given that
`bool` queries are typically the main contributors to high numbers of clauses.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
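The new default, spelled out as an `elasticsearch.yml` sketch:

```yaml
# Bumped from 1024; under Lucene 9 this will apply to the whole query tree
# rather than per `bool` query.
indices.query.bool.max_clause_count: 4096
```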
Today the docs for remote cluster connections use `ping_schedule` fairly
liberally, and don't mention that you should prefer TCP keepalives
wherever possible. This commit reduces the use of this setting in the
examples and adjusts the description of the setting to include a note
about TCP keepalives instead.
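A sketch of the discouraged-but-available setting; the alias and interval are illustrative:

```yaml
# Prefer TCP keepalives (network.tcp.keep_alive, enabled by default) where
# possible; fall back to application-level pings only when keepalives are
# not an option.
cluster.remote.cluster_two.transport.ping_schedule: 30s
```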
This commit is related to #73497. It adds two new settings. The first,
`transport.compression_scheme`, allows the user to configure LZ4 or
DEFLATE as the transport compression. Additionally, it modifies
`transport.compress` to support the value `indexing_data`. When this
setting is set to `indexing_data`, only messages which are primarily
composed of raw source data are compressed: bulk requests,
operations-based recovery, and shard changes messages.
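A sketch of the two new knobs together in `elasticsearch.yml`:

```yaml
transport.compress: indexing_data   # compress only raw-source-heavy messages
transport.compression_scheme: lz4   # or deflate
```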
Today if sending file chunks is CPU-bound (e.g. when using compression)
then we tend to concentrate all that work onto relatively few threads,
even if `indices.recovery.max_concurrent_file_chunks` is increased. With
this commit we fork the transmission of each chunk onto its own thread
so that the CPU-bound work can happen in parallel.
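With the per-chunk work now forked, raising the limit actually buys parallelism; a sketch of doing so (the value is illustrative):

```console
PUT _cluster/settings
{
  "persistent": {
    "indices.recovery.max_concurrent_file_chunks": 4
  }
}
```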
In #55805, we added a setting to allow single data node clusters to
respect the high watermark. In #73733 we added the related deprecations.
This commit ensures the only valid value for the setting is true and
adds deprecations if the setting is set. The setting will be removed
in a future release.
Co-authored-by: David Turner <david.turner@elastic.co>
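A sketch in `elasticsearch.yml`, assuming the setting from #55805 is `cluster.routing.allocation.disk.watermark.enable_for_single_data_node`:

```yaml
# true is now the only valid value, and setting it at all is deprecated;
# the setting will be removed in a future release.
cluster.routing.allocation.disk.watermark.enable_for_single_data_node: true
```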
* Add new thread pool for critical operations
* Split critical thread pool into read and write
* Add POJO to hold thread pool names
* Add tests for critical thread pools
* Add thread pools to data streams
* Update settings for security plugin
* Retrieve ExecutorSelector from SystemIndices where possible
* Use a singleton ExecutorSelector
Adds a new snapshot meta pool that is used to speed up the get snapshots API
by making `SnapshotInfo` load in parallel. This pool is also used to load
`RepositoryData`.
A follow-up to this would expand the use of this pool to the snapshot status
API and make it run in parallel as well.
If a node is partitioned away from the rest of the cluster then the
`ClusterFormationFailureHelper` periodically reports that it cannot
discover the expected collection of nodes, but does not indicate why. To
prove it's a connectivity problem, users must today restart the node
with `DEBUG` logging on `org.elasticsearch.discovery.PeerFinder` to see
further details.
With this commit we log messages at `WARN` level if the node remains
disconnected for longer than a configurable timeout, which defaults to 5
minutes.
Relates #72968
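For reference, the workaround this replaces: restarting the node with the `PeerFinder` logger turned up in `elasticsearch.yml`:

```yaml
# Only needed before this change; a restart is required because a node
# without a master cannot accept dynamic cluster settings updates.
logger.org.elasticsearch.discovery.PeerFinder: DEBUG
```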