In https://github.com/elastic/kibana/pull/113783, we renamed Kibana's **Ingest Node Pipelines** feature to **Ingest Pipelines**. This updates screenshots and references for the renamed feature. It also replaces a few remaining `ingest node pipeline` references.
Introduces a setting `cluster.deprecation_indexing.x_opaque_id_used.enabled` to disable the use of
x-opaque-id in RateLimitingFilter. This will be used for deprecation
log indexing and will not affect logging to files (which uses a different
instance of RateLimitingFilter with this flag enabled by default).
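A minimal sketch of toggling the flag, assuming the setting is dynamically updatable via the cluster settings API (otherwise it would go in `elasticsearch.yml`):
```js
PUT _cluster/settings
{
  "persistent": {
    // Assumed dynamic; disables x-opaque-id based rate limiting for deprecation log indexing
    "cluster.deprecation_indexing.x_opaque_id_used.enabled": false
  }
}
```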
Changes the indices backing a deprecation log data stream to be hidden.
Refactors DeprecationHttpIT to be more reliable
Relates #76292, closes #77936
* Index prefixes for searchable snapshots
Added a note about how ILM-managed indices are prefixed with "restored-" or "partial-" when they are fully or partially mounted as searchable snapshots (see the example below).
* Apply suggestions from code review
Co-authored-by: debadair <debadair@elastic.co>
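For example (an illustration, not part of the original note), fully mounted indices appear under the `restored-` prefix and partially mounted ones under `partial-`, so a quick way to spot them is:
```js
GET _cat/indices/restored-*,partial-*?v
```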
* Implement and test get feature upgrade status API
* Add integration test for feature upgrade endpoint
* Use constant enum for statuses
* Add unit tests for transport class methods
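A hedged sketch of calling the new status endpoint; the path shown is the `_migration/system_features` route used by the feature upgrade APIs, and the response shape is approximate:
```js
GET _migration/system_features
// Approximate response shape (field names and values are illustrative):
{
  "features": [
    { "feature_name": "security", "migration_status": "NO_MIGRATION_NEEDED" }
  ],
  "migration_status": "NO_MIGRATION_NEEDED"
}
```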
* WIP, basic implementation
* Pull `if` branch into a variable
* Remove outdated javadoc
* Remove map iteration, use target name instead of id (whoops)
* Remove streaming from isReplacementSource
* Simplify getReplacementName
* Only calculate node shutdowns if canRemain==false and forceMove==false
* Move canRebalance comment in BalancedShardsAllocator
* Rename canForceDuringVacate -> canForceAllocateDuringReplace
* Add comment to AwarenessAllocationDecider.canForceAllocateDuringReplace
* Revert changes to ClusterRebalanceAllocationDecider
* Change "no replacement" decision message in NodeReplacementAllocationDecider
* Only construct shutdown map once in isReplacementSource
* Make node shutdowns and target shutdowns available within RoutingAllocation
* Add randomization for adding the filter that is overridden in test
* Add integration test with replicas: 1
* Go nuts with the verbosity of allocation decisions
* Also check NODE_C in unit test
* Test with randomly assigned shard
* Fix test for extra verbose decision messages
* Remove canAllocate(IndexMetadata, RoutingNode, RoutingAllocation) override
* Spotless :|
* Implement 100% disk usage check during force-replace-allocate
* Add rudimentary documentation for "replace" shutdown type (see the request sketch after this list)
* Use RoutingAllocation shutdown map in BalancedShardsAllocator
* Add canForceAllocateDuringReplace to AllocationDeciders & add test
* Switch from percentage to bytes in DiskThresholdDecider force check
* Enhance docs with note about rollover, creation, & shrink
* Clarify decision messages, add test for target-only allocation
* Simplify NodeReplacementAllocationDecider.replacementOngoing
* Start nodeC before nodeB in integration test
* Spotleeeessssssss! You get me every time!
* Remove outdated comment
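To make the "replace" type concrete, here is a sketch of registering a node shutdown of that type via the shutdown API; the node id, reason, and target node name are placeholders:
```js
PUT _nodes/node-id-to-vacate/shutdown
{
  "type": "replace",                      // shards are vacated onto the target node
  "reason": "hardware swap",              // free-text, placeholder value
  "target_node_name": "replacement-node"  // required for the "replace" type
}
```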
If the _nodes/stats API received a level=shards request parameter, then the response would have two "shards" fields,
which would cause problems with JSON parsers. This commit renames the "shards" field that currently only contains
"total_count" to "shard_stats".
Relates #78311, #75433
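A rough sketch of the renamed field (response abridged and paths approximate):
```js
GET _nodes/stats?level=shards
// Abridged, approximate response:
{
  "nodes": {
    "nodeId1": {
      "indices": {
        "shard_stats": { "total_count": 12 },  // formerly named "shards"
        "shards": { /* per-shard breakdown requested via level=shards */ }
      }
    }
  }
}
```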
Fixes a couple of erroneous references related to system indices in the snapshot restore tutorial:
* Calling the delete index API on `*` will only delete
some system indices, such as `.security`. It won't delete others, such as
`.geoip_databases`.
* Not all dot indices are system indices. Some are just hidden indices.
Relates to #76929
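As a concrete illustration of the corrected wording (a sketch, not taken from the tutorial):
```js
DELETE *
// Deletes regular and hidden indices plus some system indices (for example .security),
// but leaves others such as .geoip_databases in place.
```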
The composite aggregation is considered expensive. Users should perform load testing before deploying it in production.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
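For reference, a minimal composite aggregation of the kind this guidance applies to (index and field names are placeholders):
```js
GET my-index/_search
{
  "size": 0,
  "aggs": {
    "my_buckets": {
      "composite": {
        "size": 1000,
        "sources": [
          { "product": { "terms": { "field": "product" } } },
          { "date": { "date_histogram": { "field": "timestamp", "calendar_interval": "1d" } } }
        ]
      }
    }
  }
}
```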
The documentation indicates that `stack.templates.enabled` can be used in Elasticsearch Service, but it is not part of the settings allowlist in ESS. This PR makes the documentation match the state of the allowlist.
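For context, a sketch of how the setting is normally toggled on a self-managed cluster (on ESS it is not in the allowlist, which is the mismatch the docs now reflect):
```js
PUT _cluster/settings
{
  "persistent": {
    "stack.templates.enabled": false  // disables installation of the built-in index and component templates
  }
}
```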
This PR adds a MonitoringIndexTemplateRegistry to the monitoring plugin which automatically
installs all monitoring templates locally when the plugin is initialized. Exporters have been
updated to no longer attempt installation of the monitoring templates, and instead will wait for
the templates to become available before setting themselves as started. Some older
functionality related to templates has been removed as well, such as the expectation that
version 6 monitoring templates are installed, along with the setting that controls their installation
(xpack.monitoring.exporters.<EXPORTER>.index.template.create_legacy_templates).
This change removes several pieces of deprecated code from stored scripts.
Stored scripts/templates are no longer allowed to be empty and will throw an exception when used
with PutStoredScript.
ScriptMetadata will now drop any existing stored scripts that are empty, with a deprecation warning in
case they have not been previously removed.
The `code` field is now only allowed as `source` as part of a PutStoredScript JSON blob.
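A sketch of the accepted form after this change: the script body must be non-empty and supplied under `source` (script name and body are placeholders):
```js
PUT _scripts/my-stored-script
{
  "script": {
    "lang": "painless",
    // Must be non-empty; a "code" field is no longer accepted here
    "source": "Math.log(_score * 2) + params['my_modifier']"
  }
}
```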
As the script only has access to the nested document, this should be
documented.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
* [DOCS] Add Beats config example for ingest pipelines
The Elasticsearch ingest pipeline docs cover ingest pipelines for Fleet and
Elastic Agent. However, the docs don't cover Beats. This adds those docs.
Relates to https://github.com/elastic/beats/pull/28239.
* Update docs/reference/ingest.asciidoc
Co-authored-by: DeDe Morton <dede.morton@elastic.co>
The 'verbose' option to /_segments returns memory information
for each segment. However, Lucene 9 has stopped tracking this memory
information as it is largely held off-heap and so is no longer significant.
This commit deprecates the 'verbose' parameter and makes it a no-op.
Fixes #75955
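The now no-op parameter, for reference (index name is a placeholder):
```js
GET my-index/_segments?verbose=true
// Deprecated: still accepted for compatibility, but no longer returns per-segment memory details.
```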
This commit adds a new multi-bucket aggregation: `categorize_text`
The aggregation follows a similar design to significant text in that it reads from `_source`
and re-analyzes the text as it is read.
The key difference is that it does not use the indexed field's analyzer, but instead relies on
the `ml_standard` tokenizer with specialized ML token filters. The tokenizer + filters are the
same ones that machine learning categorization anomaly jobs use.
The high level logical flow is as follows:
- At each shard, read in the text field with a custom analyzer using the `ml_standard` tokenizer
- Read in the particular tokens from the analyzer
- Feed these tokens to a token tree algorithm (an adaptation of the drain categorization algorithm)
- Gather the individual log categories (the leaf nodes), sort them by doc_count, and ship those buckets to be merged
- Merge all buckets that have the EXACT same key
- Once all buckets are merged, pass those keys + counts to a new token tree for additional merging
- That tree builds the final buckets, and those are returned to the user
Algorithm explanation:
- Each log is parsed with the ml-standard tokenizer
- Each token is passed into a token tree
- For the first `max_match_token` tokens, each token is stored in the tree, and at `max_match_token+1` (or `len(tokens)`, whichever comes first) a log group is created
- If another log group already exists at that leaf, merge into it if the two have at least `similarity_threshold` percent of tokens in common
- Merging simply replaces tokens that differ within the group with `*`
- If a layer in the tree already has `max_unique_tokens` children, we add a `*` child and any new tokens are passed through it. The catch here is that on the final merge, we first attempt to merge together the subtrees with the smallest number of documents, especially if the new subtree has more documents counted.
## Aggregation configuration.
Here is an example on some openstack logs
```js
POST openstack/_search?size=0
{
"aggs": {
"categories": {
"categorize_text": {
"field": "message", // The field to categorize
"similarity_threshold": 20, // merge log groups if they are this similar
"max_unique_tokens": 20, // Max Number of children per token position
"max_match_token": 4, // Maximum tokens to build prefix trees
"size": 1
}
}
}
}
```
This will return buckets like
```json
"aggregations" : {
"categories" : {
"buckets" : [
{
"doc_count" : 806,
"key" : "nova-api.log.1.2017-05-16_13 INFO nova.osapi_compute.wsgi.server * HTTP/1.1 status len time"
}
]
}
}
```
The get SLM status API will only return one of three statuses: `RUNNING`, `STOPPING`, or `STOPPED`.
This corrects the docs to remove the `STARTED` status and document the `RUNNING` status.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
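For reference, the corrected status values come back from the status API like so (sketch):
```js
GET _slm/status
// "operation_mode" is one of RUNNING, STOPPING, or STOPPED
{
  "operation_mode": "RUNNING"
}
```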
In #77686 we added a service to clean up blob store
cache docs after a searchable snapshot is no longer
used. We noticed some situations where some cache
docs could still remain in the system index: when the
system index is not available when the searchable
snapshot index is deleted; when the system index is
restored from a backup or when the searchable
snapshot index was deleted on a version before #77686.
This commit introduces a maintenance task that
periodically scans and cleans up unused blob cache
docs. This task is scheduled to run every hour on the
data node that contains the blob store cache primary
shard. The periodic task works by using a point in
time context with search_after.
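The point-in-time plus search_after pattern the task relies on looks roughly like this (a generic sketch against a placeholder index, not the actual system-index queries):
```js
POST my-index/_pit?keep_alive=1m   // open a point in time and note the returned id

GET _search
{
  "size": 100,
  "query": { "match_all": {} },
  "pit": { "id": "<pit_id from the previous call>", "keep_alive": "1m" },
  "sort": [ { "_shard_doc": "asc" } ],
  "search_after": [ 12345 ]   // sort values from the last hit of the previous page
}
```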
For this grid type, the features on the aggregation layer are represented by a point that is computed from the
centroid of the data inside the cell.
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
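Assuming this refers to the search vector tile API's `grid_type` parameter, a sketch of requesting centroid-based grid features (index, field, and tile coordinates are placeholders):
```js
GET my-index/_mvt/location/10/163/395
{
  "grid_type": "centroid",   // each cell's feature is a point at the centroid of the data in the cell
  "grid_precision": 8
}
```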
Documents the `runs` keyword for running the same event criteria successively in a sequence query.
Relates to #75082.
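A sketch of the `runs` syntax in a sequence query (index and event criteria are illustrative):
```js
GET my-data-stream/_eql/search
{
  "query": """
    sequence by host.id
      [ process where process.name == "cmd.exe" ] with runs=3
      [ network where true ]
  """
}
```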
Documents `archived.*` persistent cluster settings and index settings.
These settings are commonly produced during a major version upgrade.
Closes #28027
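For example (a sketch of the documented cleanup step), archived persistent cluster settings can be removed with a wildcard once they're no longer needed:
```js
PUT _cluster/settings
{
  "persistent": {
    "archived.*": null   // drops all archived persistent cluster settings
  }
}
```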
* Add stubs for get API
* Add stub for post API
* Register new actions in ActionModule
* HLRC stubs
* Unit tests
* Add rest api spec and tests
* Add new action to non-operator actions list
This change removes JodaCompatibleZonedDateTime and replaces it with ZonedDateTime for use in
scripting.
Breaking changes:
* JodaCompatibleZonedDateTime no longer exists and cannot be cast to in Painless. Use ZonedDateTime
instead.
* The dayOfWeek method on ZonedDateTime returns the DayOfWeek enum instead of an int as on
JodaCompatibleZonedDateTime. dayOfWeekEnum still exists on ZonedDateTime as an augmentation to
support the transition to ZonedDateTime, but is now deprecated in favor of dayOfWeek on
ZonedDateTime.
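A small sketch of the replacement usage in Painless (index and field names are placeholders): date fields now expose a plain ZonedDateTime, so getDayOfWeek() returns the DayOfWeek enum:
```js
GET my-index/_search
{
  "script_fields": {
    "weekday": {
      "script": {
        "lang": "painless",
        // ZonedDateTime#getDayOfWeek() returns a java.time.DayOfWeek enum;
        // call getValue() if an int is needed, instead of relying on the old Joda-style int.
        "source": "doc['@timestamp'].value.getDayOfWeek().getValue()"
      }
    }
  }
}
```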
* [DOCS] Always enable file and native realms by default
Adds an 8.0 breaking change for PR #69096.
The copy is based on the 7.13 deprecation notice added with PR #69320.
* reword
* Update docs/reference/migration/migrate_8_0/security.asciidoc
Co-authored-by: Yang Wang <ywangd@gmail.com>
* Update docs/reference/migration/migrate_8_0/security.asciidoc
Co-authored-by: Yang Wang <ywangd@gmail.com>
* [ML] add documentation for get deployment stats API
* Apply suggestions from code review
Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
* Improve docs for pre-release version compatibility
Follow-up to #78317 clarifying a couple of points:
- a pre-release build can restore snapshots from released builds
- compatibility applies if at least one of the local or remote clusters
is a released build
* Remote cluster build date nit
Monitoring installs a number of ingest pipelines which have historically been used
to upgrade documents when mappings and document structures change between
versions. Since there aren't any changes to the document format, nor will there be
by the time the format is completely retired, we can comfortably remove these
pipelines.
Zero-Shot classification allows for text classification tasks without a pre-trained collection of target labels.
This is achieved through models trained on the Multi-Genre Natural Language Inference (MNLI) dataset. This dataset pairs text sequences with "entailment" clauses. An example could be:
"Throughout all of history, man kind has shown itself resourceful, yet astoundingly short-sighted" could have been paired with the entailment clauses: ["This example is history", "This example is sociology"...].
This training set combined with the attention and semantic knowledge in modern day NLP models (BERT, BART, etc.) affords a powerful tool for ad-hoc text classification.
See https://arxiv.org/abs/1909.00161 for a deeper explanation of the MNLI training and how zero-shot works.
The zero-shot classification task is configured as follows:
```js
{
// <snip> model configuration </snip>
"inference_config" : {
"zero_shot_classification": {
"classification_labels": ["entailment", "neutral", "contradiction"], // <1>
"labels": ["sad", "glad", "mad", "rad"], // <2>
"multi_label": false, // <3>
"hypothesis_template": "This example is {}.", // <4>
"tokenization": { /*<snip> tokenization configuration </snip>*/}
}
}
}
```
* <1> For all zero_shot models, three particular labels are returned when classifying the target sequence. "entailment" is the positive case, "neutral" is the case where the sequence is neither positive nor negative, and "contradiction" is the negative case
* <2> This optional parameter provides the default labels that zero_shot classification will attempt to classify against
* <3> When returning the probabilities, should the results assume there is only one true label or that multiple true labels are possible
* <4> The hypothesis template used when tokenizing the labels. When combined with `sad`, the sequence looks like `This example is sad.`
For inference in a pipeline one may provide label updates:
```js
{
//<snip> pipeline definition </snip>
"processors": [
//<snip> other processors </snip>
{
"inference": {
// <snip> general configuration </snip>
"inference_config": {
"zero_shot_classification": {
"labels": ["humanities", "science", "mathematics", "technology"], // <1>
"multi_label": true // <2>
}
}
}
}
//<snip> other processors </snip>
]
}
```
* <1> The `labels` we care about; these replace the default ones if they exist.
* <2> Whether the results should allow multiple true labels
Similarly, one may provide label changes against the `_infer` endpoint:
```js
{
"docs":[{ "text_field": "This is a very happy person"}],
"inference_config":{"zero_shot_classification":{"labels": ["glad", "sad", "bad", "rad"], "multi_label": false}}
}
```