
9140 commits

Author SHA1 Message Date
James Rodewig
a763a86a0d
[DOCS] Update ingest node pipeline refs (#78770)
In https://github.com/elastic/kibana/pull/113783, we renamed Kibana's **Ingest Node Pipelines** feature to **Ingest Pipelines**. This updates screenshots and references for the feature. It also replaces a few remaining `ingest node pipeline` references.
2021-10-12 08:18:24 -04:00
Hendrik Muhs
df32157a99
[Transform][DOCS] remove 7.x related limitations (#78975)
remove 7.x related limitations from limitations documentation for 8.x
2021-10-12 14:01:25 +02:00
Przemyslaw Gomulka
f5e4228bb3
Setting to disable x-opaque-id in logs throttling (#78911)
Introduces a setting, cluster.deprecation_indexing.x_opaque_id_used.enabled, to disable the use of
x-opaque-id in RateLimitingFilter. This will be used for deprecation
log indexing and will not affect logging to files (which uses a different
instance of RateLimitingFilter with this flag enabled by default).

Changes the indices backing a deprecation log data stream to be hidden.

Refactors DeprecationHttpIT to be more reliable

relates #76292
closes #77936
2021-10-12 12:55:28 +02:00
Andrei Stefan
47852146c1
Mention scoring characteristics (#78965) 2021-10-12 12:11:57 +03:00
Roberto Seldner
9a9d209df6
Index prefixes for searchable snapshots (#78474)
* Index prefixes for searchable snapshots

Added a note about how ILM-managed indices are prefixed with "restored-" or "partial-" when they are fully or partially mounted as searchable snapshots

* Apply suggestions from code review

Co-authored-by: debadair <debadair@elastic.co>
2021-10-11 17:08:09 -07:00
Adam Locke
2d433169e4
[DOCS] Clarify cold tier functionality (#78933) 2021-10-11 16:24:21 -04:00
James Rodewig
e7ab7c82a7
[DOCS] Update runs syntax (#78922)
Updates the EQL syntax docs for PR #78895.
2021-10-11 10:40:10 -04:00
xiaoping
7e08c6b98a
Data stream support read and write with custom routing and partition size (#74394) 2021-10-11 07:14:15 -05:00
Lisa Cawley
3d6074b76e
[DOCS] Fixes typo in calendar API example (#78867) 2021-10-07 17:51:14 -07:00
Lisa Cawley
df5dde5b3c
[DOCS] Fixes ML get calendars API (#78808) 2021-10-07 12:22:11 -07:00
William Brafford
0a28c7cb91
Implement GET API for System Feature Upgrades (#78642)
* Implement and test get feature upgrade status API
* Add integration test for feature upgrade endpoint
* Use constant enum for statuses
* Add unit tests for transport class methods
2021-10-07 15:18:47 -04:00
Lee Hinman
6e875d0fa9
Add node REPLACE shutdown implementation (#76247)
* WIP, basic implementation

* Pull `if` branch into a variable

* Remove outdated javadoc

* Remove map iteration, use target name instead of id (whoops)

* Remove streaming from isReplacementSource

* Simplify getReplacementName

* Only calculate node shutdowns if canRemain==false and forceMove==false

* Move canRebalance comment in BalancedShardsAllocator

* Rename canForceDuringVacate -> canForceAllocateDuringReplace

* Add comment to AwarenessAllocationDecider.canForceAllocateDuringReplace

* Revert changes to ClusterRebalanceAllocationDecider

* Change "no replacement" decision message in NodeReplacementAllocationDecider

* Only construct shutdown map once in isReplacementSource

* Make node shutdowns and target shutdowns available within RoutingAllocation

* Add randomization for adding the filter that is overridden in test

* Add integration test with replicas: 1

* Go nuts with the verbosity of allocation decisions

* Also check NODE_C in unit test

* Test with randomly assigned shard

* Fix test for extra verbose decision messages

* Remove canAllocate(IndexMetadata, RoutingNode, RoutingAllocation) overriding

* Spotless :|

* Implement 100% disk usage check during force-replace-allocate

* Add rudimentary documentation for "replace" shutdown type

* Use RoutingAllocation shutdown map in BalancedShardsAllocator

* Add canForceAllocateDuringReplace to AllocationDeciders & add test

* Switch from percentage to bytes in DiskThresholdDecider force check

* Enhance docs with note about rollover, creation, & shrink

* Clarify decision messages, add test for target-only allocation

* Simplify NodeReplacementAllocationDecider.replacementOngoing

* Start nodeC before nodeB in integration test

* Spotleeeessssssss! You get me every time!

* Remove outdated comment
2021-10-07 12:07:46 -04:00
Lisa Cawley
bcd75c3203
[DOCS] Fixes ML get scheduled events API (#78809) 2021-10-07 08:34:58 -07:00
Keith Massey
4df15f5177
Changing name of shards field in node/stats api to shard_stats (#78531)
If the _nodes/stats API received a level=shards request parameter, the response contained two "shards" fields,
which caused problems for JSON parsers. This commit renames the "shards" field that currently contains only
"total_count" to "shard_stats".
Relates #78311 #75433
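
A minimal sketch of why two fields with the same name are a problem for JSON parsers. The response bodies below are hypothetical and heavily abbreviated (the `indices` wrapper and `my-index` are assumptions for illustration, not the real API output). Python's `json` module silently keeps only the last value for a repeated name, so the `total_count` data is lost without any error:

```python
import json


def find_duplicate_keys(raw: str) -> list:
    """Return every object key that appears more than once at the same level."""
    duplicates = []

    def hook(pairs):
        seen = set()
        for key, _ in pairs:
            if key in seen:
                duplicates.append(key)
            seen.add(key)
        return dict(pairs)

    json.loads(raw, object_pairs_hook=hook)
    return duplicates


# Hypothetical, abbreviated bodies: not the actual _nodes/stats output.
before = '{"indices": {"shards": {"total_count": 3}, "shards": {"my-index": []}}}'
after = '{"indices": {"shard_stats": {"total_count": 3}, "shards": {"my-index": []}}}'

# With the default parser, the last "shards" value wins and total_count disappears.
parsed_before = json.loads(before)
parsed_after = json.loads(after)
```

Here `find_duplicate_keys(before)` reports the repeated `shards` name, while the renamed body parses with both fields intact.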
2021-10-06 17:19:04 -05:00
James Rodewig
7e5e05540f
[DOCS] Fix system index refs in restore tutorial (#78582)
Fixes a couple of erroneous references related to system indices in the snapshot restore tutorial:

* Calling the delete index API on `*` will only delete
  some system indices, such as `.security`. It won't delete others, such as
  `.geoip_databases`.

* Not all dot indices are system indices. Some are just hidden indices.

Relates to #76929
2021-10-06 17:55:11 -04:00
debadair
248b2293f9
[DOCS] Add info about FIPS and Java 17 (#78580)
* [DOCS] Updated breaking changes entry for Java 11.
2021-10-06 11:54:48 -07:00
Stef Nestor
ddc1a0df28
[DOCS] Add prod warning to composite agg (#78723)
The composite aggregation is considered expensive. Users should perform load testing before deploying it in production.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-10-06 13:44:12 -04:00
Samuel Nelson
c4f5d41fe7
[DOCS] Update ESS support for stack.templates.enabled (#78732)
The documentation indicates that `stack.templates.enabled` can be used in Elasticsearch Service, but it is not part of the settings allowlist in ESS. This PR makes the documentation match the state of the allowlist.
2021-10-06 09:37:30 -04:00
James Rodewig
dbb8a015ad
[DOCS] Fix typos in flattened field type docs 2021-10-05 14:15:07 -04:00
Bo Andersen
609a7321b2
[DOCS] Added missing backtick for code snippet (#78241) 2021-10-05 14:10:08 -04:00
James Baiera
aa3d5109b1
Automatically install monitoring templates at plugin initialization (#78350)
This PR adds a MonitoringIndexTemplateRegistry to the monitoring plugin which automatically 
installs all monitoring templates locally when the plugin is initialized. Exporters have been 
updated to no longer attempt installation of the monitoring templates, and instead will wait for 
the templates to become available before setting themselves as started. Some older 
functionality related to templates has been removed as well, such as the expectation that 
version 6 monitoring templates are installed, as well as the setting that controls their installation
(xpack.monitoring.exporters.<EXPORTER>.index.template.create_legacy_templates).
2021-10-05 14:05:20 -04:00
Jack Conradson
2cf160f2c0
Remove deprecated code from stored scripts (#78643)
This change removes several pieces of deprecated code from stored scripts.

Stored scripts/templates are no longer allowed to be empty and will throw an exception when used 
with PutStoredScript.

ScriptMetadata will now drop any existing stored scripts that are empty with a deprecation warning in 
the case they have not been previously removed.

The code field is now only allowed as source as part of a PutStoredScript JSON blob.
2021-10-05 10:41:39 -07:00
Alexander Reelsen
19d12f19f5
[DOCS] Add script note to nested query docs (#77431)
As the script has only access to the nested document, this should be
documented.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-10-05 10:32:20 -04:00
James Rodewig
a56065dc9f
[DOCS] Fix rollover API response body heading 2021-10-05 09:16:32 -04:00
James Rodewig
2893ea911b
[DOCS] Remove duplicate line from migration guide (#78688) 2021-10-05 08:52:11 -04:00
István Zoltán Szabó
1971bd4591
[DOCS] Adds Transform alerts docs (#78185) 2021-10-05 14:06:48 +02:00
James Rodewig
5c7fac77b3
[DOCS] Add Beats config example for ingest pipelines (#78633)
* [DOCS] Add Beats config example for ingest pipelines

The Elasticsearch ingest pipeline docs cover ingest pipelines for Fleet and
Elastic Agent. However, the docs don't cover Beats. This adds those docs.

Relates to https://github.com/elastic/beats/pull/28239.

* Update docs/reference/ingest.asciidoc

Co-authored-by: DeDe Morton <dede.morton@elastic.co>
2021-10-05 05:47:50 -04:00
Alan Woodward
2de2bef4de
Remove indices_segments 'verbose' parameter (#78451)
The 'verbose' option to /_segments returns memory information
for each segment. However, lucene 9 has stopped tracking this memory
information as it is largely held off-heap and so is no longer significant.

This commit deprecates the 'verbose' parameter and makes it a no-op.

Fixes #75955
2021-10-05 09:17:16 +01:00
Ignacio Vera
920b3b52c2
Add support for metrics aggregations to mvt end point (#78614)
It adds support for several aggregations.
2021-10-05 09:17:25 +02:00
James Rodewig
fd30c6daf8
Add reference to PHP client on Bulk API page (#78558) (#78651)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: Christian Fratta <christian.fratta@gmail.com>
2021-10-04 17:42:42 -04:00
Joe Gallo
4a14f2f6f9
Validate that snapshot repository exists for ILM policies at creation/update time (#78468) 2021-10-04 15:19:10 -04:00
Benjamin Trent
7a7fffcb5a
[ML] Text/Log categorization multi-bucket aggregation (#71752)
This commit adds a new multi-bucket aggregation: `categorize_text`

The aggregation follows a similar design to significant text in that it reads from `_source`
and re-analyzes the text as it is read. 

The key difference is that it does not use the indexed field's analyzer, but instead relies on 
the `ml_standard` tokenizer with specialized ML token filters. The tokenizer and filters are the
same ones that machine learning categorization anomaly jobs use.

The high level logical flow is as follows:
 - At each shard, read in the text field with a custom analyzer using the `ml_standard` tokenizer
 - Read in the particular tokens from the analyzer
 - Feed these tokens to a token tree algorithm (an adaptation of the Drain categorization algorithm)
 - Gather the individual log categories (the leaf nodes), sort them by doc_count, and ship those buckets to be merged
 - Merge all buckets that have the EXACT same key
 - Once all buckets are merged, pass those keys + counts to a new token tree for additional merging
 - That tree builds the final buckets, which are returned to the user

Algorithm explanation:

 - Each log is parsed with the `ml_standard` tokenizer
 - Each token is passed into a token tree
 - The first `max_match_token` tokens are each stored in the tree, and at `max_match_token+1` (or `len(tokens)`) a log group is created
 - If another log group exists at that leaf, merge it if they have `similarity_threshold` percentage of tokens in common
     - Merging simply replaces tokens that are different in the group with `*`
 - If a layer in the tree has `max_unique_tokens` children, we add a `*` child and any new tokens are passed through it. The catch is that on the final merge, we first attempt to merge together subtrees with the smallest number of documents, especially if the new subtree has more documents counted.
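
The merge step described above can be sketched roughly as follows. This is an illustration only, not the aggregation's actual implementation: it assumes similarity is measured position by position over token lists of equal length and that a group merges when the percentage meets or exceeds the threshold. The function names are hypothetical.

```python
from typing import List, Optional

WILDCARD = "*"


def similarity(group: List[str], tokens: List[str]) -> float:
    """Percentage of positions at which two equal-length token lists agree."""
    if len(group) != len(tokens) or not group:
        return 0.0
    matches = sum(1 for a, b in zip(group, tokens) if a == b)
    return 100.0 * matches / len(group)


def merge_tokens(group: List[str], tokens: List[str]) -> List[str]:
    """Replace every differing position with the wildcard token."""
    return [a if a == b else WILDCARD for a, b in zip(group, tokens)]


def try_merge(group: List[str], tokens: List[str],
              similarity_threshold: float) -> Optional[List[str]]:
    """Return the merged group, or None when the lists are not similar enough.

    Assumption: "at least similarity_threshold percent" is the merge condition.
    """
    if similarity(group, tokens) >= similarity_threshold:
        return merge_tokens(group, tokens)
    return None


existing = ["INFO", "server", "GET", "/a", "200"]
incoming = ["INFO", "server", "GET", "/b", "200"]
merged = try_merge(existing, incoming, similarity_threshold=50)
```

With these inputs, four of five positions match, so the two logs merge into a single group whose fourth token becomes `*`. Raising the threshold above 80 would keep them as separate groups.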

## Aggregation configuration.

Here is an example on some openstack logs
```js
POST openstack/_search?size=0
{
  "aggs": {
    "categories": {
      "categorize_text": {
        "field": "message", // The field to categorize
        "similarity_threshold": 20, // merge log groups if they are this similar
        "max_unique_tokens": 20, // Max Number of children per token position
        "max_match_token": 4, // Maximum tokens to build prefix trees
        "size": 1
      }
    }
  }
}
```

This will return buckets like
```json
"aggregations" : {
    "categories" : {
      "buckets" : [
        {
          "doc_count" : 806,
          "key" : "nova-api.log.1.2017-05-16_13 INFO nova.osapi_compute.wsgi.server * HTTP/1.1 status len time"
        }
      ]
    }
  }
```
2021-10-04 11:49:16 -04:00
Stef Nestor
e0cb0beb73
[DOCS] Fix SLM status response (#78584)
The get SLM status API will only return one of three statuses: `RUNNING`, `STOPPING`, or `STOPPED`.

This corrects the docs to remove the `STARTED` status and document the `RUNNING` status.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-10-04 09:41:17 -04:00
Tanguy Leroux
63d663e220
Add periodic maintenance task to clean up unused blob store cache docs (#78438)
In #77686 we added a service to clean up blob store 
cache docs after a searchable snapshot is no longer 
used. We noticed some situations where some cache 
docs could still remain in the system index: when the 
system index is not available when the searchable 
snapshot index is deleted; when the system index is 
restored from a backup; or when the searchable 
snapshot index was deleted on a version before #77686.

This commit introduces a maintenance task that 
periodically scans and cleans up unused blob cache 
docs. This task is scheduled to run every hour on the 
data node that contains the blob store cache primary 
shard. The periodic task works by using a point in 
time context with search_after.
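
As a rough illustration of the scan pattern, the sketch below pages through a frozen, sorted view of documents using a `search_after` cursor. Everything here is a hypothetical in-memory stand-in: `search_page`, the `id` and `snapshot_id` fields, and the page size are assumptions for illustration and do not reflect the real task or the Elasticsearch client API.

```python
from typing import Dict, List, Optional, Set


def search_page(view: List[Dict], size: int,
                search_after: Optional[int] = None) -> List[Dict]:
    """Hypothetical stand-in for one sorted search over a point-in-time view."""
    ordered = sorted(view, key=lambda doc: doc["id"])
    if search_after is not None:
        ordered = [doc for doc in ordered if doc["id"] > search_after]
    return ordered[:size]


def find_unused_cache_docs(view: List[Dict], live_snapshot_ids: Set[str],
                           page_size: int = 2) -> List[int]:
    """Walk the whole view page by page and collect docs whose snapshot is gone."""
    unused = []
    search_after = None
    while True:
        page = search_page(view, page_size, search_after)
        if not page:
            break
        unused.extend(doc["id"] for doc in page
                      if doc["snapshot_id"] not in live_snapshot_ids)
        # Resume after the sort value of the last hit on this page.
        search_after = page[-1]["id"]
    return unused


cache_docs = [
    {"id": 1, "snapshot_id": "a"},
    {"id": 2, "snapshot_id": "b"},
    {"id": 3, "snapshot_id": "a"},
    {"id": 4, "snapshot_id": "c"},
    {"id": 5, "snapshot_id": "b"},
]
stale = find_unused_cache_docs(cache_docs, live_snapshot_ids={"a"})
```

Because the view is fixed for the duration of the scan, each page continues exactly where the previous one ended, so no document is visited twice or skipped.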
2021-10-04 13:15:56 +02:00
James Rodewig
9e0299f551
[DOCS] Troubleshoot the flood-stage watermark error (#78519)
Adds troubleshooting steps for the flood-stage watermark error.

Closes #77906.
2021-10-01 08:32:53 -04:00
Ignacio Vera
e4cde37111
Add centroid grid type in mvt request (#78305)
For this grid type, the features on the aggregation layer are represented by a point that is computed from the 
centroid of the data inside the cell.

Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-10-01 06:56:13 +02:00
James Rodewig
c33e340a47
[DOCS] EQL: Document runs keyword (#78478) (#78518)
Documents the `runs` keyword for running the same event criteria successively in a sequence query.

Relates to #75082.

# Conflicts:
#	docs/reference/release-notes/highlights.asciidoc
2021-09-30 10:23:14 -04:00
Yannick Welsch
3dac76c190
Disk usage API does not support timeout parameters (#78503)
Fixes the documentation to state that the disk usage API does not support timeout parameters.

Closes #78356
2021-09-30 16:08:00 +02:00
James Rodewig
12019a89fd
[DOCS] Document archived settings (#78351)
Documents `archived.*` persistent cluster settings and index settings.
These settings are commonly produced during a major version upgrade.

Closes #28027
2021-09-30 09:27:53 -04:00
debadair
7431a9656e
[DOCS] Fix erroneous page break. (#78487) 2021-09-29 15:12:13 -07:00
William Brafford
8c2fe902f3
Feature upgrade rest stubs (#77827)
* Add stubs for get API
* Add stub for post API
* Register new actions in ActionModule
* HLRC stubs
* Unit tests
* Add rest api spec and tests
* Add new action to non-operator actions list
2021-09-29 16:25:15 -04:00
Jack Conradson
086ba1aefb
Remove JodaCompatibleZonedDateTime (#78417)
This change removes JodaCompatibleZonedDateTime and replaces it with ZonedDateTime for use in 
scripting.

Breaking changes:
* JodaCompatibleZonedDateTime no longer exists and cannot be cast to in Painless. Use ZonedDateTime 
instead.
* The dayOfWeek method on ZonedDateTime returns the DayOfWeek enum instead of an int as it did on 
JodaCompatibleZonedDateTime. dayOfWeekEnum still exists on ZonedDateTime as an augmentation to 
support the transition to ZonedDateTime, but is now deprecated in favor of dayOfWeek on 
ZonedDateTime.
2021-09-29 13:01:40 -07:00
Benjamin Trent
498e6e3d0f
[ML] adding docs for estimated heap and operations (#78376)
Add docs for optionally supplying memory and operation estimates in put model
2021-09-29 09:11:42 -04:00
James Rodewig
4544ab2dbb
[DOCS] Always enable file and native realms unless explicitly disabled (#78405)
* [DOCS] Always enable file and native realms by default

Adds an 8.0 breaking change for PR #69096.

The copy is based on the 7.13 deprecation notice added with PR #69320.

* reword

* Update docs/reference/migration/migrate_8_0/security.asciidoc

Co-authored-by: Yang Wang <ywangd@gmail.com>

* Update docs/reference/migration/migrate_8_0/security.asciidoc

Co-authored-by: Yang Wang <ywangd@gmail.com>
2021-09-29 09:10:30 -04:00
James Rodewig
f4b5ef7416
[DOCS] Remove include_type_name query parameter (#78394)
Adds an 8.0 breaking change for PR #48632.
2021-09-29 09:00:15 -04:00
Benjamin Trent
b96d929af3
[ML] add documentation for get deployment stats API (#78412)
* [ML] add documentation for get deployment stats API

* Apply suggestions from code review

Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co>
2021-09-29 07:20:25 -04:00
David Turner
07a2acac93
Improve docs for pre-release version compatibility (#78428)
* Improve docs for pre-release version compatibility

Follow-up to #78317 clarifying a couple of points:

- a pre-release build can restore snapshots from released builds
- compatibility applies if at least one of the local or remote cluster
  is a released build

* Remote cluster build date nit
2021-09-29 04:49:07 -04:00
James Baiera
eafbd336c2
Remove Monitoring ingest pipelines (#77459)
Monitoring installs a number of ingest pipelines which have been historically used
to upgrade documents when mappings and document structures change between 
versions. Since there aren't any changes to the document format, nor will there be 
by the time the format is completely retired, we can comfortably remove these 
pipelines.
2021-09-28 16:10:02 -04:00
James Rodewig
58595e7af5
[DOCS] Searches on the _type field are no longer supported (#78400)
Adds an 8.0 breaking change for PR #68564
2021-09-28 14:51:45 -04:00
Benjamin Trent
408489310c
[ML] add zero_shot_classification task for BERT nlp models (#77799)
Zero-Shot classification allows for text classification tasks without a pre-trained collection of target labels.

This is achieved through models trained on the Multi-Genre Natural Language Inference (MNLI) dataset. This dataset pairs text sequences with "entailment" clauses. An example could be:

"Throughout all of history, mankind has shown itself resourceful, yet astoundingly short-sighted" could have been paired with the entailment clauses: ["This example is history", "This example is sociology"...]. 

This training set combined with the attention and semantic knowledge in modern day NLP models (BERT, BART, etc.) affords a powerful tool for ad-hoc text classification.

See https://arxiv.org/abs/1909.00161 for a deeper explanation of the MNLI training and how zero-shot works. 

The zero-shot classification task is configured as follows:
```js
{
   // <snip> model configuration </snip>
  "inference_config" : {
    "zero_shot_classification": {
      "classification_labels": ["entailment", "neutral", "contradiction"], // <1>
      "labels": ["sad", "glad", "mad", "rad"], // <2>
      "multi_label": false, // <3>
      "hypothesis_template": "This example is {}.", // <4>
      "tokenization": { /*<snip> tokenization configuration </snip>*/}
    }
  }
}
```
* <1> All zero_shot models return 3 particular labels when classifying the target sequence. "entailment" is the positive case, "neutral" is the case where the sequence isn't positive or negative, and "contradiction" is the negative case.
* <2> An optional parameter providing the default zero_shot labels to attempt to classify.
* <3> When returning the probabilities, whether the results should assume there is only one true label or multiple true labels.
* <4> The hypothesis template used when tokenizing the labels. When combined with `sad`, the sequence looks like `This example is sad.`
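
To make callout <4> concrete, here is a small sketch of how a template and a list of labels could be expanded into hypothesis sentences and paired with the input text. The function names are hypothetical, and pairing each hypothesis with the sequence is an assumption based on how MNLI-style models are generally used, not a description of this commit's code.

```python
from typing import List, Tuple


def build_hypotheses(labels: List[str],
                     hypothesis_template: str = "This example is {}.") -> List[str]:
    """Fill the `{}` placeholder in the template with each candidate label."""
    return [hypothesis_template.format(label) for label in labels]


def build_entailment_pairs(sequence: str, labels: List[str],
                           hypothesis_template: str = "This example is {}."
                           ) -> List[Tuple[str, str]]:
    """Pair the input sequence with one hypothesis per label (assumed model input)."""
    return [(sequence, hypothesis)
            for hypothesis in build_hypotheses(labels, hypothesis_template)]


pairs = build_entailment_pairs(
    "This is a very happy person",
    ["sad", "glad", "mad", "rad"],
)
```

Each pair would then be scored for entailment, neutral, or contradiction. Note that this sketch relies on Python's `str.format`, so a template containing other literal braces would need escaping.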

For inference in a pipeline one may provide label updates:
```js
{
  //<snip> pipeline definition </snip>
  "processors": [
    //<snip> other processors </snip>
    {
      "inference": {
        // <snip> general configuration </snip>
        "inference_config": {
          "zero_shot_classification": {
             "labels": ["humanities", "science", "mathematics", "technology"], // <1>
             "multi_label": true // <2>
          }
        }
      }
    }
    //<snip> other processors </snip>
  ]
}
```
* <1> The `labels` we care about; these replace the default ones if they exist. 
* <2> Whether the results should allow multiple true labels.

Similarly, one may provide label changes against the `_infer` endpoint:
```js
{
   "docs":[{ "text_field": "This is a very happy person"}],
   "inference_config":{"zero_shot_classification":{"labels": ["glad", "sad", "bad", "rad"], "multi_label": false}}
}
```
2021-09-28 09:38:23 -04:00