
522 commits

Author SHA1 Message Date
István Zoltán Szabó
70a012b0c7
[DOCS] Fixes section IDs in start/stop trained model deployment APIs. (#77247) 2021-09-03 14:24:37 +02:00
Lisa Cawley
007469af63
[DOCS] Replaces index pattern in ML docs (#77041) 2021-09-01 10:26:06 -07:00
Benjamin Trent
0e1efa6533
[ML] generalize pytorch sentiment analysis to text classification (#77084)
* [ML] generalize pytorch sentiment analysis to text classification

* Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/inference/trainedmodel/TextClassificationConfig.java
2021-09-01 08:45:13 -04:00
István Zoltán Szabó
ea007902ef
[DOCS] Adds anomaly job health alert type docs (#76659)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2021-08-30 16:11:34 +02:00
Lisa Cawley
d36f24fbc3
[DOCS] Update datafeed details in ML docs (#76854) 2021-08-25 11:35:21 -07:00
István Zoltán Szabó
789368b38f
[DOCS] Fixes a syntax error in datafeed runtime field example. (#76917) 2021-08-25 12:04:32 +02:00
István Zoltán Szabó
8aed99fc02
[DOCS] Adds links that point to loss function to ML API docs. (#76438) 2021-08-23 13:09:37 +02:00
István Zoltán Szabó
7faec52a1e
[DOCS] Fixes model_prune_window property description. (#76711) 2021-08-19 16:16:37 +02:00
István Zoltán Szabó
b9d875bf68
[DOCS] Updates description of model_prune_window property in ML shared (#76487) 2021-08-13 12:18:38 +02:00
István Zoltán Szabó
9b0417f2df
[DOCS] Comments out links that point to regression loss functions (#76435)
* [DOCS] Comments out links that point to regression loss functions.

* Update docs/reference/ml/df-analytics/apis/get-trained-models.asciidoc
2021-08-12 18:33:42 +02:00
David Roberts
7ac5ea39df
[ML] Use results retention time for deleting system annotations (#76096)
In #75617 a new setting, system_annotations_retention_days, was
added to control how long system annotations are retained for.
We now feel that this setting is redundant and that system
annotations should be retained for the same period as results.
This is intuitive and defensible, as system annotations can be
considered a type of result.

Followup to #75617
2021-08-04 17:42:31 +01:00
David Roberts
10a1d27c7b
[ML] Deleting a job now deletes the datafeed if necessary (#76010)
Previously attempting to delete a job that had a datafeed
would return an exception. However, this was unnecessarily
pedantic - the user would always want to delete both job
and datafeed together, and would react by deleting the
datafeed and then subsequently deleting the job again.

This change makes the delete job API automatically delete
a datafeed associated with the job. The same level of
force is used for this delete datafeed request as was used
on the delete job request. This means that it's possible
to force-delete an open job with a started datafeed (since
force-delete datafeed will automatically stop a started
datafeed). It's still not possible to delete an opened job
without using force.
2021-08-03 17:22:06 +01:00
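The force rules described in #76010 can be modelled as a small planning function. This is only a sketch of the behaviour stated in the commit message, not the Elasticsearch implementation; `plan_delete_job`, its arguments, and the step names are hypothetical.

```python
def plan_delete_job(job_id, job_opened, datafeed_id=None,
                    datafeed_started=False, force=False):
    """Return the ordered steps implied by a single delete job request.

    Each step is a tuple (action, resource_id, force). All names here are
    illustrative and do not come from the Elasticsearch code base.
    """
    if job_opened and not force:
        # It is still not possible to delete an opened job without force.
        raise ValueError(f"job [{job_id}] is opened; force is required")
    if datafeed_started and not force:
        # Only a force-delete stops a started datafeed automatically.
        raise ValueError(f"datafeed [{datafeed_id}] is started; force is required")

    steps = []
    if datafeed_id is not None:
        if datafeed_started:
            steps.append(("stop_datafeed", datafeed_id, force))
        # The datafeed is deleted with the same level of force as the job.
        steps.append(("delete_datafeed", datafeed_id, force))
    steps.append(("delete_job", job_id, force))
    return steps
```

For a closed job with a stopped datafeed the plan is two plain deletes; for an opened job with a started datafeed the same request only succeeds with `force=True`, and the datafeed is stopped first.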
James Rodewig
fc0ac1923d
[DOCS] Correct spelling for geo terms (#76028)
Changes:
* Use "geopoint" when not referring to the literal field type
* Use "geoshape" when not referring to the literal field type or query type
* Use "GeoJSON" consistently
2021-08-03 09:55:48 -04:00
Ed Savage
5651215be1
[ML] Add 'model_prune_window' field to AD job config (#75741)
Add configuration for pruning dead split fields in anomaly detection
jobs via the `model_prune_window` field for both the job creation and
update APIs.

Relates to ml-cpp/#1962
2021-08-03 09:16:43 +01:00
István Zoltán Szabó
ce537a33b6
[DOCS] Adds link that points to outlier detection example to GET DFA stats API docs. (#75689) 2021-08-02 18:10:03 +02:00
István Zoltán Szabó
8d4fb3aa84
[DOCS] Changes link to outlier detection docs in PUTDFA API docs. (#75933) 2021-08-02 13:45:37 +02:00
Przemysław Witek
30d9f13436
[ML] Delete expired annotations (#75617) 2021-07-29 15:27:03 +02:00
Lisa Cawley
c1ba949aee
[DOCS] Fixes bulleted list in ML aggregations (#75806) 2021-07-28 11:29:48 -07:00
Lisa Cawley
02d851e50e
[DOCS] Drafts trained model deployment APIs (#75497) 2021-07-26 09:49:37 -07:00
István Zoltán Szabó
7e7a386078
[DOCS] Comments out link that points to outlier detection example (#75687) 2021-07-26 16:36:57 +02:00
Lisa Cawley
70b870ee7f
[DOCS] Fixes nesting of datafeed config in APIs (#75502) 2021-07-20 11:27:15 -07:00
István Zoltán Szabó
9ef156df9f
[DOCS] Adds peak_model_bytes and assignment_memory_basis to GET model snapshot API docs (#75413) 2021-07-16 17:12:47 +02:00
Lisa Cawley
c8c7f0ef52
[DOCS] Anomaly detection: Visualize delayed data (#75098) 2021-07-13 18:06:07 -07:00
Lisa Cawley
3c76bcb3a5
[DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
István Zoltán Szabó
6a4de77e11
[DOCS] Adds classification and regression links back to DFA docs. (#74930) 2021-07-08 16:37:16 +02:00
István Zoltán Szabó
841cfb9214
[DOCS] Adds outlier detection links to DFA API docs (#74748) 2021-07-06 15:10:41 +02:00
Lisa Cawley
b71b7d0866
[DOCS] Fix links to anomaly detection overview (#74943) 2021-07-05 13:19:54 -07:00
Lisa Cawley
4c85852cc7
[DOCS] Update forecasting links in ML APIs (#74942) 2021-07-05 12:34:03 -07:00
Lisa Cawley
5bcd318e29
[DOCS] Move ML functions to appendix (#74802) 2021-07-05 11:53:17 -07:00
István Zoltán Szabó
483d145f78
[DOCS] Fixes an attribute in PUT DFA API docs. (#74931) 2021-07-05 17:08:11 +02:00
István Zoltán Szabó
6c6e6874ff
[DOCS] Removes link to classification and regression. (#74926) 2021-07-05 16:28:14 +02:00
István Zoltán Szabó
a4f9f4fae1
[DOCS] Comments out links to outlier detection. (#74745) 2021-06-30 14:24:34 +02:00
Lisa Cawley
64af39b759
[DOCS] Add memory limit details in update job API (#74517)
Co-authored-by: David Roberts <dave.roberts@elastic.co>
2021-06-24 08:50:19 -07:00
Benjamin Trent
0303e6d733
[ML] add datafeed field to the job config (#74265)
This is a quality of life improvement for typical users. Almost all anomaly jobs will receive their data through a datafeed.

The datafeed config can now be supplied and is available in the datafeed field in the job config for creation and getting jobs.
2021-06-23 08:06:58 -04:00
David Roberts
6e9b959450
[ML] Closing an anomaly detection job now automatically stops its datafeed if necessary (#74257)
Previously the close job API required that, if the job had an
associated datafeed, the datafeed was stopped before the job
could be closed. Experience has shown that this
is just a pedantic nuisance. If a user closes the job without
first stopping the datafeed then it's just a mistake, and they
then have to make two further calls, to stop the datafeed and
then attempt to close the job again.

This PR changes the behaviour so that if you ask to close a job
whose datafeed is running then the datafeed gets stopped first
as part of the same call. Datafeeds are stopped with the same
level of force as the job close request specified.
2021-06-22 12:56:11 +01:00
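From a client's point of view, the change in #74257 reduces the number of calls needed to close a job whose datafeed is running. The sketch below lists the requests a client would issue before and after the change; `close_job_calls` and the `auto_stop` flag are hypothetical names used only to contrast the two behaviours.

```python
def close_job_calls(job_id, datafeed_id, datafeed_started,
                    auto_stop=True, force=False):
    """Return the REST calls a client needs to close a job.

    auto_stop=False models the old behaviour, where the client had to stop
    the datafeed itself; auto_stop=True models the behaviour after #74257.
    """
    query = "?force=true" if force else ""
    calls = []
    if datafeed_started and not auto_stop:
        # Old behaviour: the datafeed must be stopped explicitly first.
        calls.append(f"POST _ml/datafeeds/{datafeed_id}/_stop{query}")
    # The close request carries the level of force that is also applied
    # to the automatic datafeed stop.
    calls.append(f"POST _ml/anomaly_detectors/{job_id}/_close{query}")
    return calls
```

With `auto_stop=True` a single close request is enough, whether or not the datafeed is running.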
István Zoltán Szabó
2e820fcab6
[DOCS] Clarifies terminology in Performing population analysis page. (#74237) 2021-06-18 09:03:38 +02:00
ymao1
c727b40d0b
[Docs] Update cross-document links to Kibana Alerting docs (#74034)
* Updating cross-document links

* PR fixes
2021-06-14 12:23:47 -04:00
Dimitris Athanasiou
dc61a72c9e
[ML] Reset anomaly detection job API (#73908)
Adds a new API that allows a user to reset
an anomaly detection job.

To use the API do:

```
POST _ml/anomaly_detectors/<job_id>/_reset
```

The API removes all data associated with the job.
In particular, it deletes model state, results and stats.

However, job notifications and user annotations are not removed.

Also, the API can be called asynchronously by setting the parameter
`wait_for_completion` to `false` (defaults to `true`). When run
that way the API returns the task id for further monitoring.

In order to prevent the job from opening while it is resetting,
a new job field has been added called `blocked`. It is an object
that contains a `reason` and the `task_id`. `reason` can take
a value from ["delete", "reset", "revert"] as all these
operations should block the job from opening. The `task_id` is also
included in order to allow tracking the task if necessary.

Finally, this commit also sets the `blocked` field when
the revert snapshot API is called as a job should not be opened
while it is reverted to a different model snapshot.
2021-06-14 18:56:28 +03:00
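The request shape and the `blocked` field introduced in #73908 can be illustrated with two small helpers. This is a sketch under assumptions: `reset_request` and `blocking_task` are hypothetical names, the endpoint path is assumed to separate the job ID and `_reset` with a slash, and the example job documents are made up.

```python
# Reasons listed in the commit message that prevent a job from opening.
BLOCKING_REASONS = {"delete", "reset", "revert"}


def reset_request(job_id, wait_for_completion=True):
    """Build (method, path, params) for a reset job call."""
    path = f"_ml/anomaly_detectors/{job_id}/_reset"
    # wait_for_completion defaults to true, so it is only sent when false.
    params = {} if wait_for_completion else {"wait_for_completion": "false"}
    return "POST", path, params


def blocking_task(job_config):
    """Return (reason, task_id) if the job must not be opened, else None."""
    blocked = job_config.get("blocked")
    if blocked and blocked.get("reason") in BLOCKING_REASONS:
        return blocked["reason"], blocked.get("task_id")
    return None
```

A caller that resets asynchronously could poll the job configuration and open the job only once `blocking_task` returns `None`.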
Benjamin Trent
8d882863d7
[ML] adding running_state to datafeed stats object (#73926)
It is useful to know the following information when reading datafeed stats:

 - Is the datafeed a "real-time" datafeed, i.e. a datafeed without a configured `end` time
 - Has the datafeed processed all past data available at the time of starting.

This object is only available if the datafeed task has been created.

It has the form:

```
"running_state": {
  "is_real_time": <boolean>,
  "look_back_finished": <boolean>
}
```
2021-06-10 08:08:49 -04:00
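A consumer of the datafeed stats from #73926 has to handle the case where `running_state` is absent. The following sketch uses the field names given in the commit message; `describe_datafeed` and the wording of its return values are hypothetical.

```python
def describe_datafeed(stats):
    """Summarise the running_state object of one datafeed stats entry."""
    state = stats.get("running_state")
    if state is None:
        # The object is only present once the datafeed task has been created.
        return "no datafeed task has been created"
    kind = "real-time" if state["is_real_time"] else "bounded by an end time"
    progress = ("lookback finished" if state["look_back_finished"]
                else "still processing past data")
    return f"{kind}, {progress}"
```

For instance, a datafeed started without an `end` time that has caught up with past data would be reported as `real-time, lookback finished`.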
István Zoltán Szabó
20d0dc300f
[DOCS] Updates datafeed related runtime field examples (#73725) 2021-06-08 11:27:55 +02:00
Lisa Cawley
a6339918ac
[DOCS] Adds defaults to get ML results APIs (#73540)
Co-authored-by: David Roberts <dave.roberts@elastic.co>
2021-06-03 10:05:47 -07:00
István Zoltán Szabó
44c26c8bdc
[DOCS] Removes Kibana charts-related advice about agg interval and bucket span. (#73673) 2021-06-02 16:47:01 +02:00
David Roberts
0059c59e25
[ML] Make ml_standard tokenizer the default for new categorization jobs (#72805)
Categorization jobs created once the entire cluster is upgraded to
version 7.14 or higher will default to using the new ml_standard
tokenizer rather than the previous default of the ml_classic
tokenizer, and will incorporate the new first_non_blank_line char
filter so that categorization is based purely on the first non-blank
line of each message.

The difference between the ml_classic and ml_standard tokenizers
is that ml_classic splits on slashes and colons, so creates multiple
tokens from URLs and filesystem paths, whereas ml_standard attempts
to keep URLs, email addresses and filesystem paths as single tokens.

It is still possible to configure the ml_classic tokenizer if you
prefer: just provide a categorization_analyzer within your
analysis_config and whichever tokenizer you choose (which could be
ml_classic or any other Elasticsearch tokenizer) will be used.

To opt out of using first_non_blank_line as a default char filter,
you must explicitly specify a categorization_analyzer that does not
include it.

If no categorization_analyzer is specified but categorization_filters
are specified then the categorization filters are converted to char
filters that are applied after first_non_blank_line.

Closes elastic/ml-cpp#1724
2021-06-01 15:11:32 +01:00
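The defaulting rules in #72805 can be sketched in a few lines. This is a conceptual illustration only: the real char filter and analyzer defaults live inside Elasticsearch, both function names are hypothetical, and representing converted categorization filters as `pattern_replace` char filters is an assumption, since the commit message does not state the type.

```python
def first_non_blank_line(message):
    """Conceptual model of the char filter: keep only the first non-blank line."""
    for line in message.splitlines():
        if line.strip():
            return line
    return ""


def default_char_filters(analysis_config):
    """Return the default char filter chain, or None if the user opted out."""
    if "categorization_analyzer" in analysis_config:
        # An explicit analyzer replaces all defaults, including the
        # first_non_blank_line char filter.
        return None
    chain = ["first_non_blank_line"]
    for pattern in analysis_config.get("categorization_filters", []):
        # Assumption: each categorization filter becomes a pattern_replace
        # char filter applied after first_non_blank_line.
        chain.append({"type": "pattern_replace", "pattern": pattern})
    return chain
```

With this model a multi-line stack trace is categorized by its first non-blank line only, and supplying `{"categorization_analyzer": {"tokenizer": "ml_classic"}}` disables the default chain.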
István Zoltán Szabó
1ce2308e2a
[DOCS] Adds max_trees hyperparameter to GET TM API docs (#72298) 2021-05-06 08:18:19 +02:00
István Zoltán Szabó
d07c174aaf
[DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
Benjamin Trent
2ce4d175f0
[ML] increase the default value of xpack.ml.max_open_jobs from 20 to 512 for autoscaling improvements (#72487)
This commit increases the default value of xpack.ml.max_open_jobs from 20 to 512. Additionally, it ignores nodes that cannot provide an accurate view into their native memory.

If a node does not have a view into its native memory, we ignore it for assignment.

This effectively fixes a bug with autoscaling. Autoscaling relies on memory being the factor that limits the assignment of jobs to nodes. If assignment is instead limited by xpack.ml.max_open_jobs, scaling decisions are hampered.
2021-04-30 07:55:57 -04:00
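The node filtering described in #72487 can be approximated as follows. This is a simplified sketch, not the actual assignment logic: `assignable_nodes` and the node dictionary keys (`name`, `open_jobs`, `free_native_memory_bytes`) are hypothetical.

```python
DEFAULT_MAX_OPEN_JOBS = 512  # new default; previously 20


def assignable_nodes(nodes, job_memory_bytes,
                     max_open_jobs=DEFAULT_MAX_OPEN_JOBS):
    """Return the names of nodes that could accept a job of the given size."""
    candidates = []
    for node in nodes:
        free = node.get("free_native_memory_bytes")
        if free is None:
            # Nodes without an accurate view of native memory are ignored.
            continue
        if node["open_jobs"] >= max_open_jobs:
            continue
        if free >= job_memory_bytes:
            candidates.append(node["name"])
    return candidates
```

With a limit of 20, a node that already runs 25 small jobs is rejected even when it has plenty of free memory; with the limit at 512, memory becomes the deciding factor.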
István Zoltán Szabó
ce9dd74cf5
[DOCS] Expands DFA and TM API docs with required privileges info (#71335) 2021-04-28 08:33:42 +02:00
Pierre Grimaud
3c44dfec60
[DOCS] Fix typos (#72227) 2021-04-26 12:40:38 -04:00
István Zoltán Szabó
2f122f03b2
[DOCS] Adds anomaly detection rule advanced settings to docs (#72072)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2021-04-26 09:55:02 +02:00
István Zoltán Szabó
aca0a7ffa4
[DOCS] Alters examples in anomaly detection page to use runtime mappings (#71745) 2021-04-19 13:06:50 +02:00