Commit graph

190 commits

Author SHA1 Message Date
István Zoltán Szabó
9a8c6fb66f
[DOCS] Removes beta labels from DFA related docs. (#70808) 2021-03-26 09:46:41 +01:00
Lisa Cawley
2caba7b11f
[DOCS] Edits machine learning settings (#69947)
Co-authored-by: David Roberts <dave.roberts@elastic.co>
2021-03-09 10:59:12 -08:00
Lisa Cawley
c537e5f38c
[DOCS] Edits delete trained model alias API (#70119) 2021-03-08 17:08:58 -08:00
István Zoltán Szabó
2ccc81081f
[DOCS] Adds hyperparameters option to the include setting of GET trained models API. (#69959) 2021-03-04 16:43:06 +01:00
Benjamin Trent
2279cafb4e
[ML] adding new _preview endpoint for data frame analytics (#69453)
This commit adds a new `_preview` endpoint for data frame analytics. 

This allows users to see the data on which their model will be trained. This is especially useful 
in the arrival of custom feature processors.

The API design is a similar to datafeed `_preview` and data frame analytics `_explain`.
2021-03-01 12:25:50 -05:00
Lisa Cawley
138224b398
[DOCS] Edits trained model alias API (#69491) 2021-02-24 08:17:49 -08:00
Dimitris Athanasiou
7fb98c0d3c
[ML] Add runtime mappings to data frame analytics source config (#69183)
Users can now specify runtime mappings as part of the source config
of a data frame analytics job. Those runtime mappings become part of
the mapping of the destination index. This ensures the fields are
accessible in the destination index even if the relevant data frame
analytics job gets deleted.

Closes #65056
2021-02-19 16:29:19 +02:00
Benjamin Trent
0af38bba9e
[ML] add new delete trained model aliases API (#69195)
In addition to creating and re-assigning model aliases, users should be able to delete existing and unused model aliases.
2021-02-18 13:12:07 -05:00
Benjamin Trent
26eef892df
[ML] adds new trained model alias API to simplify trained model updates and deployments (#68922)
A `model_alias` allows trained models to be referred by a user defined moniker. 

This not only improves the readability and simplicity of numerous API calls, but it allows for simpler deployment and upgrade procedures for trained models. 

Previously, if you referenced a model ID directly within an ingest pipeline, when you have a new model that performs better than an earlier referenced model, you have to update the pipeline itself. If this model was used in numerous pipelines, ALL those pipelines would have to be updated. 

When using a `model_alias` in an ingest pipeline, only that `model_alias` needs to be updated. Then, the underlying referenced model will change in place for all ingest pipelines automatically. 

An additional benefit is that the model referenced is not changed until it is fully loaded into cache, this way throughput is not hampered by changing models.
2021-02-18 09:41:50 -05:00
Lisa Cawley
a1fb2c3606
[DOCS] Fixes n_gram_encoding in data frame analytics APIs (#69084) 2021-02-16 14:02:00 -08:00
Valeriy Khakhutskyy
78368428b3
[ML] Add early stopping DFA configuration parameter (#68099)
The PR adds early_stopping_enabled optional data frame analysis configuration parameter. The enhancement was already described in elastic/ml-cpp#1676 and so I mark it here as non-issue.
2021-02-01 11:41:28 +01:00
Dimitris Athanasiou
5c961c1c81
[ML] Expand regression/classification hyperparameters (#67950)
Expands data frame analytics regression and classification
analyses with the followin hyperparameters:

- alpha
- downsample_factor
- eta_growth_rate_per_tree
- max_optimization_rounds_per_hyperparameter
- soft_tree_depth_limit
- soft_tree_depth_tolerance
2021-01-26 12:56:41 +02:00
István Zoltán Szabó
addb5cbd3a
[DOCS] Adds custom feature processors description to PUT DFA API (#67424)
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2021-01-19 09:47:32 +01:00
Dimitris Athanasiou
7574013604
[ML] Remove DFA job states reindexing and analyzing from docs (#67658)
These states do no longer exist as of #67423
2021-01-18 17:39:22 +02:00
Benjamin Trent
35f478b618
[ML] [DOCS] adding missing fields to the get trained models API docs (#67590)
Adds missing fields description, inference_config, and input to the GET trained models API documentation
2021-01-15 13:20:53 -05:00
István Zoltán Szabó
085a288af5
[DOCS] Adds hyperparameter metadata property to GET trained models API docs. (#67412) 2021-01-13 13:49:51 +01:00
István Zoltán Szabó
d3ad9fe632
[DOCS] Improves inference processor linking and docs (#66119) 2021-01-05 09:42:06 +01:00
Lisa Cawley
919c79b745
[DOCS] Add custom feature processor example (#64681) 2020-11-06 09:24:01 -08:00
James Rodewig
1ea83359bb
[DOCS] Fix case for 'Boolean' (#64299) 2020-10-29 09:04:43 -04:00
István Zoltán Szabó
6093518f4a
[DOCS] Changes experimental flag to beta in DFA related docs (#63992) 2020-10-26 17:02:46 +01:00
Lisa Cawley
a00c7a2b6c
[DOCS] Add tips for num_top_classes classification parameter (#63781) 2020-10-21 09:27:13 -07:00
István Zoltán Szabó
9defe10616
[DOCS] Expands DFA evaluation API docs with the default set of metrics (#63971) 2020-10-21 14:30:33 +02:00
Benjamin Trent
c1de07fa83
[ML] adding new flag exclude_generated that removes generated fields in GET config APIs (#63899)
When exporting and cloning ml configurations in a cluster it can be
frustrating to remove all the fields that were generated by
the plugin. Especially as the number of these fields change
from version to version.

This flag, exclude_generated, allows the GET config APIs to return
configurations with these generated fields removed.

APIs supporting this flag: 
- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

relates to #63055
2020-10-20 11:28:29 -04:00
Dimitris Athanasiou
03ed7de6c1
[ML] Rename evaluation metric result fields to value (#63809)
Renames data frame analytics _evaluate API results as follows:

  - per class accuracy renamed from `accuracy` to `value`
  - per class precision renamed from `precision` to `value`
  - per class recall renamed from `recall` to `value`
  - auc_roc `score` renamed to `value` for both outlier detection and classification
2020-10-20 10:30:50 +03:00
Przemysław Witek
d9e7d88f08
[ML] Allow setting num_top_classes to a special value -1 (#63587) 2020-10-13 13:14:17 +02:00
István Zoltán Szabó
e8930a44a4
[DOCS] Adds AUC ROC classification metric to the API examples (#63563) 2020-10-13 11:03:20 +02:00
István Zoltán Szabó
b517d4d9b5
[DOCS] Adds huber and msle metrics to Evaluate API example calls (#63414) 2020-10-08 17:05:04 +02:00
Przemysław Witek
b0019bd0a6
[ML] Validate that AucRoc has the data necessary to be calculated (#63302) 2020-10-08 08:19:43 +02:00
lcawl
2177b46289 [DOCS] Fixes typo 2020-10-06 09:19:43 -07:00
Lisa Cawley
49ab8f8688
[DOCS] Add feature_importance_baseline to get trained model API (#63279)
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2020-10-06 07:56:55 -07:00
István Zoltán Szabó
de3ce8bc39
[DOCS] Adds delta and offset parameters to Evaluate DFA API docs (#63317) 2020-10-06 16:06:35 +02:00
Lisa Cawley
51f9bf657d
[DOCS] Fix titles for ML APIs (#63152) 2020-10-02 11:53:49 -07:00
István Zoltán Szabó
baffdd1ec0
[DOCS] Updates trained models API docs titles. (#63165) 2020-10-02 10:15:14 -07:00
Benjamin Trent
7bd6e78dae
[ML] adding for_export flag for ml plugin GET resource APIs (#63092)
This adds the new `for_export` flag to the following APIs:

- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The flag is designed for cloning or exporting configuration objects to later be put into the same cluster or a separate cluster. 

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that would effectively require changing to be of use (e.g. datafeed job_id)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)


closes https://github.com/elastic/elasticsearch/issues/63055
2020-10-02 08:29:19 -04:00
Benjamin Trent
1084aaf18a
[ML] renames */inference* apis to */trained_models* (#63097)
This commit renames all `inference` CRUD APIs to `trained_models`.

This aligns with internal terminology, documentation, and use-cases.
2020-10-01 12:13:49 -04:00
Przemysław Witek
cd1a27f273
[ML] Implement AucRoc metric for classification (#60502) 2020-09-30 08:56:23 +02:00
Lisa Cawley
e48eab95e9
[DOCS] Formatting fix in get trained model API (#62643) 2020-09-21 08:19:37 -07:00
Benjamin Trent
fdb7b6d3b5
[ML] Add new include flag to GET inference/<model_id> API for model training metadata (#61922)
Adds new flag include to the get trained models API
The flag initially has two valid values: definition, total_feature_importance.
Consequently, the old include_model_definition flag is now deprecated.
When total_feature_importance is included, the total_feature_importance field is included in the model metadata object.
Including definition is the same as previously setting include_model_definition=true.
2020-09-18 07:11:38 -04:00
Lisa Cawley
e743ed6102
[DOCS] Minor typo in ML API (#62414) 2020-09-15 13:19:17 -07:00
Lisa Cawley
1e6cdcac20
[DOCS] Fix from and size descriptions for model APIs (#62128) 2020-09-08 12:54:51 -07:00
Lisa Cawley
4a7492f3fd
[DOCS] Fix allow_no_match description for model APIs (#62008) 2020-09-08 08:11:33 -07:00
István Zoltán Szabó
a75094e666
[DOCS] Removes inference from the names of trained model APIs. (#62036) 2020-09-07 11:23:29 +02:00
Benjamin Trent
1b34c88d56
[ML] adding docs + hlrc for data frame analysis feature_processors (#61149)
Adds HLRC and some docs for the new feature_processors field in Data frame analytics.

Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-08-24 12:00:44 -04:00
James Rodewig
a94e5cb7c4
[DOCS] Replace Wikipedia links with attribute (#61171) 2020-08-17 09:44:24 -04:00
István Zoltán Szabó
c3536935b2
[DOCS] Adds inference phase to get DFA job stats. (#60737) 2020-08-05 16:22:21 +02:00
Lisa Cawley
1781d4a7b9
[DOCS] Fix security links in machine learning APIs (#60098) 2020-07-23 12:14:56 -07:00
James Rodewig
80b674fb25
[DOCS] Reformat snippets to use two-space indents (#59973) 2020-07-21 12:24:26 -04:00
Przemysław Witek
2a12dcf2e0
Rename binary_soft_classification evaluation to outlier_detection (#59951) 2020-07-21 14:27:57 +02:00
Lisa Cawley
42be287b57
[DOCS] Changes level offset in data frame analytics APIs (#59919) 2020-07-20 12:11:47 -07:00
Benjamin Trent
b551f75ec3
[ML] add new custom field to trained model processors (#59542)
This commit adds the new configurable field `custom`.

`custom` indicates if the preprocessor was submitted by a user or automatically created by the analytics job.

Eventually, this field will be used in calculating feature importance. When `custom` is true, the feature importance for 
the processed fields is calculated. When `false` the current behavior is the same (we calculate the importance for the originating field/feature).

This also adds new required methods to the preprocessor interface. If users are to supply their own preprocessors 
in the analytics job configuration, we need to know the input and output field names.
2020-07-16 09:35:56 -04:00