79 commits

Author SHA1 Message Date
Dimitris Athanasiou
4d2be9bd32
[ML] Add num_top_feature_importance_values param to regression and classi… (#50914)
Adds a new parameter to regression and classification that enables computation
of importance for the top most important features. The computation of the importance
is based on the SHAP (SHapley Additive exPlanations) method.
2020-01-14 15:01:47 +02:00
Benjamin Trent
4cecb7a5be
[ML][Inference] PUT API (#50852)
This adds the `PUT` API for creating trained models that support our format. 

This includes

* HLRC change for the API
* API creation
* Validations of model format and call
2020-01-11 16:02:56 -05:00
Dimitris Athanasiou
af0ce426cc
[ML] Implement force deleting a data frame analytics job (#50553)
Adds a `force` parameter to the delete data frame analytics
request. When `force` is `true`, the action force-stops the
jobs and then proceeds to the deletion. This can be used in
order to delete a non-stopped job with a single request.

Closes #48124
2020-01-03 12:01:41 +02:00
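
The effect of `force` described in the message above can be pictured with a toy in-memory model. This is only a sketch, not the Elasticsearch implementation: the `Job` class, the state names, and the assumption that a non-forced delete of a running job is rejected are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical stand-in for a data frame analytics job."""
    job_id: str
    state: str = "started"  # illustrative states: "started" or "stopped"


def delete_job(jobs: dict, job_id: str, force: bool = False) -> None:
    """Delete a job; with force=True, stop it first if it is still running."""
    job = jobs[job_id]
    if job.state != "stopped":
        if not force:
            # Assumption: a non-forced delete of a running job is rejected.
            raise RuntimeError(f"job [{job_id}] is not stopped; retry with force=True")
        job.state = "stopped"  # force-stop before deleting
    del jobs[job_id]
```

With this model, a running job can be removed in one call (`delete_job(jobs, "a", force=True)`) instead of a stop followed by a delete.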
Przemysław Witek
786ead630a
Implement precision and recall metrics for classification evaluation (#49671) 2019-12-19 16:07:09 +01:00
Dimitris Athanasiou
269425b54d
[ML] Introduce randomize_seed setting for regression and classification (#49990)
This adds a new `randomize_seed` setting for regression and classification.
When not explicitly set, the seed is randomly generated. One can
reuse the seed in a similar job in order to ensure the same docs
are picked for training.
2019-12-10 10:22:53 +02:00
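
Why reusing a seed reproduces the training set can be shown with a small sketch. It assumes a seeded pseudo-random sample over the document ids; the function name and the `training_percent` argument are hypothetical, and the actual selection logic in the commit may differ.

```python
import random


def pick_training_docs(doc_ids, training_percent, randomize_seed=None):
    """Return (seed used, sorted ids chosen for training)."""
    if randomize_seed is None:
        # When not set explicitly, the seed is generated randomly.
        randomize_seed = random.SystemRandom().getrandbits(63)
    rng = random.Random(randomize_seed)
    count = round(len(doc_ids) * training_percent / 100)
    # Sort first so the result depends only on the seed, not on input order.
    chosen = rng.sample(sorted(doc_ids), count)
    return randomize_seed, sorted(chosen)
```

Calling the function once without a seed and then passing the returned seed to a second call yields the same selection.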
Dimitris Athanasiou
bad07b76f7
[ML] Add optional source filtering during data frame reindexing (#49690)
This adds a `_source` setting under the `source` setting of a data
frame analytics config. The new `_source` reuses the structure
of a `FetchSourceContext`, as `analyzed_fields` does. Specifying
includes and excludes for source allows selecting which fields
will get reindexed and will be available in the destination index.

Closes #49531
2019-11-29 14:20:31 +02:00
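
The include/exclude selection described above can be sketched as a filter over a document's fields. This is a simplified illustration, not the `FetchSourceContext` implementation: it handles only top-level field names, and the shell-style wildcard matching is an assumption.

```python
from fnmatch import fnmatchcase


def filter_source(doc, includes=(), excludes=()):
    """Keep fields that match includes (all if empty) and match no exclude."""
    def matches(field, patterns):
        return any(fnmatchcase(field, pattern) for pattern in patterns)

    return {
        field: value
        for field, value in doc.items()
        if (not includes or matches(field, includes))
        and not matches(field, excludes)
    }
```

For example, `filter_source(doc, includes=["price*"], excludes=["*_raw"])` keeps `price` but drops `price_raw` and any field that does not start with `price`.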
Benjamin Trent
ba914453be
[ML][Inference][HLRC] add GET _stats (#49562) 2019-11-26 09:26:31 -05:00
Benjamin Trent
fc7df300a2
[ML][Inference][HLRC] Delete trained model API (#49567) 2019-11-26 07:13:02 -05:00
Dimitris Athanasiou
0390ec3627
[ML] Explain data frame analytics API (#49455)
This commit replaces the _estimate_memory_usage API with
a new API, the _explain API.

The API consolidates information that is useful before
creating a data frame analytics job.

It includes:

- memory estimation
- field selection explanation

Memory estimation is moved here from what was previously
calculated in the _estimate_memory_usage API.

Field selection is a new feature that explains to the user
whether each available field was selected to be included in
the analysis. If a field was not included, it also explains
the reason why.
2019-11-22 20:08:14 +02:00
Benjamin Trent
9006926a15
[ML][Inference][HLRC] GET trained models (#49464) 2019-11-22 07:31:30 -05:00
Przemysław Witek
94ee36d61e
Implement accuracy metric for multiclass classification (#47772) 2019-11-21 13:07:14 +01:00
Przemysław Witek
99c912b79f
Make num_top_classes parameter's default value equal to 2 (#48119) 2019-10-17 17:59:22 +02:00
Przemysław Witek
9b5770da0e
Add MlClientDocumentationIT tests for classification. (#47569) 2019-10-11 08:21:45 +02:00
Dimitris Athanasiou
e99435a7f6
[ML] Additional outlier detection parameters (#47600)
Adds the following parameters to `outlier_detection`:

- `compute_feature_influence` (boolean): whether to compute
   feature influence scores
- `outlier_fraction` (double): the proportion of the data set assumed
   to be outlying prior to running outlier detection
- `standardization_enabled` (boolean): whether to apply standardization
   to the feature values
2019-10-07 15:28:21 +03:00
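
As an illustration of how the three parameters listed above might be assembled into an `outlier_detection` object, here is a minimal sketch. The helper name is hypothetical, the range check on `outlier_fraction` is an assumption based on it being a proportion, and unset parameters are omitted because the commit message does not state their defaults.

```python
def outlier_detection_config(compute_feature_influence=None,
                             outlier_fraction=None,
                             standardization_enabled=None):
    """Build an `outlier_detection` object, omitting parameters left unset."""
    if outlier_fraction is not None and not 0.0 <= outlier_fraction <= 1.0:
        # Assumption: a proportion must lie in [0, 1].
        raise ValueError("outlier_fraction must be between 0 and 1")
    params = {
        "compute_feature_influence": compute_feature_influence,
        "outlier_fraction": outlier_fraction,
        "standardization_enabled": standardization_enabled,
    }
    return {"outlier_detection": {k: v for k, v in params.items() if v is not None}}
```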
Lisa Cawley
b1bbed84eb
[DOCS] Fixes data frame analytics job terminology in HLRC (#46758) 2019-09-16 10:00:44 -07:00
Lisa Cawley
b3dfd6e6d0
[DOCS] Updates dataframe transform terminology (#46642) 2019-09-16 08:28:19 -07:00
Lisa Cawley
1e63105e30
[DOCS] Adds missing icons to ML HLRC APIs (#46515) 2019-09-10 08:26:56 -07:00
Dimitris Athanasiou
eab64250eb
[ML][HLRC] Add data frame analytics regression analysis (#46024) 2019-08-28 08:12:10 +03:00
Przemysław Witek
31f6e78acd
Allow the user to specify 'query' in Evaluate Data Frame request (#45775) 2019-08-22 08:27:38 +02:00
Dimitris Athanasiou
8af319481e
[ML] Add description to DF analytics (#45774) 2019-08-21 19:58:09 +03:00
Przemysław Witek
c6a25a818d
Add docs for HLRC for Estimate memory usage API (#45538) 2019-08-21 12:52:17 +02:00
Lisa Cawley
46912c8f3d
[DOCS] Reformats ML update APIs (#45253) 2019-08-06 11:05:01 -07:00
Lisa Cawley
285f2e0625
[DOCS] Updates terms in machine learning get APIs (#44986) 2019-07-30 10:52:23 -07:00
Lisa Cawley
3f31859669
[DOCS] Updates terms in machine learning datafeed APIs (#44883) 2019-07-26 10:47:03 -07:00
Lisa Cawley
aefb72040c
[DOCS] Updates terms in machine learning calendar APIs (#44866) 2019-07-25 11:20:42 -07:00
Lisa Cawley
9b16486615
[DOCS] Minor edits to HLRC ML APIs (#44865) 2019-07-25 10:00:06 -07:00
Lisa Cawley
990e037728
[DOCS] Updates terms in anomaly detection job APIs (#44839) 2019-07-25 08:58:16 -07:00
Przemysław Witek
2ca70f788f
Deprecate the ability to update datafeed's job_id. (#44691) 2019-07-23 12:39:22 +02:00
Dimitris Athanasiou
d6f36a8e4f
[ML] Set df-analytics task state to failed when appropriate (#43880)
This introduces a `failed` state to which the data frame analytics
persistent task is set when something unexpected fails. It could
be the process crashing, the results processor hitting some error,
etc. The failure message is then captured and set on the task state.
From there, it becomes available via the _stats API as `failure_reason`.

The df-analytics stop API now has a `force` boolean parameter. This allows
the user to call it for a failed task in order to reset it to `stopped` after
we have ensured the failure has been communicated to the user.

This commit also adds the analytics version to the persistent task
params, as this allows us to prevent tasks from running on unsuitable
nodes in the future.
2019-07-03 10:59:52 +03:00
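
The state handling described in the message above can be modelled roughly as follows. This is a hedged sketch, not the code from the commit: the `TaskState` class and both functions are hypothetical, and rejecting a non-forced stop of a failed task is an assumption inferred from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskState:
    """Hypothetical model of the persistent task state."""
    state: str = "started"
    failure_reason: Optional[str] = None


def mark_failed(task: TaskState, reason: str) -> None:
    """Record an unexpected failure and its message on the task state."""
    task.state = "failed"
    task.failure_reason = reason


def stop(task: TaskState, force: bool = False) -> None:
    """Stop the task; a failed task is only reset when force is True."""
    if task.state == "failed" and not force:
        # Assumption: a non-forced stop does not reset a failed task.
        raise RuntimeError(f"task failed: {task.failure_reason}; use force=True")
    task.state = "stopped"
```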
Dimitris Athanasiou
5fa36dad0b
[ML] Machine learning data frame analytics (#43544)
This merges the initial work that adds a framework for performing
machine learning analytics on data frames. The feature is currently experimental
and requires a platinum license. Note that the original commits can be
found in the `feature-ml-data-frame-analytics` branch.

A new set of APIs is added which allows the creation of data frame analytics
jobs. Configuration allows specifying different types of analysis to be performed
on a data frame. At first there is support for outlier detection.

The APIs are:

- PUT _ml/data_frame/analysis/{id}
- GET _ml/data_frame/analysis/{id}
- GET _ml/data_frame/analysis/{id}/_stats
- POST _ml/data_frame/analysis/{id}/_start
- POST _ml/data_frame/analysis/{id}/_stop
- DELETE _ml/data_frame/analysis/{id}

When a data frame analytics job is started a persistent task is created and started.
The main steps of the task are:

1. reindex the source index into the dest index
2. analyze the data through the data_frame_analyzer c++ process
3. merge the results of the process back into the destination index

In addition, an evaluation API is added which packages commonly used metrics
that provide evaluation of various analyses:

- POST _ml/data_frame/_evaluate
2019-06-25 10:48:27 +03:00
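
The endpoints listed in the commit message above can be collected into a small lookup table. The sketch below copies the methods and paths exactly as they appear in that message (paths in later releases may differ); the action names and the `request_line` helper are hypothetical.

```python
# Method and path templates as listed in the commit message.
ENDPOINTS = {
    "put": ("PUT", "_ml/data_frame/analysis/{id}"),
    "get": ("GET", "_ml/data_frame/analysis/{id}"),
    "stats": ("GET", "_ml/data_frame/analysis/{id}/_stats"),
    "start": ("POST", "_ml/data_frame/analysis/{id}/_start"),
    "stop": ("POST", "_ml/data_frame/analysis/{id}/_stop"),
    "delete": ("DELETE", "_ml/data_frame/analysis/{id}"),
    "evaluate": ("POST", "_ml/data_frame/_evaluate"),
}


def request_line(action, job_id=None):
    """Return 'METHOD path' for an action, filling in the job id if needed."""
    method, template = ENDPOINTS[action]
    if "{id}" in template:
        if not job_id:
            raise ValueError(f"action [{action}] requires a job id")
        template = template.replace("{id}", job_id)
    return f"{method} {template}"
```

A typical lifecycle under this model would issue `put`, `start`, `stats`, `stop`, and `delete` for the same job id, in that order.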
Benjamin Trent
8280a20664
ML: Add upgrade mode docs, hlrc, and fix bug (#37942)
* ML: Add upgrade mode docs, hlrc, and fix bug

* [DOCS] Fixes build error and edits text

* adjusting docs

* Update docs/reference/ml/apis/set-upgrade-mode.asciidoc

Co-Authored-By: benwtrent <ben.w.trent@gmail.com>

* Update set-upgrade-mode.asciidoc

* Update set-upgrade-mode.asciidoc
2019-01-30 06:51:11 -06:00
Vladimir Dolzhenko
f0c5f0c099
[HLRC] XPack ML info action (#35777)
Relates to #29827
2018-11-28 10:58:20 +00:00
Ed Savage
13e11966ca
[HLRC][ML] Add delete expired data API (#35906)
Relates to #29827
2018-11-26 16:15:54 +00:00
David Roberts
3c059ee057
[HLRC][ML] Add ML find file structure API (#35833)
Relates to #29827
2018-11-23 06:58:05 +00:00
Benjamin Trent
90a8e4b259
HLRC: ML Delete event from Calendar (#35760)
* HLRC: Delete event from calendar

* adjusting tests

* adjusting code to make it more readable
2018-11-21 16:22:04 -06:00
Ed Savage
4f857c4f8d
[HLRC][ML] Add ML revert model snapshot API (#35750)
Relates to #29827
2018-11-21 09:10:37 +00:00
Benjamin Trent
84db1e42c0
HLRC: ML Get Calendar Events (#35747)
* HLRC: ML Get Calendar Events

* Addressing PR comments
2018-11-20 16:40:31 -06:00
Benjamin Trent
7657e6d274
HLRC ML Add Event To Calendar API (#35704)
* HLRC: ML Adding Post event to calendar api

* Fixing tests and serialization

* removing unused import
2018-11-20 08:15:21 -06:00
Benjamin Trent
d707838c02
HLRC: ML Delete job from calendar (#35713) 2018-11-20 07:43:34 -06:00
Ed Savage
844483a99a
[HLRC][ML] Add ML update model snapshot API (#35537) (#35694)
Relates to #29827
2018-11-20 12:18:29 +00:00
Benjamin Trent
214bc96738
HLRC: ML Add Job to Calendar API (#35666) 2018-11-19 11:41:49 -06:00
Benjamin Trent
bc7dea4480
ML: changing automatic check_window calculation (#35643)
* ML: changing automatic check_window calculation

* adding docs on how we calculate the default
2018-11-19 08:03:34 -06:00
Ignacio Vera
ae6a33237f
HLRC: Add ML delete filter action (#35382)
* HLRC: Add ML delete filter action

It adds delete ML filter action to the high level rest client.

Relates #29827
2018-11-19 11:25:35 +01:00
Benjamin Trent
f7ada9b29b
Add delayed datacheck to the datafeed job runner (#35387)
* ML: Adding missing datacheck to datafeedjob

* Adding client side and docs

* Making adjustments to validations

* Making values default to on, having more sensible limits

* Intermittent commit, still need to figure out interval

* Adjusting delayed data check interval

* updating docs

* Making parameter Boolean, so it is nullable

* bumping bwc to 7 before backport

* changing to version current

* moving delayed data check config its own object

* Separation of duties for delayed data detection

* fixing checkstyles

* fixing checkstyles

* Adjusting default behavior so that null windows are allowed

* Mentioning the default value

* Fixing comments, syncing up validations
2018-11-15 13:32:45 -06:00
Ed Savage
2d948a001e
[HLRC][ML] Add ML delete model snapshot API (#35537)
Relates to #29827
2018-11-15 14:57:17 +00:00
Benjamin Trent
803eccec11
HLRC: Adding ML Update Filter API (#35522)
* HLRC: Adding ml get filters api

* HLRC: Adding ML Update Filter API
2018-11-14 11:13:11 -06:00
Ed Savage
e7b7d52a6a
[HLRC][ML] Add ML get model snapshots API (#35487)
Relates #29827
2018-11-14 13:03:04 +00:00
Benjamin Trent
b9eb5f7b63
HLRC: Adding ml get filters api (#35502)
* HLRC: Adding ml get filters api

* refactoring setId name
2018-11-13 14:53:32 -06:00
Benjamin Trent
a4442dacd7
HLRC: Add ML API PUT filter (#35175) 2018-11-05 08:56:53 -06:00
Benjamin Trent
052dfa5646
HLRC: Adding Update datafeed API (#34882)
* HLRC: Adding Update datafeed API

* Addressing unused import

* Adjusting docs and fixing minor comments

* fixing comment
2018-10-26 16:44:12 -05:00