Commit graph

522 commits

Author SHA1 Message Date
István Zoltán Szabó
bb44726ad6
[DOCS] Reworks some parts of EMM API docs (#54872)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-04-08 09:50:12 +02:00
Lisa Cawley
c355fea8f4
[DOCS] Remove text fields from classification dependent variables (#54849) 2020-04-07 10:43:15 -07:00
István Zoltán Szabó
1ae8bde756
[DOCS] Changes kibana_user to kibana_admin in DFA API prerequisites. (#54806) 2020-04-06 15:45:08 +02:00
István Zoltán Szabó
a0662399c7
[DOCS] Makes PUT inference API docs collapsible (#54653)
Co-authored-by: lcawl <lcawley@elastic.co>
2020-04-03 09:45:42 +02:00
Benjamin Trent
4e1ff31c3c
[ML] add new inference_config field to trained model config (#54421)
A new field called `inference_config` is now added to the trained model config object. This new field allows for default inference settings from analytics or some external model builder. 

The inference processor can still override whatever is set as the default in the trained model config.
2020-04-02 10:34:17 -04:00
Benjamin Trent
1d24960ff8
[ML] prefer secondary authorization header for data[feed|frame] authz (#54121)
Secondary authorization headers are to be used to facilitate Kibana spaces support + ML jobs/datafeeds. 

Now on PUT/Update/Preview datafeed, and PUT data frame analytics the secondary authorization is preferred over the primary (if provided).

closes https://github.com/elastic/elasticsearch/issues/53801
2020-04-02 10:10:46 -04:00
Benjamin Trent
bbd6e943de
[ML] add num_matches and preferred_to_categories to category defintion objects (#54214)
This adds two new fields to category definitions.

- `num_matches` indicating how many documents have been seen by this category
- `preferred_to_categories` indicating which other categories this particular category supersedes when messages are categorized.

These fields are only guaranteed to be up to date after a `_flush` or `_close`

native change: https://github.com/elastic/ml-cpp/pull/1062
2020-04-02 07:49:09 -04:00
István Zoltán Szabó
b0f6d4ee0e
[DOCS] Updates estimate model memory docs (#54574) 2020-04-01 15:53:53 +02:00
István Zoltán Szabó
b96743cfc5
[DOCS] Adds data_counts object to the GET DFA stats API (#54498) 2020-04-01 10:05:00 +02:00
Jason Tedor
95a7eed9aa
Rename MetaData to Metadata in all of the places (#54519)
This is a simple naming change PR, to fix the fact that "metadata" is a
single English word, and for too long we have not followed general
naming conventions for it. We are also not consistent about it, for
example, METADATA instead of META_DATA if we were trying to be
consistent with MetaData (although METADATA is correct when considered
in the context of "metadata"). This was a simple find and replace across
the code base, only taking a few minutes to fix this naming issue
forever.
2020-03-31 15:52:01 -04:00
Lisa Cawley
b90e491f68
[DOCS] Collapses nested objects in data frame analytics APIs (#54472) 2020-03-31 10:56:48 -07:00
Dimitris Athanasiou
5a98fc20e1
[ML] Fix DF analytics explain API request in docs (#54510)
The explain API expects a data frame analytics config
as its request.
2020-03-31 18:37:19 +03:00
István Zoltán Szabó
85d9b34dc5
[DOCS] Adds description of analysis_stats object and its properties to GET DFA stats API docs (#53881)
Co-authored-by: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-03-31 13:27:54 +02:00
Lisa Cawley
fdcd19483d
[DOCS] Collapses content in machine learning APIs (#54234) 2020-03-30 10:08:38 -07:00
Jason Tedor
1fc0432b24
Introduce formal role for remote cluster client (#53924)
This commit introduce a formal role for identifying nodes that are
capable of making connections to remote clusters.
2020-03-24 19:21:56 -04:00
David Roberts
8ee770560a
[ML] Add a model memory estimation endpoint for anomaly detection (#53507)
A new endpoint for estimating anomaly detection job
model memory requirements:

POST _ml/anomaly_detectors/estimate_model_memory

Closes #53219
2020-03-24 21:38:19 +00:00
David Roberts
cbe063a074
[ML] Introduce a "starting" datafeed state for lazy jobs (#53918)
It is possible for ML jobs to open lazily if the "allow_lazy_open"
option in the job config is set to true.  Such jobs wait in the
"opening" state until a node has sufficient capacity to run them.

This commit fixes the bug that prevented datafeeds for jobs lazily
waiting assignment from being started.  The state of such datafeeds
is "starting", and they can be stopped by the stop datafeed API
while in this state with or without force.

Fixes #53763
2020-03-24 10:01:13 +00:00
István Zoltán Szabó
8279f82dea
[DOCS] Fixes typo in start datafeed API docs. (#53811) 2020-03-19 17:55:26 +01:00
István Zoltán Szabó
57321124ea
[DOCS] Changes seconds to milliseconds since the Epoch in AD docs. (#53797) 2020-03-19 15:40:53 +01:00
Tom Veasey
58340c2dbe
[ML] Adds the class_assignment_objective parameter to classification (#52763)
Adds a new parameter for classification that enables choosing whether to assign labels to
maximise accuracy or to maximise the minimum class recall.

Fixes #52427.
2020-03-12 18:39:29 +00:00
István Zoltán Szabó
77ec60baa0
[DOCS] Adds a warning about reindexing docs with the same ID to the PUT DFA docs. (#53490) 2020-03-12 18:00:36 +01:00
Benjamin Trent
4e1f029b04
[ML][Inference] adds new default_field_map field to trained models (#53294)
Adds a new `default_field_map` field to trained model config objects. 

This allows the model creator to supply field map if it knows that there should be some map for inference to work directly against the training data.

The use case internally is having analytics jobs supply a field mapping for multi-field fields. This allows us to use the model "out of the box" on data where we trained on `foo.keyword` but the `_source` only references `foo`.
2020-03-11 12:23:56 -04:00
Dimitris Athanasiou
5a32f50d18
[ML] Rename data frame analytics maximum_number_trees to max_trees (#53300)
Deprecates `maximum_number_trees` parameter of classification and
regression and replaces it with `max_trees`.
2020-03-11 10:33:53 +02:00
István Zoltán Szabó
54b66d3385
[DOCS] Makes the description clearer on how to use aggregations in an anomaly detection job (#53103)
Co-authored-by: lcawl <lcawley@elastic.co>
2020-03-09 09:48:23 +01:00
István Zoltán Szabó
08fcc0b02f
[DOCS] Adds deleting flag to the GET job stats API docs (#53223) 2020-03-06 16:03:09 +01:00
István Zoltán Szabó
870e1891d9
[DOCS] Makes the naming convention of the DFA response objects coherent (#53172) 2020-03-05 16:25:43 +01:00
István Zoltán Szabó
d7fb6416dd
[DOCS] Expands GET DFA stat API docs with response objects. (#53107) 2020-03-05 15:30:30 +01:00
Lisa Cawley
7004216455
[DOCS] Adds link in datafeed indices_options (#53067) 2020-03-03 10:28:54 -08:00
István Zoltán Szabó
24fe7e5899
[DOCS] Adds response body documentation to GET inference API (#53050) 2020-03-03 16:25:24 +01:00
Lisa Cawley
b6534834f9
[DOCS] Adds cat anomaly detectors API (#52866) 2020-02-28 12:15:21 -08:00
Dimitris Athanasiou
dd331935b3
[ML] Parse and report memory usage for DF Analytics (#52778)
Adds reporting of memory usage for data frame analytics jobs.
This commit introduces a new index pattern `.ml-stats-*` whose
first concrete index will be `.ml-stats-000001`. This index serves
to store instrumentation information for those jobs.
2020-02-28 17:35:07 +02:00
Benjamin Trent
d7a63333b5
[ML] Add indices_options to datafeed config and update (#52793)
This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index. 

This is necessary for the following use cases:
 - Reading from frozen indices
 - Allowing certain indices in multiple index patterns to not exist yet

These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object.
 
closes https://github.com/elastic/elasticsearch/issues/48056
2020-02-27 12:22:35 -05:00
Lisa Cawley
42fbca7dc6
[DOCS] Adds cat datafeeds API (#52738) 2020-02-26 09:20:36 -08:00
István Zoltán Szabó
490e8b47e6
[DOCS] Adds cat data frame analytics API (#52764)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-02-26 11:09:37 +01:00
Lisa Cawley
cd069a861c
[DOCS] Updates custom rules example (#52731) 2020-02-25 09:30:14 -08:00
David Roberts
ca80ad69f2
[ML] Use event.timezone in file_structure_finder ingest pipeline (#52720)
This is because beat.timezone was renamed to event.timezone in
elastic/beats#9458
2020-02-25 12:18:53 +00:00
lcawl
b590b49205 [DOCS] Adds anchor for custom rules 2020-02-24 10:04:34 -08:00
Lisa Cawley
f41ebe47e3
[DOCS] Clarifies description of num_top_feature_importance_values (#52246)
Co-Authored-By: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>
2020-02-18 08:48:24 -08:00
Lisa Cawley
ab139244d7
[DOCS] Fixes, sorts ML tagged regions (#52283) 2020-02-12 13:43:21 -08:00
David Kyle
f64c6359ed
[ML] Make Ensemble feature names optional (#51996)
The featureNames field is requisite in individual models but is not required by the Ensemble.
2020-02-07 10:07:18 +00:00
David Roberts
72346b91f9
[ML] Add new categorization stats to model_size_stats (#51879)
This change adds support for the following new model_size_stats
fields:

- categorized_doc_count
- total_category_count
- frequent_category_count
- rare_category_count
- dead_category_count
- categorization_status

Relates #50749
2020-02-06 17:08:43 +00:00
Darren LaCasse
ea67e24b7b
[DOCS] Remove extra word (#51757) 2020-01-31 10:27:37 -08:00
István Zoltán Szabó
67f14c3978
[DOCS] Adds PUT inference API docs (#51231)
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-01-31 13:12:24 +01:00
Lisa Cawley
32adcd2c9d
[DOCS] Adds missing testenv attribute (#51719) 2020-01-30 16:13:26 -08:00
David Roberts
a5a2e4eaee
[ML] Use CSV ingest processor in find_file_structure ingest pipeline (#51492)
Changes the find_file_structure response to include a CSV
ingest processor in the ingest pipeline it suggests.

Previously the Kibana file upload functionality parsed CSV
in the browser, but by parsing CSV in the ingest pipeline
it makes the Kibana file upload functionality more easily
interchangable with Filebeat such that the configurations
it creates can more easily be used to import data with the
same structure repeatedly in production.
2020-01-28 12:46:00 +00:00
István Zoltán Szabó
85e581282d
[DOCS] Refines description. (#51400) 2020-01-24 13:31:44 +01:00
Benjamin Trent
c9e285c1e6
[ML][Inference] add tags url param to GET (#51330)
Adds a new URL parameter, `tags` to the GET _ml/inference/<model_id> endpoint.

This parameter allows the list of models to be further reduced to those who contain all the provided tags.
2020-01-24 07:30:56 -05:00
Lisa Cawley
789aeaedab
[DOCS] Updates categorization examples with wizard screenshots (#51133) 2020-01-22 11:26:10 -08:00
Lisa Cawley
551a83a2ff
[DOCS] Clarify interval, frequency, and bucket span in ML APIs and example (#51280) 2020-01-22 08:08:31 -08:00
David Kyle
7978f0b8ef
[ML] Calculate results and snapshot retention using latest bucket timestamps (#51061)
The retention period is calculated relative to the last bucket result or snapshot
time rather than wall clock
2020-01-22 10:08:41 +00:00