552 commits

Author SHA1 Message Date
Lisa Cawley
a57db419f4
[DOCS] Remove experimental tag from find structure API (#68153) (#68156) 2021-01-28 13:26:46 -08:00
Dimitris Athanasiou
9e55623c29
[7.x][ML] Expand regression/classification hyperparameters (#67950) (#67983)
Expands data frame analytics regression and classification
analyses with the following hyperparameters:

- alpha
- downsample_factor
- eta_growth_rate_per_tree
- max_optimization_rounds_per_hyperparameter
- soft_tree_depth_limit
- soft_tree_depth_tolerance

Backport of #67950
2021-01-26 15:48:13 +02:00
Benjamin Trent
a324055310
[7.x] [ML] move find file structure finder in Rest high Level client to its new endpoint and plugin (#67290) (#67510)
* [ML] move find file structure finder in Rest high Level client to its new endpoint and plugin (#67290)

Find file structure finder is now its own plugin, and separated from the ml plugin.

This commit updates the rest high level client to reflect this.

Additionally, this adjusts the internal and client object names from `FileStructure` to the more general `TextStructure`
2021-01-14 09:59:34 -05:00
Yang Wang
f0715f9a4b
Deprecate the id field for the InvalidateApiKey API (#66317) (#66670)
This PR deprecates the usage of the id field in the payload for the
InvalidateApiKey API. The ids field introduced in #63224 is now the recommended
way for performing (bulk) API key invalidation.

This PR also includes the test fix from #66696
2020-12-22 12:56:40 +11:00
David Kyle
5fec2538ca
[ML] Docs and HLRC for datafeed runtime mappings (#65810) (#66007)
For the changes in #65606
2020-12-08 11:04:21 +00:00
James Rodewig
46c99eca51
[DOCS] Fix typos (#65124) (#65153)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: Johnny Lim <izeye@naver.com>
2020-11-17 12:52:11 -05:00
Benjamin Trent
39f5f39dc2
[7.x] [ML] add new snapshot upgrader API for upgrading older snapshots (#64665) (#65010)
* [ML] add new snapshot upgrader API for upgrading older snapshots (#64665)

This new API provides a way for users to upgrade their own anomaly job
model snapshots.

To upgrade a snapshot the following is done:
- Open a native process given the job id and the desired snapshot id
- Load the snapshot to the process
- Write the snapshot again from the native task (now updated via the
  native process)

relates #64154
2020-11-17 11:30:47 -05:00
James Rodewig
944568e3dd
[DOCS] Fix "the the" typos (#64344) (#64353)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-10-29 11:50:02 -04:00
Jake Landis
86834e6b92
[7.x] Update getting-started.asciidoc for Java version (#63106) (#64084)
Update client documentation to state "at least" Java 1.8

Co-authored-by: junmuz <mjunaidmuzammil@gmail.com>
2020-10-27 11:52:02 -05:00
István Zoltán Szabó
b822e582c3
[DOCS] Changes experimental flag to beta in DFA related docs (#63992) (#64176) 2020-10-26 18:04:21 +01:00
Benjamin Trent
b9dc522cb4
[7.x] [ML] adding new flag exclude_generated that removes generated fields in GET config APIs (#63899)(#63092) (#63177)
* [ML] adding for_export flag for ml plugin GET resource APIs (#63092)

This adds the new `for_export` flag to the following APIs:

- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The flag is designed for cloning or exporting configuration objects to later be put into the same cluster or a separate cluster.

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that would effectively require changing to be of use (e.g. datafeed job_id)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

closes https://github.com/elastic/elasticsearch/issues/63055

* [ML] adding new flag exclude_generated that removes generated fields in GET config APIs (#63899)

When exporting and cloning ml configurations in a cluster it can be
frustrating to remove all the fields that were generated by
the plugin, especially as the number of these fields changes
from version to version.

This flag, exclude_generated, allows the GET config APIs to return
configurations with these generated fields removed.

APIs supporting this flag:
- GET _ml/anomaly_detection/<job_id>
- GET _ml/datafeeds/<datafeed_id>
- GET _ml/data_frame/analytics/<analytics_id>

The following fields are not returned in the objects:

- any field that is not user settable (e.g. version, create_time)
- any field that is a calculated default value (e.g. datafeed chunking_config)
- any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by)

relates to #63055
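
To make the effect of `exclude_generated` concrete, here is a minimal client-side sketch of the same idea: removing generated fields from a configuration before re-using it. It is an illustration only, not the plugin's implementation. The helper name `strip_generated` and the sample config are hypothetical, and the field list contains only the examples named in the message above; the real set is decided server-side and may differ between versions.

```python
import copy

# Example generated fields named in the commit message (assumption: the
# server-side list is longer and version dependent).
GENERATED_TOP_LEVEL = {"version", "create_time"}


def strip_generated(config):
    """Return a copy of `config` without the example generated fields."""
    cleaned = copy.deepcopy(config)
    for field in GENERATED_TOP_LEVEL:
        cleaned.pop(field, None)
    # custom_settings.created_by is set by another Elastic Stack process.
    custom = cleaned.get("custom_settings")
    if isinstance(custom, dict):
        custom.pop("created_by", None)
    return cleaned


# Hypothetical anomaly job config, for illustration only.
job = {
    "job_id": "my-job",
    "version": "7.10.0",
    "create_time": 1603200000000,
    "custom_settings": {"created_by": "ml-module", "team": "ops"},
}
exportable = strip_generated(job)
```

The original dictionary is left untouched, so the same config can still be inspected in full after the exportable copy is made.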
2020-10-20 12:42:52 -04:00
Benjamin Trent
b92cbcd41a
[Transform] add new exclude_generated flag to GET transform (#63093) (#63947)
This adds a new flag `exclude_generated` for GET transform API.

This flag is useful for when a transform needs to be cloned within a cluster or exported/imported between clusters.

It removes certain fields that are not able to be set via the PUT api (e.g. version, create_time).

relates https://github.com/elastic/elasticsearch/issues/63055
2020-10-20 12:38:41 -04:00
Lyudmila Fokina
e518bd76e7
Adding authentication information to access token create APIs (#62490) (#63841)
* Adding authentication information to access token create APIs (#62490)

* Adding authentication information to access token create APIs

Adding authentication object to following APIs:
/_security/oauth2/token
/_security/delegate_pki
/_security/saml/authenticate
/_security/oidc/authenticate

Resolves: #59685
(cherry picked from commit 51dbd9e584)

* Addressing PR comments, fixing tests

* Returning tokenGroups attribute as SID string instead of byte array (AD metadata)

Addressing PR comments

* Returning tokenGroups attribute as SID string instead of byte array (AD metadata)

Update version check

* Returning tokenGroups attribute as SID string instead of byte array (AD metadata)

Update version check

* Addressing more PR comments

* Adding more to integration tests + some small fixes

* Nit fixes and formatting following #62490 comments (#63797)

* Nit fixes and formatting following #62490 comments

Resolves: #63792

* Nit fixes and formatting following #62490 comments

Resolves: #63792

* Nit fixes and formatting following #62490 comments
Fixing username

* Nit fixes and formatting following #62490 comments
Fixing formatting

* Fixing merge conflicts

* Fixing merge conflicts
2020-10-16 20:50:03 +02:00
Przemysław Witek
bb7df2eb5f
[ML] Allow setting num_top_classes to a special value -1 (#63587) (#63601) 2020-10-13 14:00:12 +02:00
Przemysław Witek
a97bd5b787
[7.x] [ML] Validate that AucRoc has the data necessary to be calculated (#63302) (#63453) 2020-10-08 09:31:45 +02:00
Lisa Cawley
8f76c89cd3
[7.x][DOCS] Add feature_importance_baseline to get trained model API (#63279) (#63336)
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2020-10-06 10:08:34 -07:00
Yang Wang
7969fbb4ab
Cache API key doc to reduce traffic to the security index (#59376) (#63319)
Getting the API key document from the security index is the most time-consuming part
of the API Key authentication flow (>60% if index is local and >90% if index is remote).
This traffic is now avoided by caching added with this PR.

Additionally, we add a cache invalidator registry so that clearing of different caches will
be managed in a single place (requires follow-up PRs).
2020-10-06 23:49:23 +11:00
Lisa Cawley
22aea11016
[DOCS] Add experimental tag to rollup APIs (#63206) 2020-10-05 13:22:11 -07:00
Lisa Cawley
ce23c38e96
[DOCS] Add find file structure API to HLRC docs (#63212) (#63261) 2020-10-05 11:37:44 -07:00
Lisa Cawley
4de6104dae
[DOCS] Fix titles for ML APIs (#63152) (#63207) 2020-10-02 14:01:01 -07:00
Lisa Cawley
57ea5d27ae
[DOCS] Add experimental tag to data frame analytics APIs (#63153) 2020-10-02 09:44:40 -07:00
Benjamin Trent
cfcf973259
[7.x] [ML] renames */inference* apis to */trained_models* (#63097) (#63136)
* [ML] renames */inference* apis to */trained_models* (#63097)

This commit renames all `inference` CRUD APIs to `trained_models`.

This aligns with internal terminology, documentation, and use-cases.
2020-10-02 07:34:28 -04:00
Przemysław Witek
d677a2b8ee
[7.x] [ML] Implement AucRoc metric for classification - HLRC (#62304) (#63058) 2020-09-30 14:04:10 +02:00
Benjamin Trent
e163559e4c
[7.x] [ML] Add new include flag to GET inference/<model_id> API for model training metadata (#61922) (#62620)
* [ML] Add new include flag to GET inference/<model_id> API for model training metadata (#61922)

Adds new flag include to the get trained models API
The flag initially has two valid values: definition, total_feature_importance.
Consequently, the old include_model_definition flag is now deprecated.
When total_feature_importance is included, the total_feature_importance field is included in the model metadata object.
Including definition is the same as previously setting include_model_definition=true.
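
The relationship between the new `include` flag and the deprecated `include_model_definition` flag can be sketched as a small helper that builds query parameters. This is a hedged illustration, not client code from the commit: the function name is hypothetical, and joining multiple values with commas is an assumption about how the parameter is encoded.

```python
# The two values the commit message lists as initially valid.
VALID_INCLUDES = ("definition", "total_feature_importance")


def include_query_params(include=(), include_model_definition=False):
    """Build query parameters, folding the deprecated flag into `include`."""
    requested = list(include)
    for value in requested:
        if value not in VALID_INCLUDES:
            raise ValueError(f"unsupported include value: {value}")
    # include_model_definition=true is equivalent to include=definition.
    if include_model_definition and "definition" not in requested:
        requested.append("definition")
    # Assumption: multiple values are sent as one comma-separated string.
    return {"include": ",".join(requested)} if requested else {}
```

Callers that still pass the old boolean get the same result as callers that ask for `definition` directly, which is the equivalence the message describes.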

* fixing test

* Update x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ml/action/GetTrainedModelsRequestTests.java
2020-09-18 10:07:35 -04:00
Lisa Cawley
bc5eec8205
[DOCS] Fix capitalization in HLRC ML APIs (#62010) (#62012) 2020-09-04 16:57:15 -07:00
James Rodewig
054a64d66f
[DOCS] Fix old NodeSelector field in Low Level REST Client (#61551) (#61718)
Co-authored-by: Manabu Matsuzaki <matsumana@users.noreply.github.com>
2020-08-31 10:07:58 -04:00
Benjamin Trent
1ae2923632
[7.x] [ML] adding docs + hlrc for data frame analysis feature_processors (#61149) (#61493)
* [ML] adding docs + hlrc for data frame analysis feature_processors (#61149)

Adds HLRC and some docs for the new feature_processors field in Data frame analytics.

Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-08-24 12:56:21 -04:00
James Rodewig
da89ff87bb
[DOCS] Prune Search your data content (#61303) (#61462)
Changes:
* Removes narrative around URI searches. These aren't commonly used in production. The `q` param is already covered in the search API docs: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-search.html#search-api-query-params-q
* Adds a common options section that highlights narrative docs for query DSL, aggregations, multi-index search, search fields, pagination, sorting, and async search.
* Adds a `Search shard routing` page. Moves narrative docs for adaptive replica selection, preference, routing, and shard limits to that section.
* Moves search timeout and cancellation content to the `Search your data` page.
* Creates a `Search multiple data streams and indices` page. Moves related narrative docs for multi-target syntax searches and `indices_boost` to that page.
* Removes narrative examples for the `search_type` parameters. Moves documentation for this parameter to the search API docs.
2020-08-24 09:31:53 -04:00
Yang Wang
cd52233b94
Include authentication type for the authenticate response (#61247) (#61411)
Add a new "authentication_type" field to the response of "GET _security/_authenticate".
2020-08-21 22:59:43 +10:00
James Rodewig
e63c12f443
[DOCS] Fix typo in Java HLRC docs (#60863) (#61264)
Co-authored-by: bumjin <bumjin@gmail.com>
2020-08-18 09:09:10 -04:00
James Rodewig
60876a0e32
[DOCS] Replace Wikipedia links with attribute (#61171) (#61209) 2020-08-17 11:27:04 -04:00
James Rodewig
929f1cc9f9
[DOCS] Remove search request body page (#60972) (#60977) 2020-08-11 13:04:07 -04:00
Mark Tozzi
fb7c431d8d
[DOCS] Update snapshot repo usage (#60791) (#60831)
Clarify how to use our snapshot repository.  Several folks were confused about this just now, including myself.
2020-08-06 12:29:21 -04:00
James Rodewig
5a2c6f0d4f
[DOCS] http -> https, remove outdated plugin docs (#60380) (#60545)
Plugin discovery documentation contained information about installing
Elasticsearch 2.0 and installing an Oracle JDK, both of which are no
longer valid.

Because the instructions used cleartext HTTP to install packages,
this commit also replaces HTTP links with HTTPS where possible.

In addition a few community links have been removed, as they do not seem
to exist anymore.

Co-authored-by: Alexander Reelsen <alexander@reelsen.net>
2020-07-31 16:16:31 -04:00
Lisa Cawley
46d33b1586
[DOCS] 7.9.0 release notes (#60053) 2020-07-22 08:40:59 -07:00
Przemysław Witek
283a1f605c
Rename binary_soft_classification evaluation to outlier_detection (#59951) (#59970) 2020-07-21 15:15:04 +02:00
James Baiera
55c4dec360
LLRC RequestOptions add RequestConfig (#57972) (#59344)
Different kinds of requests may need different request options from the client 
default. Users can optionally set RequestConfig on a single request's 
RequestOptions to override the default. Without this, socketTimeout can only
be set at RestClient initialization.

Co-authored-by: weizijun <weizijun1989@gmail.com>
2020-07-14 13:34:53 -04:00
David Kyle
a6a27b76dc
Fix broken links to aggregation javadoc (#59083) (#59319)
Fixes links from the Java High Level Rest Client to the aggregations java docs
2020-07-11 13:28:03 +01:00
Dimitris Athanasiou
b2243337d8
[7.x][ML] Data frame analytics max_num_threads setting (#59254) (#59308)
This adds a setting to data frame analytics jobs called
`max_num_threads`. The setting expects a positive integer.
When used the user specifies the max number of threads that may
be used by the analysis. Note that the actual number of threads
used is limited by the number of processors on the node where
the job is assigned. Also, the process may use a couple more threads
for operational functionality that is not the analysis itself.

This setting may also be updated for a stopped job.

More threads may reduce the time it takes to complete the job at the cost
of using more CPU.

Backport of #59254 and #57274
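
The thread limit described above amounts to taking the smaller of the requested value and the node's processor count. A minimal sketch follows, assuming a hypothetical helper name and ignoring the few extra operational threads the message mentions; it is not the job's actual scheduling code.

```python
def effective_analysis_threads(max_num_threads, node_processors):
    """Threads the analysis itself may use, per the description above."""
    if max_num_threads < 1:
        raise ValueError("max_num_threads must be a positive integer")
    if node_processors < 1:
        raise ValueError("node_processors must be a positive integer")
    # The setting is an upper bound; the node's processor count caps it.
    return min(max_num_threads, node_processors)
```

For example, asking for 16 threads on a 4-processor node yields 4, while asking for 2 on the same node yields 2.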
2020-07-09 19:15:46 +03:00
David Kyle
c5443f78ce
Add Inference Pipeline aggregation to HLRC (#59086) (#59250)
Adds InferencePipelineAggregationBuilder to the HLRC duplicating 
the server side classes
2020-07-09 13:38:45 +01:00
Yang Wang
a5a8b4ae1d
Add cache for application privileges (#55836) (#58798)
Add caching support for application privileges to reduce number of round-trips to security index when building application privilege descriptors.

Privilege retrieval in NativePrivilegeStore is changed to always fetch all privilege documents for a given application. The caching is applied in all places, including the "get privilege" and "has privileges" APIs and CompositeRolesStore (for authentication).
2020-07-02 11:50:03 +10:00
Przemysław Witek
909649dd15
[7.x] Implement pseudo Huber loss (PseudoHuber) evaluation metric for regression analysis (#58734) (#58825) 2020-07-01 14:52:06 +02:00
Przemysław Witek
9ea9b7bd3b
[7.x] Implement MSLE (MeanSquaredLogarithmicError) evaluation metric for regression analysis (#58684) (#58731) 2020-06-30 14:09:11 +02:00
Przemysław Witek
3f7c45472e
[7.x] Introduce DataFrameAnalyticsConfig update API (#58302) (#58648) 2020-06-29 10:56:11 +02:00
Jason Tedor
be08268562
Allow follower indices to override leader settings (#58103)
Today when creating a follower index via the put follow API, or via an
auto-follow pattern, it is not possible to specify settings overrides
for the follower index. Instead, we copy all of the leader index
settings to the follower. Yet, there are cases where a user would want
some different settings on the follower index such as the number of
replicas, or allocation settings. This commit addresses this by allowing
the user to specify settings overrides when creating follower index via
manual put follower calls, or via auto-follow patterns. Note that not
all settings can be overridden (e.g., index.number_of_shards) so we also
have detection that prevents attempting to override settings that must
be equal between the leader and follower index. Note that we do not even
allow specifying such settings in the overrides, even if they are
specified to be equal between the leader and the follower
index. Instead, they must be implicitly copied from the leader index, not
explicitly set by the user.
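
The override rule can be sketched as a merge with a rejection step. This is an illustration under stated assumptions, not the CCR implementation: the function name is hypothetical, and the non-overridable set contains only `index.number_of_shards`, the single example the message gives.

```python
# Only example named in the commit message; the real list is longer (assumption).
NON_OVERRIDABLE = frozenset({"index.number_of_shards"})


def follower_settings(leader_settings, overrides):
    """Copy leader settings, then apply permitted follower overrides."""
    rejected = sorted(NON_OVERRIDABLE.intersection(overrides))
    if rejected:
        # Rejected even when the value equals the leader's value: such
        # settings must be copied implicitly, never set explicitly.
        raise ValueError(f"settings cannot be overridden: {rejected}")
    merged = dict(leader_settings)
    merged.update(overrides)
    return merged
```

Passing `{"index.number_of_replicas": 0}` as an override changes only that setting, while passing `index.number_of_shards` raises an error regardless of its value.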
2020-06-18 11:56:06 -04:00
David Kyle
39020f3900
HLRC for delete expired data by job Id (#57722) (#57975)
High level rest client changes for #57337
2020-06-12 09:44:17 +01:00
Dimitris Athanasiou
f49a14ce6f
[7.x][ML] Fix race condition when force stopping DF analytics job (#57680) (#57717)
When we force delete a DF analytics job, we currently first force
stop it and then we proceed with deleting the job config.
This may result in logging errors if the job config is deleted
before it is retrieved while the job is starting.

Instead of force stopping the job, it would make more sense to
try to stop the job gracefully first. So we now try that out first.
If normal stop fails, then we resort to force stopping the job to
ensure we can go through with the delete.

In addition, this commit introduces `timeout` for the delete action
and makes use of it in the child requests.
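
The stop-then-delete order described above can be sketched as follows. The `stop` and `delete` callables are hypothetical stand-ins for the child requests, and treating any exception as a failed graceful stop is a simplification made for this sketch.

```python
def force_delete_job(stop, delete, timeout):
    """Try a graceful stop first, fall back to force stop, then delete."""
    try:
        stop(force=False, timeout=timeout)
    except Exception:
        # A normal stop failed, so force stop to make sure the delete
        # can go through.
        stop(force=True, timeout=timeout)
    # The same timeout is passed on to the child delete request.
    delete(timeout=timeout)
```

Trying the graceful path first gives a starting job the chance to finish reading its config before that config is removed.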

Backport of #57680
2020-06-05 17:50:01 +03:00
Lisa Cawley
db5bf92acf
[7.x][DOCS] Replace docdir attribute with es-repo-dir (#57489) (#57494) 2020-06-01 16:42:53 -07:00
Benjamin Trent
35d5126cea
[7.x] [ML] adds new for_export flag to GET _ml/inference API (#57351) (#57368)
* [ML] adds new for_export flag to GET _ml/inference API (#57351)

Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API.

This flag is useful for moving models between clusters.
2020-05-29 14:01:08 -04:00
Benjamin Trent
c8374dc9f3
[ML] add max_model_memory parameter to forecast request (#57254) (#57355)
This adds a max_model_memory setting to forecast requests. 
This setting can take a string value that is formatted according to byte sizes (e.g. "50mb", "150mb").

The default value is `20mb`.

There is a HARD limit at `500mb` which will throw an error if used.

If the limit is larger than 40% of the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited.
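
The limits above can be read as a small clamping rule, sketched here in bytes. Several details are assumptions made for illustration: the helper name is hypothetical, a request at exactly `500mb` is treated as an error, and "strictly lower" is modelled as one byte below the 40% mark. The server's actual arithmetic may differ.

```python
MB = 1024 * 1024
DEFAULT_FORECAST_BYTES = 20 * MB   # default stated in the message
HARD_LIMIT_BYTES = 500 * MB        # hard limit stated in the message


def effective_forecast_memory(job_limit_bytes, requested_bytes=DEFAULT_FORECAST_BYTES):
    """Return the forecast memory limit after applying the described rules."""
    # Assumption: requesting the hard limit itself (or more) is an error.
    if requested_bytes >= HARD_LIMIT_BYTES:
        raise ValueError("max_model_memory must be below 500mb")
    # 40% of the anomaly job's configured model limit, rounded down.
    cap = job_limit_bytes * 40 // 100
    if requested_bytes > cap:
        # Assumption: reduce to one byte under the 40% mark.
        return max(cap - 1, 0)
    return requested_bytes
```

With a 1024 MB job limit the 20 MB default passes through unchanged; with a 40 MB job limit the same request is reduced to just under 16 MB.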

related native change: https://github.com/elastic/ml-cpp/pull/1238

closes: https://github.com/elastic/elasticsearch/issues/56420
2020-05-29 11:16:08 -04:00