In larger clusters with complicated datafeed requirements, being able to preview only a specific window of time is important. Previously, datafeed previews always started at 0 (i.e. from the beginning of the data). This caused issues if the index pattern contained indices on slower hardware, even when the datafeed's actual `start` time was set to more recent data (and thus to indices on faster hardware).
Additionally, when `_preview` is unbounded (as before), it now attempts to preview only indices that are NOT frozen or cold. This is done through a query against the `_tier` field, so it only affects newer indices that actually have that field set.
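A minimal sketch of a bounded preview, assuming the window is supplied via `start` and `end` query parameters (the datafeed name and timestamps are illustrative):
```
GET _ml/datafeeds/datafeed-my-job/_preview?start=2022-05-01T00:00:00Z&end=2022-05-02T00:00:00Z
```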
This commit adds `search_interval` to the datafeed stats API
`running_state` object. When the datafeed is running, it reports
the last search interval that was searched. This is useful for
understanding the point in time at which the datafeed is currently
searching.
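For illustration, a running datafeed's stats might then include something like the following (`start_ms` and `end_ms` are assumed field names for the epoch-millisecond bounds of the interval):
```
"running_state": {
  "is_real_time": true,
  "look_back_finished": true,
  "search_interval": {
    "start_ms": 1654041600000,
    "end_ms": 1654045200000
  }
}
```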
Closes #82405
For new jobs, when the analysis config field model_prune_window is not set, a default value of 30 days or 20 times the bucket span, whichever is greater, is used.
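For example, a job with a 1 hour bucket span gets the 30 day floor (20 × 1 hour is far less than 30 days), while a job with a 2 day bucket span gets 40 days (20 × 2 days).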
Co-authored-by: David Roberts <dave.roberts@elastic.co>
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Previously the ML model snapshot upgrade endpoint did not
provide a way to reliably monitor progress. This could lead
to the upgrade assistant UI thinking that a model snapshot
upgrade had finished when it actually hadn't.
This change adds a new "stats" API that allows external
interested parties to find out the status of each model
snapshot upgrade and which node (if any) each is running on.
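For example, a sketch of polling the new stats endpoint (the path mirrors the existing model snapshot upgrade endpoint and should be treated as illustrative):
```
GET _ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>/_upgrade/_stats
```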
Fixes #81519
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible.
Relates to #79309, #31619
The char filter replaces the previous default of `first_non_blank_line`.
`first_non_blank_line` worked well for finding the first line that contained any characters at all, but log lines
like the following were handled poorly:
```
--------------------------------------------------------------------------------
Alias 'foo' already exists and this prevents setting up ILM for logs
--------------------------------------------------------------------------------
```
When combined with the `ml_standard` tokenizer, the first line was used:
```
--------------------------------------------------------------------------------
```
This line contains no valid tokens for our standard tokenizer, so no tokens were found by the `ml_standard` tokenizer.
The new filter, `first_line_with_letters`, returns the first line containing any letter character (i.e. any character for which `Character#isLetter` returns true).
Given the previously poorly handled log, when combined with our `ml_standard` tokenizer we now get the following, more appropriate tokens:
```
"tokens" : ["Alias", "foo", "already", "exists", "and", "this", "prevents", "setting", "up", "ILM", "for", "logs"]
```
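As a sketch, a categorization analyzer combining the new char filter with the `ml_standard` tokenizer might be configured like this (the surrounding job config is omitted):
```
"categorization_analyzer": {
  "char_filter": [ "first_line_with_letters" ],
  "tokenizer": "ml_standard"
}
```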
In #75617 a new setting, system_annotations_retention_days, was
added to control how long system annotations are retained for.
We now feel that this setting is redundant and that system
annotations should be retained for the same period as results.
This is intuitive and defensible, as system annotations can be
considered a type of result.
Follow-up to #75617
Previously, attempting to delete a job that had a datafeed
would return an exception. However, this was unnecessarily
pedantic: the user would always want to delete both job
and datafeed together, and would react by deleting the
datafeed and then deleting the job again.
This change makes the delete job API automatically delete
a datafeed associated with the job. The same level of
force is used for this delete datafeed request as was used
on the delete job request. This means that it's possible
to force-delete an open job with a started datafeed (since
force-delete datafeed will automatically stop a started
datafeed). It's still not possible to delete an opened job
without using force.
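For example, assuming a job named `my-job` with a started datafeed, a single call now removes both (the job name is illustrative):
```
DELETE _ml/anomaly_detectors/my-job?force=true
```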
Add configuration for pruning dead split fields in anomaly detection
jobs via the `model_prune_window` field for both the job creation and
update APIs.
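A minimal sketch of setting the field at creation time (the job name, bucket span, and detector are illustrative):
```
PUT _ml/anomaly_detectors/my-job
{
  "analysis_config": {
    "bucket_span": "1h",
    "detectors": [ { "function": "count" } ],
    "model_prune_window": "30d"
  },
  "data_description": { "time_field": "@timestamp" }
}
```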
Relates to ml-cpp/#1962
This is a quality-of-life improvement for typical users. Almost all anomaly detection jobs receive their data through a datafeed.
The datafeed config can now be supplied within the job config when creating a job, and is returned in the job's datafeed field when getting jobs.
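As a sketch, creating a job together with its datafeed in one request might look like this (the `datafeed_config` key and the other values are assumptions for illustration):
```
PUT _ml/anomaly_detectors/my-job
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [ { "function": "count" } ]
  },
  "data_description": { "time_field": "@timestamp" },
  "datafeed_config": {
    "indices": [ "my-logs-*" ],
    "query": { "match_all": {} }
  }
}
```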
Previously, the close job API required that any datafeed
associated with the job be stopped before the job could be
closed. Experience has shown that this is just a pedantic
nuisance. If a user closes the job without first stopping
the datafeed then it's simply a mistake, and they then have
to make two further calls: one to stop the datafeed and
another to attempt to close the job again.
This PR changes the behaviour so that if you ask to close a job
whose datafeed is running then the datafeed gets stopped first
as part of the same call. Datafeeds are stopped with the same
level of force as the job close request specified.
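For example, this single call now also stops a started datafeed before closing the job (the job name is illustrative):
```
POST _ml/anomaly_detectors/my-job/_close
```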
Adds a new API that allows a user to reset
an anomaly detection job.
To use the API do:
```
POST _ml/anomaly_detectors/<job_id>/_reset
```
The API removes all data associated with the job.
In particular, it deletes model state, results and stats.
However, job notifications and user annotations are not removed.
Also, the API can be called asynchronously by setting the parameter
`wait_for_completion` to `false` (defaults to `true`). When run
that way the API returns the task id for further monitoring.
In order to prevent the job from opening while it is resetting,
a new job field has been added called `blocked`. It is an object
that contains a `reason` and the `task_id`. `reason` can take
a value from ["delete", "reset", "revert"] as all these
operations should block the job from opening. The `task_id` is also
included in order to allow tracking the task if necessary.
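For illustration, a job undergoing a reset might then report something like the following (the task id is a placeholder):
```
"blocked": {
  "reason": "reset",
  "task_id": "<task_id>"
}
```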
Finally, this commit also sets the `blocked` field when
the revert snapshot API is called, as a job should not be opened
while it is being reverted to a different model snapshot.
It is useful to know the following information when reading datafeed stats:
- Is the datafeed a "real-time" datafeed, i.e. a datafeed without a configured `end` time?
- Has the datafeed processed all past data that was available at the time it started?
This object is only available if the datafeed task has been created.
It has the form:
```
"running_state": {
"is_real_time": <boolean>,
"look_back_finished": <boolean>
}
```
Categorization jobs created once the entire cluster is upgraded to
version 7.14 or higher will default to using the new ml_standard
tokenizer rather than the previous default of the ml_classic
tokenizer, and will incorporate the new first_non_blank_line char
filter so that categorization is based purely on the first non-blank
line of each message.
The difference between the ml_classic and ml_standard tokenizers
is that ml_classic splits on slashes and colons, so creates multiple
tokens from URLs and filesystem paths, whereas ml_standard attempts
to keep URLs, email addresses and filesystem paths as single tokens.
It is still possible to configure the ml_classic tokenizer if you
prefer: just provide a categorization_analyzer within your
analysis_config and whichever tokenizer you choose (which could be
ml_classic or any other Elasticsearch tokenizer) will be used.
To opt out of using first_non_blank_line as a default char filter,
you must explicitly specify a categorization_analyzer that does not
include it.
If no categorization_analyzer is specified but categorization_filters
are specified, then the categorization filters are converted to char
filters that are applied after first_non_blank_line.
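As a sketch, explicitly opting out of the new defaults might look like this (the categorization field name is illustrative):
```
"analysis_config": {
  "categorization_field_name": "message",
  "categorization_analyzer": {
    "tokenizer": "ml_classic"
  }
}
```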
Closes elastic/ml-cpp#1724
This commit increases xpack.ml.max_open_jobs from 20 to 512. Additionally, nodes that cannot provide an accurate view into their native memory are now ignored for job assignment.
This effectively fixes a bug with autoscaling. Autoscaling relies on being able to assign jobs to nodes with adequate memory; if assignment is instead limited by xpack.ml.max_open_jobs, scaling decisions are hampered.
Previously, a datafeed and job had to already exist for the `_preview` API to work.
With this change, users can get an accurate preview of the data that will be sent to the anomaly detection job
without creating either of them.
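A minimal sketch, assuming inline `datafeed_config` and `job_config` objects in the request body (the index and settings are illustrative):
```
POST _ml/datafeeds/_preview
{
  "datafeed_config": {
    "indices": [ "my-logs-*" ]
  },
  "job_config": {
    "analysis_config": {
      "bucket_span": "15m",
      "detectors": [ { "function": "count" } ]
    },
    "data_description": { "time_field": "@timestamp" }
  }
}
```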
Closes https://github.com/elastic/elasticsearch/issues/70264