elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-25 15:47:23 -04:00

Author	SHA1	Message	Date
Lisa Cawley	401d302c69	[DOCS] Move find file structure to a new API endpoint (#67314 )	2021-01-12 11:59:45 -08:00
Benjamin Trent	af179ab2f5	[ML] move find file structure to a new API endpoint (#67123 ) This introduces a new `text-structure` plugin. This is the new home of the find file structure API. The old REST URL is still available but is deprecated. The new URL is: `_text_structure/find_structure`. All parameters and behavior are unchanged. Changes to the high-level REST client and docs will be in a separate commit. related to: https://github.com/elastic/elasticsearch/issues/67001	2021-01-11 08:56:02 -05:00
Lisa Cawley	eff9dfc3a4	[DOCS] Clarify impact of delayed data in anomaly detection (#66816 ) Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>	2021-01-05 12:14:51 -08:00
David Roberts	c5bef7f9a7	[ML] Deprecate anomaly detection post data endpoint (#66347 ) There is little evidence of this endpoint being used and there is quite a lot of code complexity associated with the various formats that can be used to upload data and the different errors that can occur when direct data upload is open to end users. In a future release we can make this endpoint internal so that only datafeeds can use it, and remove all the options and formats that are not used by datafeeds. End users will have to store their input data for anomaly detection in Elasticsearch indices (which we believe all do today) and use a datafeed to feed it to anomaly detection jobs.	2020-12-15 18:37:20 +00:00
Dimitris Athanasiou	3bed6661de	[ML] Add log_time to AD data_counts and decide current based on it (#66343 ) This commit is fixing a potential bug if we support anomaly detection results index rollover in the future. In particular, we determine the current `data_counts` by sorting on the latest record time. However, this is not correct if the job reverts to an older model snapshot. To fix this we add `log_time` to `data_counts` (similarly to `model_size_stats`) and sort on `log_time` to figure out the current counts for the job.	2020-12-15 19:09:13 +02:00
István Zoltán Szabó	bc989e4a86	[DOCS] Adds note about data_counts values to Revert snapshot API docs. (#66085 )	2020-12-09 10:47:51 +01:00
István Zoltán Szabó	3081cf4944	[DOCS] Adds empty snapshot_id description to revert snapshot API docs (#66036 )	2020-12-09 10:01:26 +01:00
David Kyle	22dadfd407	[ML] Docs and HLRC for datafeed runtime mappings (#65810 ) For the changes in #65606	2020-12-08 10:06:58 +00:00
David Roberts	49e492f313	[ML] Adding assignment_memory_basis to model_size_stats (#65561 ) At present the Java code makes a decision on whether to use current model memory or model memory limit to calculate how much memory a job requires to be assigned. The plan is to move this decision to the C++ code, which will report it via a new field in the model size stats. An additional change will be that once we have made the switch from using model memory limit to using current model memory we will never switch back, as this causes large fluctuations up and down in memory requirement which will be much more noticeable when autoscaling is in use. Although the only two options at present are model memory limit and current model memory, the new enum includes a third possibility, peak model memory. To switch to this now would be tricky, as there have been two bugs in the implementation of peak model memory which render its value unreliable in 7.x. However, in 8.x it might make sense to switch to using peak model memory instead of current model memory and it's much easier from a BWC perspective if the enum contains all the values from the start. Relates #63163	2020-12-03 17:18:08 +00:00
István Zoltán Szabó	a85fb5534a	[DOCS] Fixes typo in Aggregating data for faster performance. (#65354 )	2020-11-23 12:44:59 +01:00
István Zoltán Szabó	f1e54a63a1	[DOCS] Adds UI related limitation to configuring aggs docs (#65184 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-11-20 19:03:18 +01:00
István Zoltán Szabó	1e045da339	[DOCS] Makes the screenshot larger on the custom URLs page. (#65269 )	2020-11-20 09:29:39 +01:00
David Roberts	e4ce39845b	[ML] Add total ML memory to ML info (#65195 ) This change adds an extra piece of information, limits.total_ml_memory, to the ML info response. This returns the total amount of memory that ML is permitted to use for native processes across all ML nodes in the cluster. Some of this may already be in use; the value returned is total, not available ML memory.	2020-11-18 15:06:21 +00:00
Lisa Cawley	9fef6e7b7e	[DOCS] Adds new snapshot upgrade API (#65095 )	2020-11-16 09:48:07 -08:00
Benjamin Trent	33de89d94c	[ML] add new snapshot upgrader API for upgrading older snapshots (#64665 ) This new API provides a way for users to upgrade their own anomaly job model snapshots. To upgrade a snapshot the following is done: - Open a native process given the job id and the desired snapshot id - load the snapshot to the process - write the snapshot again from the native task (now updated via the native process) relates #64154	2020-11-12 10:45:56 -05:00
István Zoltán Szabó	9ed907bc75	[DOCS] Fixes example aggregation syntax in datafeed aggregations. (#64936 )	2020-11-11 16:33:36 +01:00
James Rodewig	1ea83359bb	[DOCS] Fix case for 'Boolean' (#64299 )	2020-10-29 09:04:43 -04:00
Benjamin Trent	c1de07fa83	[ML] adding new flag exclude_generated that removes generated fields in GET config APIs (#63899 ) When exporting and cloning ml configurations in a cluster it can be frustrating to remove all the fields that were generated by the plugin. Especially as the number of these fields change from version to version. This flag, exclude_generated, allows the GET config APIs to return configurations with these generated fields removed. APIs supporting this flag: - GET _ml/anomaly_detection/<job_id> - GET _ml/datafeeds/<datafeed_id> - GET _ml/data_frame/analytics/<analytics_id> The following fields are not returned in the objects: - any field that is not user settable (e.g. version, create_time) - any field that is a calculated default value (e.g. datafeed chunking_config) - any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by) relates to #63055	2020-10-20 11:28:29 -04:00
David Roberts	977a4ad3f9	[ML] Change docs test mute comment (#63866 ) The original comment mentioned issue #48583, but issue #48941 is specifically open for this mute. However, this is inappropriate, as the underlying reason the test cannot be unmuted is the same as for all the other tests skipped with the comment "Kibana sample data": issues #51572, #51576 and #51678. Closes #48941	2020-10-19 10:17:27 +01:00
Lisa Cawley	51f9bf657d	[DOCS] Fix titles for ML APIs (#63152 )	2020-10-02 11:53:49 -07:00
Benjamin Trent	7bd6e78dae	[ML] adding for_export flag for ml plugin GET resource APIs (#63092 ) This adds the new `for_export` flag to the following APIs: - GET _ml/anomaly_detection/<job_id> - GET _ml/datafeeds/<datafeed_id> - GET _ml/data_frame/analytics/<analytics_id> The flag is designed for cloning or exporting configuration objects to later be put into the same cluster or a separate cluster. The following fields are not returned in the objects: - any field that is not user settable (e.g. version, create_time) - any field that is a calculated default value (e.g. datafeed chunking_config) - any field that would effectively require changing to be of use (e.g. datafeed job_id) - any field that is automatically set via another Elastic stack process (e.g. anomaly job custom_settings.created_by) closes https://github.com/elastic/elasticsearch/issues/63055	2020-10-02 08:29:19 -04:00
Benjamin Trent	a653a1cbb8	[ML] allow multiple wildcard values for GET Calendars, Events, and DELETE forecasts (#62563 ) This commit adjusts the following APIs so that they now support not only an `_all` case, but wildcard-patterned IDs as well. - `GET _ml/calendars/<calendar_id>/events` - `GET _ml/calendars/<calendar_id>` - `GET _ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>` - `DELETE _ml/anomaly_detectors/<job_id>/_forecast/<forecast_id>`	2020-09-18 09:39:40 -04:00
David Roberts	6008a74da5	[ML] Include the "properties" layer in find_file_structure mappings (#62158 ) Previously the "mappings" field of the response from the find_file_structure endpoint was not a drop-in for the mappings format of the create index endpoint - the "properties" layer was missing. The reason for omitting it initially was that the assumption was that the find_file_structure endpoint would only ever return very simple mappings without any nested objects. However, this will not be true in the future, as we will improve mappings detection for complex JSON objects. As a first step it makes sense to move the returned mappings closer to the standard format. This is a small building block towards fixing #55616	2020-09-09 16:29:23 +01:00
Lisa Cawley	511babde59	[DOCS] Refresh machine learning custom URL example (#61826 )	2020-09-03 16:53:26 -07:00
Lisa Cawley	f05d8c2b98	[DOCS] Per-partition categorization (#61506 )	2020-08-26 17:07:46 -07:00
lcawl	f56ab039ae	[DOCS] Fix typo in update anomaly detection job API	2020-08-25 17:12:43 -07:00
James Rodewig	6b9b8c5e31	[DOCS] Move script and stored fields content to search fields page (#60826 ) Changes: * Moves `Retrieve selected fields` to its own page and adds a title abbreviation. * Adds existing script and stored fields content to `Retrieve selected fields` * Adds a xref for `Retrieve selected fields` to `Search your data` * Adds related redirects and updates existing xrefs	2020-08-06 12:45:03 -04:00
Przemysław Witek	29ee3a05b6	Deprecate allow_no_jobs and allow_no_datafeeds in favor of allow_no_match (#60601 )	2020-08-05 12:29:07 +02:00
James Rodewig	441c3a21b1	[DOCS] Update my-index examples (#60132 ) Changes the following example index names to `my-index-000001` for consistency: * `my-index` * `my_index` * `myindex`	2020-07-27 14:46:39 -04:00
Lisa Cawley	1781d4a7b9	[DOCS] Fix security links in machine learning APIs (#60098 )	2020-07-23 12:14:56 -07:00
James Rodewig	2774cd6938	[DOCS] Swap `[float]` for `[discrete]` (#60124 ) Changes instances of `[float]` in our docs for `[discrete]`. Asciidoctor prefers the `[discrete]` tag for floating headings: https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks	2020-07-23 11:48:22 -04:00
James Rodewig	80b674fb25	[DOCS] Reformat snippets to use two-space indents (#59973 )	2020-07-21 12:24:26 -04:00
Lisa Cawley	fb0157460f	[DOCS] Changes level offset of anomaly detection pages (#59911 )	2020-07-20 16:33:54 -07:00
Lisa Cawley	823c337e76	[DOCS] Changes level offset for anomaly detection APIs (#59920 )	2020-07-20 12:38:09 -07:00
James Rodewig	2be9db01c8	[DOCS] Replace `datatype` with `data type` (#58972 )	2020-07-07 13:52:10 -04:00
Przemysław Witek	4a43b03855	Report peak model memory in ModelSizeStats (#59017 )	2020-07-06 10:33:54 +02:00
István Zoltán Szabó	3b61ec1fe2	[DOCS] Updates screenshots in ML population analysis (#58318 )	2020-06-23 09:03:31 +02:00
David Kyle	bbeda643a6	Delete expired data by job (#57337 ) Deleting expired data can take a long time leading to timeouts if there are many jobs. Often the problem is due to a few large jobs which prevent the regular maintenance of the remaining jobs. This change adds a job_id parameter to the delete expired data endpoint to help clean up those problematic jobs.	2020-06-05 13:32:35 +01:00
David Roberts	605b4d0ea9	[ML] Add per-partition categorization option (#57683 ) This PR adds the initial Java side changes to enable use of the per-partition categorization functionality added in elastic/ml-cpp#1293. There will be a followup change to complete the work, as there cannot be any end-to-end integration tests until elastic/ml-cpp#1293 is merged, and also elastic/ml-cpp#1293 does not implement some of the more peripheral functionality, like stop_on_warn and per-partition stats documents. The changes so far cover REST APIs, results object formats, HLRC and docs.	2020-06-05 11:56:15 +01:00
István Zoltán Szabó	3a15d84af9	[DOCS] Changes parameter order in model_plot_config. (#57642 )	2020-06-04 10:57:36 +02:00
Przemysław Witek	c4c094c006	Introduce ModelPlotConfig. annotations_enabled setting (#57539 )	2020-06-04 09:27:40 +02:00
Lisa Cawley	0f52cab495	[DOCS] Replaces docdir attributes in ML APIs (#57390 )	2020-06-01 11:46:10 -07:00
Benjamin Trent	ec67787a2e	[ML] add max_model_memory parameter to forecast request (#57254 ) This adds a max_model_memory setting to forecast requests. This setting can take a string value that is formatted according to byte sizes (i.e. "50mb", "150mb"). The default value is `20mb`. There is a HARD limit at `500mb` which will throw an error if used. If the limit is larger than 40% of the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited. related native change: https://github.com/elastic/ml-cpp/pull/1238 closes: https://github.com/elastic/elasticsearch/issues/56420	2020-05-29 08:59:50 -04:00
István Zoltán Szabó	90056edaf4	[DOCS] Improves navigation between forecast APIs and adds short description. (#57035 )	2020-05-25 09:09:47 +02:00
István Zoltán Szabó	69b6041d57	[DOCS] Removes the Jobs section from the ML anomaly detection APIs page. (#57031 )	2020-05-21 17:30:59 +02:00
Benjamin Trent	8fed077b0a	[ML] relax throttling on expired data cleanup (#56711 ) Throttling nightly cleanup as much as we do has been overcautious. Nightly cleanup should be more lenient in its throttling. We still keep the same batch size, but now the requests per second scale with the number of data nodes. If we have more than 5 data nodes, we don't throttle at all. Additionally, the API now has `requests_per_second` and `timeout` set. So users calling the API directly can set the throttling. This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`. This will allow users to adjust throttling of the nightly maintenance.	2020-05-18 07:21:06 -04:00
David Roberts	cbb8b17d74	[DOCS] Docs changes for overridden delimiter in find_file_structure (#56288 ) Docs for #55735 Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-05-14 09:24:07 +01:00
David Roberts	c99021cdcb	[ML] More advanced model snapshot retention options (#56125 ) This PR implements the following changes to make ML model snapshot retention more flexible in advance of adding a UI for the feature in an upcoming release. - The default for `model_snapshot_retention_days` for new jobs is now 10 instead of 1 - There is a new job setting, `daily_model_snapshot_retention_after_days`, that defaults to 1 for new jobs and `model_snapshot_retention_days` for pre-7.8 jobs - For days that are older than `model_snapshot_retention_days`, all model snapshots are deleted as before - For days that are in between `daily_model_snapshot_retention_after_days` and `model_snapshot_retention_days` all but the first model snapshot for that day are deleted - The `retain` setting of model snapshots is still respected to allow selected model snapshots to be retained indefinitely Closes #52150	2020-05-05 12:55:50 +01:00
Lisa Cawley	5ef7aacbf7	[DOCS] Adds documentation for secondary authorization headers (#55365 ) Co-authored-by: Tim Vernum <tim@adjective.org>	2020-04-29 08:28:42 -07:00
István Zoltán Szabó	d70cef3474	[DOCS] Makes the footnotes less verbose in configuring aggs page. (#55857 )	2020-04-29 09:50:41 +02:00


129 commits