elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-07-14 17:53:40 -04:00

Author	SHA1	Message	Date
István Zoltán Szabó	3b61ec1fe2	[DOCS] Updates screenshots in ML population analysis (#58318 )	2020-06-23 09:03:31 +02:00
David Kyle	bbeda643a6	Delete expired data by job (#57337 ) Deleting expired data can take a long time leading to timeouts if there are many jobs. Often the problem is due to a few large jobs which prevent the regular maintenance of the remaining jobs. This change adds a job_id parameter to the delete expired data endpoint to help clean up those problematic jobs.	2020-06-05 13:32:35 +01:00
David Roberts	605b4d0ea9	[ML] Add per-partition categorization option (#57683 ) This PR adds the initial Java side changes to enable use of the per-partition categorization functionality added in elastic/ml-cpp#1293. There will be a followup change to complete the work, as there cannot be any end-to-end integration tests until elastic/ml-cpp#1293 is merged, and also elastic/ml-cpp#1293 does not implement some of the more peripheral functionality, like stop_on_warn and per-partition stats documents. The changes so far cover REST APIs, results object formats, HLRC and docs.	2020-06-05 11:56:15 +01:00
István Zoltán Szabó	3a15d84af9	[DOCS] Changes parameter order in model_plot_config. (#57642 )	2020-06-04 10:57:36 +02:00
Przemysław Witek	c4c094c006	Introduce ModelPlotConfig.annotations_enabled setting (#57539 )	2020-06-04 09:27:40 +02:00
Lisa Cawley	0f52cab495	[DOCS] Replaces docdir attributes in ML APIs (#57390 )	2020-06-01 11:46:10 -07:00
Benjamin Trent	ec67787a2e	[ML] add max_model_memory parameter to forecast request (#57254 ) This adds a max_model_memory setting to forecast requests. This setting can take a string value that is formatted according to byte sizes (e.g. "50mb", "150mb"). The default value is `20mb`. There is a HARD limit at `500mb` which will throw an error if used. If the limit is larger than 40% of the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited. related native change: https://github.com/elastic/ml-cpp/pull/1238 closes: https://github.com/elastic/elasticsearch/issues/56420	2020-05-29 08:59:50 -04:00
István Zoltán Szabó	90056edaf4	[DOCS] Improves navigation between forecast APIs and adds short description. (#57035 )	2020-05-25 09:09:47 +02:00
István Zoltán Szabó	69b6041d57	[DOCS] Removes the Jobs section from the ML anomaly detection APIs page. (#57031 )	2020-05-21 17:30:59 +02:00
Benjamin Trent	8fed077b0a	[ML] relax throttling on expired data cleanup (#56711 ) Throttling nightly cleanup as much as we do has been overly cautious. Nightly cleanup should be more lenient in its throttling. We still keep the same batch size, but now the requests per second scale with the number of data nodes. If we have more than 5 data nodes, we don't throttle at all. Additionally, the API now has `requests_per_second` and `timeout` set. So users calling the API directly can set the throttling. This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`. This will allow users to adjust throttling of the nightly maintenance.	2020-05-18 07:21:06 -04:00
David Roberts	cbb8b17d74	[DOCS] Docs changes for overridden delimiter in find_file_structure (#56288 ) Docs for #55735 Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-05-14 09:24:07 +01:00
David Roberts	c99021cdcb	[ML] More advanced model snapshot retention options (#56125 ) This PR implements the following changes to make ML model snapshot retention more flexible in advance of adding a UI for the feature in an upcoming release. - The default for `model_snapshot_retention_days` for new jobs is now 10 instead of 1 - There is a new job setting, `daily_model_snapshot_retention_after_days`, that defaults to 1 for new jobs and `model_snapshot_retention_days` for pre-7.8 jobs - For days that are older than `model_snapshot_retention_days`, all model snapshots are deleted as before - For days that are in between `daily_model_snapshot_retention_after_days` and `model_snapshot_retention_days` all but the first model snapshot for that day are deleted - The `retain` setting of model snapshots is still respected to allow selected model snapshots to be retained indefinitely Closes #52150	2020-05-05 12:55:50 +01:00
Lisa Cawley	5ef7aacbf7	[DOCS] Adds documentation for secondary authorization headers (#55365 ) Co-authored-by: Tim Vernum <tim@adjective.org>	2020-04-29 08:28:42 -07:00
István Zoltán Szabó	d70cef3474	[DOCS] Makes the footnotes less verbose in configuring aggs page. (#55857 )	2020-04-29 09:50:41 +02:00
David Roberts	dcb6ed03cd	[ML] Adding failed_category_count to model_size_stats (#55716 ) The failed_category_count statistic records the number of times categorization wanted to create a new category but couldn't because the job had reached its model_memory_limit. Relates elastic/ml-cpp#1130	2020-04-25 08:01:21 +01:00
Lisa Cawley	7fafec0f8f	[DOCS] Update example and nesting in get data frame analytics job stats API (#55191 ) Co-Authored-By: Valeriy Khakhutskyy <1292899+valeriy42@users.noreply.github.com>	2020-04-22 08:07:31 -07:00
David Roberts	d1a9b3a545	[ML] Add effective max model memory limit to ML info (#55529 ) The ML info endpoint returns the max_model_memory_limit setting if one is configured. However, it is still possible to create a job that cannot run anywhere in the current cluster because no node in the cluster has enough memory to accommodate it. This change adds an extra piece of information, limits.effective_max_model_memory_limit, to the ML info response that returns the biggest model memory limit that could be run in the current cluster assuming no other jobs were running. The idea is that the ML UI will be able to warn users who try to create jobs with higher model memory limits that their jobs will not be able to start unless they add a bigger ML node to their cluster. Relates elastic/kibana#63942	2020-04-22 11:36:58 +01:00
David Roberts	8906e76079	[ML] Return assigned node in start/open job/datafeed response (#55473 ) Adds a "node" field to the response from the following endpoints: 1. Open anomaly detection job 2. Start datafeed 3. Start data frame analytics job If the job or datafeed is assigned to a node immediately then this field will return the ID of that node. In the case where a job or datafeed is opened or started lazily the node field will contain an empty string. Clients that want to test whether a job or datafeed was opened or started lazily can therefore check for this. Fixes #54067	2020-04-22 08:44:57 +01:00
István Zoltán Szabó	f8bfab2dab	[DOCS] Provides further details on aggregations in datafeeds (#55462 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-04-22 08:53:34 +02:00
István Zoltán Szabó	bb44726ad6	[DOCS] Reworks some parts of EMM API docs (#54872 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-04-08 09:50:12 +02:00
Benjamin Trent	1d24960ff8	[ML] prefer secondary authorization header for data[feed|frame] authz (#54121 ) Secondary authorization headers are to be used to facilitate Kibana spaces support + ML jobs/datafeeds. Now on PUT/Update/Preview datafeed, and PUT data frame analytics the secondary authorization is preferred over the primary (if provided). closes https://github.com/elastic/elasticsearch/issues/53801	2020-04-02 10:10:46 -04:00
Benjamin Trent	bbd6e943de	[ML] add num_matches and preferred_to_categories to category definition objects (#54214 ) This adds two new fields to category definitions. - `num_matches` indicating how many documents have been seen by this category - `preferred_to_categories` indicating which other categories this particular category supersedes when messages are categorized. These fields are only guaranteed to be up to date after a `_flush` or `_close`. Native change: https://github.com/elastic/ml-cpp/pull/1062	2020-04-02 07:49:09 -04:00
István Zoltán Szabó	b0f6d4ee0e	[DOCS] Updates estimate model memory docs (#54574 )	2020-04-01 15:53:53 +02:00
Jason Tedor	95a7eed9aa	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 15:52:01 -04:00
Lisa Cawley	fdcd19483d	[DOCS] Collapses content in machine learning APIs (#54234 )	2020-03-30 10:08:38 -07:00
David Roberts	8ee770560a	[ML] Add a model memory estimation endpoint for anomaly detection (#53507 ) A new endpoint for estimating anomaly detection job model memory requirements: POST _ml/anomaly_detectors/estimate_model_memory Closes #53219	2020-03-24 21:38:19 +00:00
István Zoltán Szabó	8279f82dea	[DOCS] Fixes typo in start datafeed API docs. (#53811 )	2020-03-19 17:55:26 +01:00
István Zoltán Szabó	57321124ea	[DOCS] Changes seconds to milliseconds since the Epoch in AD docs. (#53797 )	2020-03-19 15:40:53 +01:00
István Zoltán Szabó	54b66d3385	[DOCS] Makes the description clearer on how to use aggregations in an anomaly detection job (#53103 ) Co-authored-by: lcawl <lcawley@elastic.co>	2020-03-09 09:48:23 +01:00
István Zoltán Szabó	08fcc0b02f	[DOCS] Adds deleting flag to the GET job stats API docs (#53223 )	2020-03-06 16:03:09 +01:00
Lisa Cawley	b6534834f9	[DOCS] Adds cat anomaly detectors API (#52866 )	2020-02-28 12:15:21 -08:00
Benjamin Trent	d7a63333b5	[ML] Add indices_options to datafeed config and update (#52793 ) This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index. This is necessary for the following use cases: - Reading from frozen indices - Allowing certain indices in multiple index patterns to not exist yet These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object. closes https://github.com/elastic/elasticsearch/issues/48056	2020-02-27 12:22:35 -05:00
Lisa Cawley	42fbca7dc6	[DOCS] Adds cat datafeeds API (#52738 )	2020-02-26 09:20:36 -08:00
Lisa Cawley	cd069a861c	[DOCS] Updates custom rules example (#52731 )	2020-02-25 09:30:14 -08:00
David Roberts	ca80ad69f2	[ML] Use event.timezone in file_structure_finder ingest pipeline (#52720 ) This is because beat.timezone was renamed to event.timezone in elastic/beats#9458	2020-02-25 12:18:53 +00:00
lcawl	b590b49205	[DOCS] Adds anchor for custom rules	2020-02-24 10:04:34 -08:00
David Roberts	72346b91f9	[ML] Add new categorization stats to model_size_stats (#51879 ) This change adds support for the following new model_size_stats fields: - categorized_doc_count - total_category_count - frequent_category_count - rare_category_count - dead_category_count - categorization_status Relates #50749	2020-02-06 17:08:43 +00:00
Darren LaCasse	ea67e24b7b	[DOCS] Remove extra word (#51757 )	2020-01-31 10:27:37 -08:00
Lisa Cawley	32adcd2c9d	[DOCS] Adds missing testenv attribute (#51719 )	2020-01-30 16:13:26 -08:00
David Roberts	a5a2e4eaee	[ML] Use CSV ingest processor in find_file_structure ingest pipeline (#51492 ) Changes the find_file_structure response to include a CSV ingest processor in the ingest pipeline it suggests. Previously the Kibana file upload functionality parsed CSV in the browser, but by parsing CSV in the ingest pipeline it makes the Kibana file upload functionality more easily interchangeable with Filebeat such that the configurations it creates can more easily be used to import data with the same structure repeatedly in production.	2020-01-28 12:46:00 +00:00
Lisa Cawley	789aeaedab	[DOCS] Updates categorization examples with wizard screenshots (#51133 )	2020-01-22 11:26:10 -08:00
Lisa Cawley	551a83a2ff	[DOCS] Clarify interval, frequency, and bucket span in ML APIs and example (#51280 )	2020-01-22 08:08:31 -08:00
István Zoltán Szabó	087a048ee6	[DOCS] Adds text about data types to the categorization docs (#51145 )	2020-01-17 09:52:57 -08:00
István Zoltán Szabó	406810c172	[DOCS] Describes the relationship of the time-related settings in anomaly detection docs (#50959 ) Co-Authored-By: David Roberts <dave.roberts@elastic.co>	2020-01-15 08:45:03 +01:00
Lisa Cawley	979a28d2b5	[DOCS] Clarify detector_index property in ML APIs (#50723 )	2020-01-09 08:12:53 -08:00
István Zoltán Szabó	659b4ceb97	[DOCS] Improves find_file_structure documentation (#50743 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-01-09 11:19:19 +01:00
István Zoltán Szabó	2f55c3566f	[DOCS] Clarifies model_size_stats.total_xxx_field_count objects and removes notes in GET job stats API docs. (#50728 )	2020-01-09 09:43:55 +01:00
István Zoltán Szabó	d5fcb73b1f	[DOCS] Improves description for forecast_stats (#50729 ) Co-Authored-By: Lisa Cawley <lcawley@elastic.co>	2020-01-09 09:31:30 +01:00
Lisa Cawley	b13a755842	[DOCS] Adds missing timing_stats descriptions (#50574 )	2020-01-03 09:07:08 -08:00
Lisa Cawley	dd4ede5c56	[DOCS] Adds filter and calendar attributes (#50566 )	2020-01-02 10:59:54 -08:00

93 commits
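
The snapshot retention rules described in commit c99021cdcb (#56125) can be sketched roughly as follows. This is only an illustration of the rules as worded in that commit message, not the Elasticsearch implementation. The `Snapshot` class, the `snapshots_to_delete` function and the `age_days` input are hypothetical names introduced here. The sketch also assumes that "older than" and "in between" are strict comparisons on whole days, that the "first" snapshot of a day is the one with the earliest timestamp, and that snapshots with `retain` set are skipped entirely and do not count as a day's first snapshot.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a model snapshot; not an Elasticsearch class."""
    snapshot_id: str
    age_days: int        # whole days since the snapshot was taken (assumed input)
    timestamp_ms: int    # creation time in milliseconds since the Epoch
    retain: bool = False


def snapshots_to_delete(
    snapshots: Iterable[Snapshot],
    retention_days: int = 10,    # model_snapshot_retention_days
    daily_after_days: int = 1,   # daily_model_snapshot_retention_after_days
) -> List[str]:
    """Return ids of snapshots the rules would delete, oldest first."""
    doomed: List[str] = []
    days_with_survivor: Set[int] = set()
    # Visit oldest first so the earliest snapshot of each day is the survivor.
    for snap in sorted(snapshots, key=lambda s: s.timestamp_ms):
        if snap.retain:
            continue  # retained snapshots are kept indefinitely
        if snap.age_days > retention_days:
            doomed.append(snap.snapshot_id)  # past full retention: delete all
        elif snap.age_days > daily_after_days:
            if snap.age_days in days_with_survivor:
                doomed.append(snap.snapshot_id)  # not the first of its day
            else:
                days_with_survivor.add(snap.age_days)
    return doomed


if __name__ == "__main__":
    example = [
        Snapshot("a", age_days=12, timestamp_ms=100),
        Snapshot("b", age_days=12, timestamp_ms=110, retain=True),
        Snapshot("c", age_days=5, timestamp_ms=500),
        Snapshot("d", age_days=5, timestamp_ms=510),
        Snapshot("e", age_days=0, timestamp_ms=900),
    ]
    print(snapshots_to_delete(example))
```

Setting `daily_after_days` equal to `retention_days` mirrors the default the commit message gives for pre-7.8 jobs: the thinning window is then empty, so only snapshots past full retention are removed.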