elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-30 10:23:41 -04:00

Author	SHA1	Message	Date
James Rodewig	f56a0f4b66	[DOCS] Remove `testenv` annotations from doc snippet tests (#80023) Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible. Relates to #79309, #31619	2021-11-05 18:38:50 -04:00
Benjamin Trent	281ec58b8d	[ML] Add new default char filter `first_line_with_letters` for machine learning categorization (#77457) The char filter replaces the previous default of `first_non_blank_line`. `first_non_blank_line` worked well for finding the first line that had any characters at all, but log lines like the following were handled poorly: ``` -------------------------------------------------------------------------------- Alias 'foo' already exists and this prevents setting up ILM for logs -------------------------------------------------------------------------------- ``` When combined with the `ml_standard` tokenizer, the first line was used: ``` -------------------------------------------------------------------------------- ``` This has no valid tokens for our standard tokenizer. Consequently, no tokens were found by the `ml_standard` tokenizer. The new filter, `first_line_with_letters`, returns the first line with any letter character (i.e. a character for which `Character#isLetter` returns true). Given the previously poorly handled log, when combining with our `ml_standard` tokenizer, we get the following, more appropriate, tokens: ``` "tokens" : ["Alias", "foo", "already", "exists", "and", "this", "prevents", "setting", "up", "ILM", "for", "logs"] ```	2021-09-09 10:09:57 -04:00
David Roberts	0059c59e25	[ML] Make ml_standard tokenizer the default for new categorization jobs (#72805) Categorization jobs created once the entire cluster is upgraded to version 7.14 or higher will default to using the new ml_standard tokenizer rather than the previous default of the ml_classic tokenizer, and will incorporate the new first_non_blank_line char filter so that categorization is based purely on the first non-blank line of each message. The difference between the ml_classic and ml_standard tokenizers is that ml_classic splits on slashes and colons, so it creates multiple tokens from URLs and filesystem paths, whereas ml_standard attempts to keep URLs, email addresses and filesystem paths as single tokens. It is still possible to configure the ml_classic tokenizer if you prefer: just provide a categorization_analyzer within your analysis_config and whichever tokenizer you choose (which could be ml_classic or any other Elasticsearch tokenizer) will be used. To opt out of using first_non_blank_line as a default char filter, you must explicitly specify a categorization_analyzer that does not include it. If no categorization_analyzer is specified but categorization_filters are specified, then the categorization filters are converted to char filters that are applied after first_non_blank_line. Closes elastic/ml-cpp#1724	2021-06-01 15:11:32 +01:00
István Zoltán Szabó	d07c174aaf	[DOCS] Revises required privileges info in Anomaly Detection API docs (#72483)	2021-05-03 10:20:14 +02:00
James Rodewig	693807a6d3	[DOCS] Fix double spaces (#71082)	2021-03-31 09:57:47 -04:00
David Roberts	e4ce39845b	[ML] Add total ML memory to ML info (#65195) This change adds an extra piece of information, limits.total_ml_memory, to the ML info response. This returns the total amount of memory that ML is permitted to use for native processes across all ML nodes in the cluster. Some of this may already be in use; the value returned is total, not available ML memory.	2020-11-18 15:06:21 +00:00
Lisa Cawley	1781d4a7b9	[DOCS] Fix security links in machine learning APIs (#60098)	2020-07-23 12:14:56 -07:00
Lisa Cawley	823c337e76	[DOCS] Changes level offset for anomaly detection APIs (#59920)	2020-07-20 12:38:09 -07:00
David Roberts	c99021cdcb	[ML] More advanced model snapshot retention options (#56125) This PR implements the following changes to make ML model snapshot retention more flexible in advance of adding a UI for the feature in an upcoming release. - The default for `model_snapshot_retention_days` for new jobs is now 10 instead of 1 - There is a new job setting, `daily_model_snapshot_retention_after_days`, that defaults to 1 for new jobs and `model_snapshot_retention_days` for pre-7.8 jobs - For days that are older than `model_snapshot_retention_days`, all model snapshots are deleted as before - For days that are in between `daily_model_snapshot_retention_after_days` and `model_snapshot_retention_days`, all but the first model snapshot for that day are deleted - The `retain` setting of model snapshots is still respected to allow selected model snapshots to be retained indefinitely Closes #52150	2020-05-05 12:55:50 +01:00
David Roberts	d1a9b3a545	[ML] Add effective max model memory limit to ML info (#55529) The ML info endpoint returns the max_model_memory_limit setting if one is configured. However, it is still possible to create a job that cannot run anywhere in the current cluster because no node in the cluster has enough memory to accommodate it. This change adds an extra piece of information, limits.effective_max_model_memory_limit, to the ML info response that returns the biggest model memory limit that could be run in the current cluster assuming no other jobs were running. The idea is that the ML UI will be able to warn users who try to create jobs with higher model memory limits that their jobs will not be able to start unless they add a bigger ML node to their cluster. Relates elastic/kibana#63942	2020-04-22 11:36:58 +01:00
David Roberts	40c951d781	[ML] Add default categorization analyzer definition to ML info (#49545) The categorization job wizard in the ML UI will use this information when showing the effect of the chosen categorization analyzer on a sample of input.	2019-11-25 13:20:12 +00:00
Lisa Cawley	4e4990c6a0	[DOCS] Cleans up links to security content (#47610)	2019-10-04 16:10:26 -07:00
James Rodewig	e43be90e6c	[DOCS] [5 of 5] Change // TESTRESPONSE comments to [source,console-results] (#46449)	2019-09-06 14:05:36 -04:00
James Rodewig	97802d8aff	[DOCS] Change // CONSOLE comments to [source,console] (#46441)	2019-09-06 10:55:16 -04:00
Lisa Cawley	4fd8e34662	[DOCS] Moves content to ML anomaly-detection folder (#44520)	2019-07-17 13:48:12 -07:00

15 commits
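
The behaviour described in commit 281ec58b8d (select the first line that contains any letter, instead of the first non-blank line) can be illustrated with a short sketch. This is not the Elasticsearch implementation: the Python function below is a hypothetical stand-in, `str.isalpha` is used as an approximation of Java's `Character#isLetter`, and returning an empty string when no line contains a letter is an assumption, since the commit message does not say what happens in that case.

```python
def first_line_with_letters(message: str) -> str:
    """Return the first line of `message` that contains at least one letter.

    Hypothetical illustration only. str.isalpha() approximates
    Java's Character#isLetter; the empty-string fallback is an assumption.
    """
    for line in message.splitlines():
        # A line qualifies as soon as any single character is a letter.
        if any(ch.isalpha() for ch in line):
            return line
    return ""


# The log message quoted in the commit: a dashed rule, the text, a dashed rule.
rule = "-" * 80
log = "\n".join([
    rule,
    "Alias 'foo' already exists and this prevents setting up ILM for logs",
    rule,
])

# The dashed first line is skipped because it contains no letters.
print(first_line_with_letters(log))
```

A first-non-blank-line rule would pick the dashed rule here, which is why the `ml_standard` tokenizer found no tokens before the change.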