elasticsearch/docs/reference/ml/anomaly-detection/apis
Benjamin Trent 281ec58b8d
[ML] add new default char filter first_line_with_letters for machine learning categorization (#77457)
This char filter replaces the previous default, `first_non_blank_line`.

`first_non_blank_line` reliably found the first line containing any characters at all,
but it handled log lines like the following poorly:
```
--------------------------------------------------------------------------------

Alias 'foo' already exists and this prevents setting up ILM for logs

--------------------------------------------------------------------------------
```
The filter selected the first non-blank line and passed it to the `ml_standard` tokenizer:
```
--------------------------------------------------------------------------------
```
This line contains no valid tokens, so the `ml_standard` tokenizer produced none.


The new filter, `first_line_with_letters`, returns the first line that contains at least one letter character (i.e. a character for which `Character#isLetter` returns true).
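
The difference between the two filters can be illustrated with a minimal Python sketch. This is not the Elasticsearch implementation: the function names are hypothetical, `str.isalpha` stands in for Java's `Character#isLetter`, and returning an empty string when no line qualifies is an assumption.

```python
def first_non_blank_line(text: str) -> str:
    """Hypothetical stand-in for the old default: first line with any non-whitespace."""
    for line in text.splitlines():
        if line.strip():
            return line
    return ""  # assumption: nothing qualifies, so return empty


def first_line_with_letters(text: str) -> str:
    """Hypothetical stand-in for the new default: first line with at least one letter."""
    for line in text.splitlines():
        # str.isalpha approximates Character#isLetter here (an assumption).
        if any(ch.isalpha() for ch in line):
            return line
    return ""  # assumption: nothing qualifies, so return empty


separator = "-" * 80
message = "Alias 'foo' already exists and this prevents setting up ILM for logs"
log = f"{separator}\n\n{message}\n\n{separator}"

print(first_non_blank_line(log) == separator)   # old behaviour picks the dashes
print(first_line_with_letters(log) == message)  # new behaviour picks the message
```

In this sketch the old filter returns the separator line, which has nothing to tokenize, while the new one skips it and returns the line carrying the actual message.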

For the log above, combining the new filter with the `ml_standard` tokenizer yields more appropriate tokens:

```
"tokens" : ["Alias", "foo", "already", "exists", "and", "this", "prevents", "setting", "up", "ILM", "for", "logs"]
```
2021-09-09 10:09:57 -04:00
close-job.asciidoc [DOCS] Update datafeed details in ML docs (#76854) 2021-08-25 11:35:21 -07:00
delete-calendar-event.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
delete-calendar-job.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
delete-calendar.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
delete-datafeed.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
delete-expired-data.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
delete-filter.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
delete-forecast.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
delete-job.asciidoc [DOCS] Update datafeed details in ML docs (#76854) 2021-08-25 11:35:21 -07:00
delete-snapshot.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
estimate-model-memory.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
flush-job.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
forecast.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
get-bucket.asciidoc [DOCS] Adds defaults to get ML results APIs (#73540) 2021-06-03 10:05:47 -07:00
get-calendar-event.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
get-calendar.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
get-category.asciidoc [DOCS] Adds defaults to get ML results APIs (#73540) 2021-06-03 10:05:47 -07:00
get-datafeed-stats.asciidoc [ML] adding running_state to datafeed stats object (#73926) 2021-06-10 08:08:49 -04:00
get-datafeed.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
get-filter.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
get-influencer.asciidoc [DOCS] Adds defaults to get ML results APIs (#73540) 2021-06-03 10:05:47 -07:00
get-job-stats.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
get-job.asciidoc [DOCS] Fixes nesting of datafeed config in APIs (#75502) 2021-07-20 11:27:15 -07:00
get-ml-info.asciidoc [ML] add new default char filter first_line_with_letters for machine learning categorization (#77457) 2021-09-09 10:09:57 -04:00
get-overall-buckets.asciidoc [DOCS] Adds defaults to get ML results APIs (#73540) 2021-06-03 10:05:47 -07:00
get-record.asciidoc [DOCS] Adds defaults to get ML results APIs (#73540) 2021-06-03 10:05:47 -07:00
get-snapshot.asciidoc [DOCS] Adds peak_model_bytes and assignment_memory_basis to GET model snapshot API docs (#75413) 2021-07-16 17:12:47 +02:00
index.asciidoc [ML] Reset anomaly detection job API (#73908) 2021-06-14 18:56:28 +03:00
ml-apis.asciidoc [ML] Reset anomaly detection job API (#73908) 2021-06-14 18:56:28 +03:00
open-job.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
post-calendar-event.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
post-data.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
preview-datafeed.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
put-calendar-job.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
put-calendar.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
put-datafeed.asciidoc [DOCS] Update datafeed details in ML docs (#76854) 2021-08-25 11:35:21 -07:00
put-filter.asciidoc [DOCS] Fixes links to machine learning concepts (#75194) 2021-07-09 13:09:03 -07:00
put-job.asciidoc [DOCS] Update datafeed details in ML docs (#76854) 2021-08-25 11:35:21 -07:00
reset-job.asciidoc [ML] Reset anomaly detection job API (#73908) 2021-06-14 18:56:28 +03:00
revert-snapshot.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
set-upgrade-mode.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
start-datafeed.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
stop-datafeed.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
update-datafeed.asciidoc [DOCS] Fixes nesting of datafeed config in APIs (#75502) 2021-07-20 11:27:15 -07:00
update-filter.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
update-job.asciidoc [ML] Use results retention time for deleting system annotations (#76096) 2021-08-04 17:42:31 +01:00
update-snapshot.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
upgrade-job-model-snapshot.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
validate-detector.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00
validate-job.asciidoc [DOCS] Revises required privileges info in Anomaly Detection API docs (#72483) 2021-05-03 10:20:14 +02:00