elasticsearch/docs/reference/ml/common/apis
Ed Savage fd20027751
[ML] Performance improvements for categorization jobs (#89824)
Categorization of strings that break down into a huge number of tokens can cause the C++ backend process to choke; see elastic/ml-cpp#2403.

This PR adds a limit filter to the default categorization analyzer which caps the number of tokens passed to the backend at 100.
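
As a rough illustration of the cap described above, the sketch below shows what an analyzer definition with a `limit` token filter might look like, and simulates its effect on an over-long message. The `ml_standard` tokenizer entry, the dictionary layout, and the helper name `limit_tokens` are assumptions made for this example; it is not the actual change in the PR, and the whitespace split only stands in for real analysis.

```python
# Hypothetical illustration only; not the actual Elasticsearch or ml-cpp code.
MAX_TOKEN_COUNT = 100  # the cap stated in the commit message

# Assumed shape of an analyzer definition that uses the built-in
# "limit" token filter to cap how many tokens are emitted.
categorization_analyzer = {
    "tokenizer": "ml_standard",  # assumption: tokenizer used for categorization
    "filter": [
        {"type": "limit", "max_token_count": MAX_TOKEN_COUNT},
    ],
}


def limit_tokens(tokens, max_token_count=MAX_TOKEN_COUNT):
    """Keep only the first max_token_count tokens, as a limit filter would."""
    return tokens[:max_token_count]


# A message that breaks down into far more tokens than the cap allows.
message = " ".join(f"tok{i}" for i in range(250))
tokens = message.split()  # naive whitespace tokenization, for illustration
capped = limit_tokens(tokens)

print(len(tokens), len(capped))  # 250 100
```

Truncating after tokenization keeps the leading tokens of each message, which bounds the work handed to the backend regardless of message length.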

Unfortunately, this is not a panacea for all the issues surrounding categorization of large or many-tokened messages, because verification checks on the frontend can also fail when calls to the datafeed _preview API return an excessive amount of data.
2022-09-08 18:41:01 +01:00
get-ml-info.asciidoc [ML] Performance improvements for categorization jobs (#89824) 2022-09-08 18:41:01 +01:00
get-ml-memory.asciidoc [ML] Add ML memory stats API (#83802) 2022-02-17 09:19:14 +00:00
index.asciidoc [ML] Add ML memory stats API (#83802) 2022-02-17 09:19:14 +00:00
ml-apis.asciidoc [ML] Add ML memory stats API (#83802) 2022-02-17 09:19:14 +00:00
set-upgrade-mode.asciidoc [DOCS] Move ML info and upgrade APIs (#84005) 2022-02-16 11:23:00 -08:00