Categorization of strings that break down into a huge number of tokens can cause the C++ backend process to choke; see elastic/ml-cpp#2403. This PR adds a `limit` token filter to the default categorization analyzer, capping the number of tokens passed to the backend at 100. Unfortunately, this is not a panacea for all the issues around categorizing large, many-token messages: verification checks on the frontend can still fail because calls to the datafeed `_preview` API return an excessive amount of data.
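For context, the `limit` token filter is a standard Elasticsearch filter whose `max_token_count` parameter discards tokens beyond the configured count. A user-supplied `categorization_analyzer` can apply the same cap explicitly. Below is a minimal sketch of a job configuration doing so; the job name, field names, and choice of `ml_standard` tokenizer are illustrative, not the exact default analyzer definition this PR modifies:

```json
PUT _ml/anomaly_detectors/example-categorization-job
{
  "analysis_config": {
    "bucket_span": "15m",
    "detectors": [
      { "function": "count", "by_field_name": "mlcategory" }
    ],
    "categorization_field_name": "message",
    "categorization_analyzer": {
      "tokenizer": "ml_standard",
      "filter": [
        { "type": "limit", "max_token_count": 100 }
      ]
    }
  },
  "data_description": { "time_field": "@timestamp" }
}
```

Note that `limit` keeps the first `max_token_count` tokens and silently drops the rest, so very long messages are categorized on their leading tokens only.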
Files in this directory:

- get-ml-info.asciidoc
- get-ml-memory.asciidoc
- index.asciidoc
- ml-apis.asciidoc
- set-upgrade-mode.asciidoc