elasticsearch/docs/reference/aggregations
David Roberts bfccd20155
[ML] Add a regex to the output of the categorize_text aggregation (#90723)
The new `regex` field in `categorize_text` output is created in
the same way as the `regex` field that appears in the category
definitions created by anomaly detection jobs that do categorization.

It consists of the terms that occur in the same order for every
message that matches the category, separated with a `.+?` wildcard.
It therefore matches every message in the category, and it only
matches text in which those terms appear in that same order.
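
As a rough illustration of the construction described above, the following Python sketch joins a list of ordered terms with the `.+?` wildcard. The helper name `build_category_regex` and the sample terms are hypothetical, and this is not the Elasticsearch implementation; the real `regex` field may also carry leading and trailing wildcards.

```python
import re


def build_category_regex(ordered_terms):
    """Join terms with a lazy `.+?` wildcard, preserving their order.

    Hypothetical helper for illustration only. Each term is escaped so
    that regex metacharacters inside a term are matched literally.
    """
    return ".+?".join(re.escape(term) for term in ordered_terms)


# Sample terms are made up for this example.
pattern = build_category_regex(["Node", "shutting", "down"])

# A message containing the terms in order matches.
in_order = re.search(pattern, "Node 3 is shutting itself down now")

# The same terms in a different order do not match.
out_of_order = re.search(pattern, "down shutting Node")
```

Because the wildcard is `.+?` and not `.*?`, at least one character must separate consecutive terms.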

Using the regex as the primary mechanism for finding the original
documents that were categorized is not recommended, because searching
with a regular expression is very slow. Instead, use the terms of the
category to search for matching documents: a terms search can use
the inverted index and is therefore much faster.
However, there may be situations where it is useful to use the
`regex` field to test whether a small set of messages that have not
been indexed match the category.
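
A minimal sketch of that last use case, in Python: filtering a handful of in-memory messages against a category regex. The regex value, the messages and the helper `matching_messages` are all assumptions made for illustration. Elasticsearch produces Java-style regular expressions, so a pattern copied from real output may need checking before use with Python's `re` module.

```python
import re


def matching_messages(category_regex, messages):
    """Return the messages that match the category regex.

    Hypothetical helper: suitable for a small in-memory list only,
    not as a replacement for an indexed terms search.
    """
    compiled = re.compile(category_regex, re.DOTALL)
    return [message for message in messages if compiled.search(message)]


# Assumed example value of a category's `regex` field.
category_regex = ".*?Node.+?shutting.+?down.*"

# Messages that have not been indexed (made up for this example).
candidates = [
    "Node 5 shutting down",
    "Node 5 started",
    "shutting down Node 5",
]

matched = matching_messages(category_regex, candidates)
```

Only the first candidate contains all three terms in the required order, so it is the only one returned.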
2022-10-10 11:41:16 +01:00
bucket [ML] Add a regex to the output of the categorize_text aggregation (#90723) 2022-10-10 11:41:16 +01:00
metrics Centroid aggregation for cartesian points and shapes (#89216) 2022-09-28 17:14:30 +02:00
pipeline REST tests for normalize agg (#89629) 2022-08-26 14:18:46 -04:00
bucket.asciidoc [DOCS] Adds frequent items agg docs (#86037) 2022-05-05 16:07:24 +02:00
metrics.asciidoc Centroid aggregation for cartesian points and shapes (#89216) 2022-09-28 17:14:30 +02:00
pipeline.asciidoc Allow bucket paths to specify _count within a bucket (#85720) 2022-04-29 08:42:46 -04:00