Mirror of https://github.com/elastic/elasticsearch.git, synced 2025-06-28 09:28:55 -04:00
Provide access to new settings for HyphenationCompoundWordTokenFilter (#115585)
Allow the new flags added in Lucene in the HyphenationCompoundWordTokenFilter.
Adds access to the two new flags no_sub_matches and no_overlapping_matches.
Lucene issue: https://github.com/apache/lucene/issues/9231
parent 99689281e0
commit c804953105
7 changed files with 1295 additions and 11 deletions
@@ -111,6 +111,18 @@ output. Defaults to `5`.
(Optional, Boolean)
If `true`, only include the longest matching subword. Defaults to `false`.

`no_sub_matches`::
(Optional, Boolean)
If `true`, do not match sub tokens in tokens that are in the word list.
Defaults to `false`.

`no_overlapping_matches`::
(Optional, Boolean)
If `true`, do not allow overlapping tokens.
Defaults to `false`.

Typically, users will want to enable only one of the three flags: `no_overlapping_matches` is the most restrictive, and `no_sub_matches` is more restrictive than `only_longest_match`. When a more restrictive option is enabled, the state of the less restrictive options has no effect.
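
The precedence described here can be sketched in a few lines of Python. This is an illustration only, not code from the commit: the helper `effective_flag` is hypothetical, and the filter definition uses a placeholder patterns path and word list. It assumes the ordering stated in the text, with `only_longest_match` least restrictive and `no_overlapping_matches` most restrictive.

```python
# Flags ordered from least to most restrictive, as stated in the text.
RESTRICTIVENESS = [
    "only_longest_match",
    "no_sub_matches",
    "no_overlapping_matches",
]


def effective_flag(filter_settings):
    """Return the most restrictive flag set to True, or None if none is set.

    Hypothetical helper: it only mirrors the documented precedence and is
    not part of Elasticsearch or Lucene.
    """
    enabled = [
        flag for flag in RESTRICTIVENESS
        if filter_settings.get(flag, False) is True
    ]
    return enabled[-1] if enabled else None


# Illustrative filter definition; the path and word list are placeholders.
decompounder = {
    "type": "hyphenation_decompounder",
    "hyphenation_patterns_path": "analysis/hyphenation_patterns.xml",
    "word_list": ["kaffee", "zucker", "tasse"],
    "only_longest_match": True,
    "no_sub_matches": True,
}

# Both flags are enabled, but only the more restrictive one matters.
print(effective_flag(decompounder))
```

Under these assumptions, setting `only_longest_match` alongside `no_sub_matches` is redundant, which is why enabling a single flag is usually enough.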

[[analysis-hyp-decomp-tokenfilter-customize]]
==== Customize and add to an analyzer