mirror of https://github.com/elastic/elasticsearch.git, synced 2025-06-28 17:34:17 -04:00
Painless: =~ and ==~ operators
Adds support for the find operator (=~) and the match operator (==~) to
painless's regexes. Also whitelists most of the Matcher class and documents
regex support in painless.

The find operator (=~) returns a boolean that is the result of building a
matcher on the lhs with the Pattern on the rhs and calling `find` on it. Use
it like this:

```
if (ctx._source.last =~ /b/)
```

The match operator (==~) returns a boolean like find, but instead of calling
`find` on the Matcher it calls `matches`.

```
if (ctx._source.last ==~ /[^aeiou].*[aeiou]/)
```

Finally, if you want the actual matcher you do:

```
Matcher m = /[aeiou]/.matcher(ctx._source.last)
```
parent 3c9712794e
commit 8d3ef742db
17 changed files with 947 additions and 700 deletions
@@ -33,6 +33,8 @@ to `painless`.

* Shortcuts for list, map access using the dot `.` operator

* Native support for regular expressions with `/pattern/`, `=~`, and `==~`


[[painless-examples]]
[float]
@@ -199,6 +201,79 @@ POST hockey/player/1/_update
----------------------------------------------------------------
// CONSOLE

[float]
=== Regular expressions

Painless's native support for regular expressions has three syntax constructs:

* `/pattern/`: Pattern literals create patterns. This is the only way to create
  a pattern in painless.
* `=~`: The find operator returns a `boolean`, `true` if a subsequence of the
  text matches, `false` otherwise.
* `==~`: The match operator returns a `boolean`, `true` if the entire text
  matches, `false` if it doesn't.

Using the find operator (`=~`) you can update all hockey players with "b" in
their last name:

[source,js]
----------------------------------------------------------------
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "if (ctx._source.last =~ /b/) {ctx._source.last += \"matched\"} else {ctx.op = 'noop'}"
  }
}
----------------------------------------------------------------
// CONSOLE

Using the match operator (`==~`) you can update all the hockey players whose
last names start with a consonant and end with a vowel:

[source,js]
----------------------------------------------------------------
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "if (ctx._source.last ==~ /[^aeiou].*[aeiou]/) {ctx._source.last += \"matched\"} else {ctx.op = 'noop'}"
  }
}
----------------------------------------------------------------
// CONSOLE
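
The difference between the two operators is easiest to see in plain Java, since the commit message describes `=~` as calling `find` and `==~` as calling `matches` on a `Matcher`. The following is a minimal sketch under that description, not the Painless implementation; the class name `FindVsMatch`, its helper methods, and the sample names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper class illustrating find vs. matches semantics.
final class FindVsMatch {
    // Same pattern as the ==~ example above.
    static final String CONSONANT_TO_VOWEL = "[^aeiou].*[aeiou]";

    // Rough Java equivalent of `text =~ /regex/`:
    // true if any subsequence of the text matches.
    static boolean find(String text, String regex) {
        return Pattern.compile(regex).matcher(text).find();
    }

    // Rough Java equivalent of `text ==~ /regex/`:
    // true only if the entire text matches.
    static boolean match(String text, String regex) {
        return Pattern.compile(regex).matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(find("Colborne", "b"));                 // true: contains "b"
        System.out.println(find("Monahan", CONSONANT_TO_VOWEL));   // true: "Mo" is enough
        System.out.println(match("Monahan", CONSONANT_TO_VOWEL));  // false: ends with "n"
        System.out.println(match("Colborne", CONSONANT_TO_VOWEL)); // true
        // The character classes are case sensitive, so an uppercase "A"
        // counts as "not a lowercase vowel".
        System.out.println(match("Aho", CONSONANT_TO_VOWEL));      // true
    }
}
```

Note the last case: because `[^aeiou]` only excludes lowercase vowels, a capitalized name that starts with a vowel still satisfies the first character class.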

Or you can use `Pattern.matcher` directly to get a `Matcher` instance and
remove all of the vowels in all of their last names:

[source,js]
----------------------------------------------------------------
POST hockey/player/_update_by_query
{
  "script": {
    "lang": "painless",
    "inline": "ctx._source.last = /[aeiou]/.matcher(ctx._source.last).replaceAll('')"
  }
}
----------------------------------------------------------------
// CONSOLE
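
For readers more familiar with Java, the script above corresponds roughly to the following sketch, which uses the standard `Pattern.matcher` and `Matcher.replaceAll` calls. The class name `RemoveVowels`, the method name, and the sample names are hypothetical and only serve the illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper mirroring the Painless script
// /[aeiou]/.matcher(ctx._source.last).replaceAll('')
final class RemoveVowels {
    private static final Pattern VOWELS = Pattern.compile("[aeiou]");

    static String removeVowels(String last) {
        Matcher m = VOWELS.matcher(last);
        // Replace every match with the empty string.
        return m.replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeVowels("Monahan")); // Mnhn
        // Only lowercase vowels are in the character class,
        // so the leading "A" survives.
        System.out.println(removeVowels("Aho"));     // Ah
    }
}
```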


Note: all of the `_update_by_query` examples above could really do with a
`query` to limit the data that they pull back. While you *could* use a
<<query-dsl-script-query>> it wouldn't be as efficient as using any other query
because script queries aren't able to use the inverted index to limit the
documents that they have to check.

The pattern syntax is just
http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html[Java regular expressions].
We intentionally don't allow scripts to call `Pattern.compile` to get a new
pattern on the fly because building a `Pattern` is (comparatively) slow.
Pattern literals (`/apattern/`) have fancy constant extraction so no matter
where they show up in the painless script they are built only when the script
is first used. It is fairly similar to how `String` literals work in Java.
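
As an analogy only, the effect of constant extraction resembles hoisting a compiled `Pattern` into a `static final` field in Java instead of compiling it on every call. The sketch below is not how Painless implements pattern literals; the class `HoistedPattern` and its methods are hypothetical and exist only to contrast the two styles.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of "compile once" versus "compile per call".
final class HoistedPattern {
    // Compiled a single time, when the class is initialized. This is the
    // behavior the paragraph above attributes to pattern literals.
    private static final Pattern HAS_B = Pattern.compile("b");

    static boolean hasBHoisted(String s) {
        return HAS_B.matcher(s).find();
    }

    // Compiles the pattern again on every call. Scripts could end up doing
    // this if they were allowed to call Pattern.compile themselves.
    static boolean hasBRecompiled(String s) {
        return Pattern.compile("b").matcher(s).find();
    }

    public static void main(String[] args) {
        // Both variants give the same answer; only the compile cost differs.
        System.out.println(hasBHoisted("Colborne"));    // true
        System.out.println(hasBRecompiled("Colborne")); // true
    }
}
```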


[[painless-api]]
[float]
== Painless API