* Take match_phrase out of snapshot and make tech preview
* Update docs/changelog/128925.yaml
* PR feedback
* Adding regenerated test data
* Update docs/changelog/128925.yaml
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
* [CI] Auto commit changes from spotless
* Checkstyle
* Correct docs
* Hopefully fix docs build
* Found one more bad docs link - here's hoping this now fixes the doc build
* OMG bitten by - vs _
---------
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Aurélien FOUCRET <aurelien.foucret@gmail.com>
* Initial commit of match_phrase
* Add MatchPhraseQueryTests
* First pass at CSV specs
* Update docs/changelog/127661.yaml
* Refactor so MatchPhrase doesn't use all fulltext test cases, just text only
* Fix tests
* Add some CSV test cases
* Fix test
* Update changelog
* Update tests
* Comment out MATCH_PHRASE in search-functions Markdown
* Minor PR feedback
* PR feedback - refactor/consolidate code
* Add some more tests
* Fix some tests
* [CI] Auto commit changes from spotless
* Fix tests
* PR feedback - add tests, support boost and numeric data
* Revert "PR feedback - add tests, support boost and numeric data"
This reverts commit 4e7a699e3e.
* Apply testing/PR feedback outside numeric support only
* Regenerate docs
* Add negative test
* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
* Update x-pack/plugin/esql/qa/testFixtures/src/main/resources/match-phrase-function.csv-spec
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
* PR feedback
* Fix auto-commit error
* Regenerate docs
* Update x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/MatchPhrase.java
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
* Remove non text field types
* Fake test data
* Remove tests that no longer should pass without ip/date/version support
* Put real data in score tests now that I was able to engineer a failure
* Realized the scoring test might be flaky because of how it was written, updated
* PR feedback
* PR feedback
* [CI] Auto commit changes from spotless
* Add check to MatchPhrase tests
* Fix merge errors
* [CI] Auto commit changes from spotless
* Test generated docs
* Add additional verifier tests
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
Added support for the three primary scalar grid functions:
* `ST_GEOHASH(geom, precision)`
* `ST_GEOTILE(geom, precision)`
* `ST_GEOHEX(geom, precision)`
Also added versions of these three that take an optional `geo_shape` boundary, which must be a `BBOX` (i.e. a `Rectangle`).
Also added supporting conversion functions that convert the grid ID from long to string and back to long.
This work represents the core of the feature to support geo-grid aggregations in ES|QL.
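To illustrate what a grid function such as `ST_GEOHASH` computes, here is a minimal Python sketch of the public geohash encoding. It is not the Elasticsearch implementation: the function name `geohash` and its `(lat, lon, precision)` argument order are hypothetical, and it returns only the string form of the cell ID.

```python
# Standard geohash alphabet (base 32 without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat, lon, precision):
    """Encode a point as a geohash string of `precision` characters.

    Hypothetical helper for illustration; not the ES|QL function itself.
    """
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between longitude and latitude, longitude first
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            # Every 5 bits become one base-32 character.
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


print(geohash(57.64911, 10.40744, 3))
```

Each extra character of precision halves the cell several more times, so a shorter hash is always a prefix of a longer hash for the same point.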
Creates a `ROUND_TO` function that rounds its input down to one of the
provided values. Like so:
```
ROUND_TO(v, 0, 5000, 10000, 20000, 40000, 100000)
v | ROUND_TO
0 | 0
100 | 0
6000 | 5000
45001 | 40000
999999 | 100000
```
For some sequences of numbers you could do this with the `/` operator,
but for arbitrary sequences of numbers you needed `CASE`, which is quite
slow and hard to read.
Rewriting the example above with `CASE` would look like:
```
CASE (
v < 5000, 0,
v < 10000, 5000,
v < 20000, 10000,
v < 40000, 20000,
v < 100000, 40000,
100000
)
```
Even better, this is *fast*:
```
(operation) Mode Cnt Score Error Units
round_to_4_via_case avgt 7 138.124 ± 0.738 ns/op
round_to_4 avgt 7 0.805 ± 0.011 ns/op
round_to_3 avgt 7 0.739 ± 0.011 ns/op
round_to_2 avgt 7 0.651 ± 0.009 ns/op
date_trunc avgt 7 2.425 ± 0.018 ns/op
```
I've included a comparison to `DATE_TRUNC` above because we should be
able to rewrite `DATE_TRUNC` into `ROUND_TO` when we know the date range
of the index. This doesn't do it now, but it should be possible.
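The rounding rule shown in the example table can be sketched in a few lines of Python. This is only an illustration of the semantics, assuming a binary search over the sorted points; the helper name `round_to` is hypothetical and this is not how the ES|QL function is implemented.

```python
from bisect import bisect_right


def round_to(value, points):
    """Round `value` down to the largest of `points` that is <= value.

    Hypothetical helper mirroring the ROUND_TO example above.
    """
    ordered = sorted(points)
    # Index of the first point strictly greater than `value`.
    idx = bisect_right(ordered, value)
    # Values below the smallest point clamp to it, matching the CASE rewrite.
    return ordered[max(idx - 1, 0)]


points = [0, 5000, 10000, 20000, 40000, 100000]
for v in (0, 100, 6000, 45001, 999999):
    print(v, round_to(v, points))
```

A binary search keeps the cost logarithmic in the number of points, whereas the `CASE` form evaluates its conditions one after another.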
Modifies `TO_IP` so it can handle leading `0`s in IPv4 addresses. Here's
how it behaves before this change:
```
ROW ip = TO_IP("192.168.0.1") // OK!
ROW ip = TO_IP("192.168.010.1") // Fails
```
This adds
```
ROW ip = TO_IP("192.168.010.1", {"leading_zeros": "octal"})
ROW ip = TO_IP("192.168.010.1", {"leading_zeros": "decimal"})
```
We do this because there isn't a consensus on how to parse leading zeros
in IPv4 addresses. The standard Unix tools like `ping` and `ftp` interpret
leading zeros as octal. Java's built-in IP parsing interprets them as
decimal. Because folks are using this for security rules we need to
support all the choices.
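The difference between the two interpretations can be shown with a small Python sketch. It assumes a hypothetical `parse_ipv4` helper with a `leading_zeros` parameter whose default, named `"reject"` here for illustration, mirrors the failing case above; it is not the `TO_IP` implementation.

```python
def parse_ipv4(text, leading_zeros="reject"):
    """Parse a dotted-quad IPv4 string and return it in canonical form.

    Hypothetical helper; `leading_zeros` is one of "reject", "octal", "decimal".
    """
    if leading_zeros not in ("reject", "octal", "decimal"):
        raise ValueError(f"unknown leading_zeros mode: {leading_zeros}")
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected 4 octets: {text}")
    octets = []
    for part in parts:
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"invalid octet: {part!r}")
        has_leading_zero = len(part) > 1 and part[0] == "0"
        if has_leading_zero and leading_zeros == "reject":
            raise ValueError(f"leading zero in octet: {part!r}")
        # Only octets with a leading zero are ambiguous; the rest are decimal.
        base = 8 if (has_leading_zero and leading_zeros == "octal") else 10
        value = int(part, base)  # raises ValueError for e.g. "09" in octal mode
        if value > 255:
            raise ValueError(f"octet out of range: {part!r}")
        octets.append(value)
    return ".".join(str(o) for o in octets)


print(parse_ipv4("192.168.010.1", leading_zeros="octal"))
print(parse_ipv4("192.168.010.1", leading_zeros="decimal"))
```

The same input string resolves to two different addresses depending on the mode, which is why a security rule has to state which interpretation it expects.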
Closes #125460
While the internal structure of the docs is already split into many (over 1000) sub-pages, the final display for the `Functions and Operators` page is a single giant page, making navigation harder. This PR splits it into twelve new pages: one for each group of similar functions and one for the operators.
This PR also bundles a few other related changes. In total, the changes are:
* Split functions/operators into 12 pages, one for each group, maintaining the existing split of each function/operator into a snippet with dynamically generated examples
* Split esql-commands.md into source-commands.md and processing-commands.md, each of which is split into individual snippets, one for each command
* Each command snippet has its examples split out into separate files, if they were examples that were dynamically generated in the older asciidoc system
* The example files are overwritten by the ES|QL unit tests, using a mechanism similar to the one used for the function and operator examples
* Some additional refinements to the Kibana definition and markdown files (nicer operator headings, and display text)
This commit adds a conversion function from numerics (and aggregate
metric doubles) to aggregate metric doubles.
It is most useful when you have multiple indices, where one index uses
aggregate metric double (e.g. a downsampled index) and another uses a
normal numeric type like long or double (e.g. an index prior to
downsampling).
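A rough Python sketch of why this conversion helps when querying across both kinds of index. It assumes an aggregate metric double carries `min`, `max`, `sum` and `value_count`, and that a plain number converts to a single-sample aggregate; the class and function names are hypothetical and this is not the ES|QL implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggregateMetricDouble:
    min: float
    max: float
    sum: float
    value_count: int


def to_aggregate_metric_double(value):
    """Convert a number (or an existing aggregate) to an aggregate.

    Assumption: a plain number is treated as one sample, so min, max and
    sum all equal the value and value_count is 1.
    """
    if isinstance(value, AggregateMetricDouble):
        return value
    v = float(value)
    return AggregateMetricDouble(min=v, max=v, sum=v, value_count=1)


def combine(values):
    """Merge mixed plain numbers and aggregates into one aggregate."""
    converted = [to_aggregate_metric_double(v) for v in values]
    return AggregateMetricDouble(
        min=min(c.min for c in converted),
        max=max(c.max for c in converted),
        sum=sum(c.sum for c in converted),
        value_count=sum(c.value_count for c in converted),
    )


# One value from a downsampled index, one from a raw index.
downsampled = AggregateMetricDouble(min=1.0, max=9.0, sum=20.0, value_count=4)
print(combine([downsampled, 12]))
```

Once both sides share one representation, the same aggregation logic applies to every index in the query.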
In a few previous PRs we restructured the ES|QL docs to make it possible to generate them dynamically.
This PR just moves a few files around to make the query languages docs easier to work with, and organized a little more like the ES|QL docs.
A big part of this was setting up redirects to the new locations, so other repos could correctly link to the elasticsearch docs.
Building on the work started in https://github.com/elastic/elasticsearch/pull/123904, we now want to auto-generate most of the small subfiles from the ES|QL functions unit tests.
This work also investigates any remaining discrepancies between the original asciidoc version and the new markdown, and tries to minimize differences so the docs do not look too different.
The Kibana JSON and markdown files are moved to a new location, and more of the operator docs are generated than before (although they are still largely manual).