Commit graph

263 commits

Author SHA1 Message Date
Liam Thompson
cc1ed4f8c4
[DOCS] 8.x mention categorize license requirement (#126670) 2025-04-11 16:01:54 +02:00
Carlos Delgado
a259503b39
ES|QL: Add default values for match function options (#125282) (#125411)
(cherry picked from commit 160ac698d7)

# Conflicts:
#	docs/reference/query-languages/esql/_snippets/functions/functionNamedParams/match.md
2025-03-22 03:01:14 +11:00
Luigi Dell'Aquila
dfc05733a5
ES|QL: fix docs for functions in PREVIEW (#123880) (#124078) 2025-03-05 22:49:22 +11:00
Fang Xing
10579378e5
[ES|QL] Change function_named_parameters in Kibana doc to expected format (#121585) (#121688)
* change function_named_parameters in kibana doc to expected format
2025-02-05 05:34:53 +11:00
Mark Tozzi
1a4ceb4711
[8.x] Esql - Support date nanos in date extract function (#120727) (#120908)
* Esql - Support date nanos in date extract function (#120727)

Resolves https://github.com/elastic/elasticsearch/issues/110000

Add support for running the date extract function on nanosecond dates.

* Fix switch error

* ESQL: Fix DateExtract with nanos tests

---------

Co-authored-by: Iván Cea Fontenla <ivan.cea@elastic.co>
Co-authored-by: Iván Cea Fontenla <ivancea96@outlook.com>
2025-01-29 00:23:44 +11:00
Carlos Delgado
b39edb37a0
ESQL - Add Match function options (#120360) (#120992)
(cherry picked from commit d91d51600e)

# Conflicts:
#	docs/reference/esql/functions/description/match.asciidoc
#	docs/reference/esql/functions/kibana/definition/match.json
#	docs/reference/esql/functions/kibana/docs/match.md
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/Match.java
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/EsqlBaseParser.interp
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/parser/EsqlBaseParser.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java
2025-01-28 21:36:26 +11:00
Carlos Delgado
4b86fda751
Match, Like and RLike operators improved docs (#120504) (#120769) 2025-01-24 19:09:38 +11:00
Mark Tozzi
109b6ff8a4
Esql Support date nanos on date diff function (#120645) (#120749)
Resolves #109999

This adds support for date nanos in the date diff function, as well as mixed nanos/millis use cases.

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-01-24 06:17:11 +11:00
Nik Everett
227f582c07
ESQL: Signatures for NOT IN et al (#120673) (#120737)
* ESQL: Signatures for `NOT IN` et al

This generates signatures for `NOT IN`, `NOT LIKE`, and `NOT RLIKE`
using a small hack on top of the process used to generate the signatures
for `IN`, `LIKE`, and `RLIKE`. This is a very perl-worthy hack, replacing
`LIKE` with `NOT LIKE` in the description. But it's useful for our
kibana friends and if we need to make it nicer we can do so later.

* Zap
2025-01-24 04:13:14 +11:00
Mark Tozzi
5af15f42cd
[8.x] ESQL - docs for to_date_nanos (#120124) (#120203)
* ESQL - docs for to_date_nanos (#120124)

I forgot to link the ToDateNanos docs when I merged that function.
---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
 Conflicts:
	docs/reference/esql/functions/description/to_date_nanos.asciidoc
	docs/reference/esql/functions/kibana/definition/to_date_nanos.json
	docs/reference/esql/functions/kibana/docs/to_date_nanos.md
	docs/reference/esql/functions/layout/to_date_nanos.asciidoc

* unmute ToDateNanos tests, and build docs
2025-01-22 05:08:55 +11:00
Iván Cea Fontenla
d1f9a0ab01
ESQL: Fix ROUND() with unsigned longs throwing in some edge cases (#119536) (#120381)
There were different error cases with `ROUND(number, decimals)`:
- Decimals accepted unsigned longs, but threw a 500 with a `can't process [unsigned_long -> long]` in the cast evaluator
  - Fixed by improving the `resolveType()`
- If the number was a BigInteger unsigned long, there were 3 cases throwing an exception:
  1. Negative decimals outside the range of integer: Error
  2. Negative decimals inside the range of integer, but "big enough" for `BigInteger.TEN.pow(...)` to throw a `BigInteger would overflow supported range`
  3. -19 decimals with big unsigned longs like `18446744073709551615` was throwing an `unsigned_long overflow`

Also, when the number is a BigInteger and the decimals is a big negative (but not big enough to throw), it may be **very** slow, taking _many_ seconds for a single computation, because it tries to calculate `10^(big number)`. I didn't do anything here, but I wonder if we should limit it.

To solve most of the cases, a warnExceptions was added for the overflow case, and a guard clause to return 0 for <-19 decimals on unsigned longs.

Another issue is that rounding a number like 7 to -1 decimals returns 0 instead of 10, which may be considered an error. But it's consistent, so I'm leaving it to another PR.
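
The guard clause and the overflow case described above can be checked with plain integer arithmetic. The following is a minimal Python sketch, not the actual implementation: the function name `round_unsigned_long` is hypothetical, half-up rounding is an assumption (so it returns 10 for 7 rounded to -1, unlike the behavior noted above), and `None` stands in for the null-with-warning result.

```python
UNSIGNED_LONG_MAX = 2**64 - 1  # 18446744073709551615


def round_unsigned_long(number: int, decimals: int):
    """Round a non-negative integer to `decimals` places; None signals overflow."""
    if not 0 <= number <= UNSIGNED_LONG_MAX:
        raise ValueError("number is outside the unsigned long range")
    if decimals >= 0:
        # Integers have no fractional digits, so there is nothing to round.
        return number
    if decimals < -19:
        # 10**20 is more than twice UNSIGNED_LONG_MAX, so every value rounds
        # to 0. Returning early also avoids computing a huge power of ten.
        return 0
    unit = 10 ** -decimals
    # Half-up rounding to the nearest multiple of `unit` (an assumption).
    rounded = (number + unit // 2) // unit * unit
    # At -19 decimals a large input rounds up to 2 * 10**19, which no longer
    # fits in an unsigned long.
    return rounded if rounded <= UNSIGNED_LONG_MAX else None
```

With these assumptions, `round_unsigned_long(18446744073709551615, -19)` overflows and yields `None`, while any value rounded to -20 decimals or fewer yields 0 without computing the power.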
2025-01-18 04:07:50 +11:00
Nik Everett
9ab6a72979
Add operator to ESQL signature for kibana (#120230) (#120325)
This adds a field to the kibana definition files for each signature that
looks like:
```
  "operator": "+",
```
Kibana wants these symbols.
2025-01-17 11:31:08 -05:00
Nik Everett
8253a834c2
ESQL: Move more test type error testing (#119945) (#120324)
This reduces the number of test cases in ESQL a little more ala #119678.
It migrates a few random tests and all of the multivalue functions:
```
92775 -> 43760
 3m45 -> 4m04
```

This adds a few more error test cases that were missing to make sure it all
lines up well. And it fixes a few error messages in a few functions. That's
*likely* where the extra time goes.
2025-01-17 08:47:08 +11:00
Mark Tozzi
68de069291
Esql - support date nanos in date format function (#120143) (#120218)
This adds support for passing Date Nanos into the Date Format function. It works for both the single argument and two argument versions. Format strings are unchanged, as the same formatting logic works for both resolutions.

resolves #109994

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-01-16 04:54:38 +11:00
Nik Everett
a61670ea7f
ESQL: Limit memory usage of fold (#118602) (#120100)
`fold` can be surprisingly heavy! The maximally efficient/paranoid thing
would be to fold each expression one time, in the constant folding rule,
and then store the result as a `Literal`. But this PR doesn't do that
because it's a big change. Instead, it creates the infrastructure for
tracking memory usage for folding and plugs it into as many places as
possible. That's not perfect, but it's better.

This infrastructure limits the allocations of fold similar to the
`CircuitBreaker` infrastructure we use for values, but it's different
in a critical way: you don't manually free any of the values. This is
important because the plan itself isn't `Releasable`, which is required
when using a real CircuitBreaker. We could have tried to make the plan
releasable, but that'd be a huge change.

Right now there's a single limit of 5% of heap per query. We create the
limit at the start of query planning and use it throughout planning.

There are about 40 places that don't yet use it. We should get them
plugged in as quick as we can manage. After that, we should look to the
maximally efficient/paranoid thing that I mentioned about waiting for
constant folding. That's an even bigger change, one I'm not equipped
to make on my own.
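
The allocate-only accounting described above can be illustrated with a small Python sketch. Everything here is hypothetical (the class name, the exception, and the `track` method are not taken from the PR); it only shows the idea of a per-query budget that is charged during planning and never manually freed.

```python
class FoldMemoryLimitExceeded(Exception):
    """Raised when folding would exceed the per-query budget."""


class FoldMemoryTracker:
    """Hypothetical allocate-only budget: usage is added, never released."""

    def __init__(self, heap_bytes: int, percent: int = 5):
        # One limit per query, expressed as a share of the heap.
        self.limit = heap_bytes * percent // 100
        self.used = 0

    def track(self, num_bytes: int) -> None:
        # Charge the budget before the folded value is kept around.
        if self.used + num_bytes > self.limit:
            raise FoldMemoryLimitExceeded(
                f"folding would use {self.used + num_bytes} bytes, limit is {self.limit}"
            )
        self.used += num_bytes
```

A tracker would be created once at the start of planning and passed to each place that folds an expression. Because nothing is released, the plan does not need to be releasable.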
2025-01-15 14:33:58 +01:00
Mark Tozzi
7603eded80
[8.x] ESQL Support IN operator for Date nanos (#119772) (#120126)
* ESQL Support IN operator for Date nanos (#119772)

Add support for using nanosecond dates with the IN operator. This behavior should be consistent with equals, and should support comparisons between milliseconds and nanoseconds in the same way the binary comparison operators do.

Resolves #118578

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

* remove use of future java functions

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-01-15 03:52:02 +11:00
Mark Tozzi
02835dcf28
Esql additional date format testing (#120000) (#120056)
This wires up the randomized testing for DateFormat. Prior to this PR, none of the randomized testing was hitting the one parameter version of the function, so I wired that up as well. This required some compromises on the type signatures; see the comments inline.

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-01-14 02:21:27 +11:00
Carlos Delgado
faf566577e
ESQL - Update QSTR docs (#120026) 2025-01-13 08:44:36 +00:00
Aurélien FOUCRET
2daacef577
[8.x] [ES|QL] Enable KQL function as a tech preview (#119730) (#119954)
* [ES|QL] Enable KQL function as a tech preview (#119730)

(cherry picked from commit 31f11c3c0c)

# Conflicts:
#	docs/reference/esql/esql-limitations.asciidoc
#	server/src/main/java/org/elasticsearch/TransportVersions.java
#	server/src/main/java/org/elasticsearch/rest/action/search/SearchCapabilities.java

* Update server/src/main/java/org/elasticsearch/TransportVersions.java
2025-01-11 03:35:35 +11:00
Ievgen Degtiarenko
b398448848
Hash functions (#118938) (#119769)
This change adds md5, sha1 and sha256 hash functions.
2025-01-09 19:39:51 +11:00
Bogdan Pintea
2383df6e10
ESQL: Docs: add example of date bucketing with offset (#116680) (#118985)
Add an example of how to create date histograms with an offset.

Fixes #114167
2024-12-19 04:27:32 +11:00
Ievgen Degtiarenko
c34e8e2e3d
ESQL Add esql hash function (#117989) (#118927)
This change introduces esql hash(alg, input) function that relies on the Java MessageDigest to compute the hash.
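
As a rough analogue of looking a digest up by algorithm name, here is a Python sketch using `hashlib`. The helper name `hash_by_name` is hypothetical, and both the UTF-8 encoding of the input and the hex encoding of the result are assumptions rather than documented behavior of the ES|QL function.

```python
import hashlib


def hash_by_name(algorithm: str, data: str) -> str:
    """Hash `data` with the named algorithm and return the digest as hex."""
    # hashlib.new() resolves the algorithm by name, much as
    # MessageDigest.getInstance() does in Java.
    digest = hashlib.new(algorithm, data.encode("utf-8"))
    return digest.hexdigest()
```

An unknown algorithm name makes `hashlib.new` raise `ValueError`, which is the kind of error the function's validation would need to surface to the user.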
2024-12-18 22:46:44 +11:00
Gal Lalouche
905f9f4692
ESQL: Support ST_EXTENT_AGG (#117451) (#118829)
This PR adds support for ST_EXTENT_AGG aggregation, i.e., computing a bounding box over a set of points/shapes (Cartesian or geo). Note the difference between this aggregation and the already implemented scalar function ST_EXTENT.

This isn't a very efficient implementation, and future PRs will attempt to read these extents directly from the doc values.
We currently always use longitude wrapping, i.e., we may wrap around the dateline for a smaller bounding box. Future PRs will let the user control this behavior.
Fixes #104659.
2024-12-17 23:14:03 +11:00
Mark Tozzi
9166cd8d37
Esql bucket function for date nanos (#118474) (#118670)
This adds support for running the bucket function over a date nanos field. Code wise, this just delegates to DateTrunc, which already supports date nanos, so most of the PR is just tests and the auto-generated docs.

Resolves #118031
2024-12-17 01:25:37 +11:00
Craig Taverner
d6c14a2b8a
[8.x] Support ST_ENVELOPE and related ST_XMIN, etc. (#116964) (#118743)
* Support ST_ENVELOPE and related ST_XMIN, etc. (#116964)

Support ST_ENVELOPE and related ST_XMIN, etc.

Based on the PostGIS equivalents:

https://postgis.net/docs/ST_Envelope.html
https://postgis.net/docs/ST_XMin.html
https://postgis.net/docs/ST_XMax.html
https://postgis.net/docs/ST_YMin.html
https://postgis.net/docs/ST_YMax.html

* Fix off-by-one error reported in #118051
2024-12-16 21:53:39 +11:00
Carlos Delgado
0e4de18af6
[8.x] ESQL: Expand type compatibility for match function and operator (#117555) (#118297)
* Fix and unmute synonyms tests using timeout (#117486)

(cherry picked from commit 930a99cc38)

# Conflicts:
#	muted-tests.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/10_synonyms_put.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/110_synonyms_invalid.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/20_synonyms_get.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/30_synonyms_delete.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/40_synonyms_sets_get.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/50_synonym_rule_put.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/60_synonym_rule_get.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/70_synonym_rule_delete.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/80_synonyms_from_index.yml
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/synonyms/90_synonyms_reloading_for_synset.yml

* Fix merge
2024-12-11 00:43:47 +11:00
Mark Tozzi
7f1d7c4173
Esql compare nanos and millis (#118027) (#118159)
Resolves #116281

Introduces support for comparing millisecond dates with nanosecond dates, without the need for casting. Millisecond dates outside of the nanosecond date range are handled correctly.
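
One way to picture the out-of-range handling is sketched below in Python. This is not the code from the PR: the function name is hypothetical, and it assumes nanosecond dates are stored as non-negative signed 64-bit counts of nanoseconds since the epoch.

```python
NANOS_PER_MILLI = 1_000_000
LONG_MAX = 2**63 - 1
# Largest millisecond value whose nanosecond equivalent fits in a signed long.
MAX_MILLIS_IN_NANOS_RANGE = LONG_MAX // NANOS_PER_MILLI


def compare_millis_to_nanos(millis: int, nanos: int) -> int:
    """Return -1, 0, or 1 if the millisecond date is before, equal to, or after the nanosecond date."""
    if millis < 0:
        # Before the epoch: earlier than any representable nanosecond date.
        return -1
    if millis > MAX_MILLIS_IN_NANOS_RANGE:
        # Past the end of the nanosecond range: later than any nanosecond date.
        return 1
    # Safe to scale: the product fits in a signed 64-bit value.
    scaled = millis * NANOS_PER_MILLI
    return (scaled > nanos) - (scaled < nanos)
```

Python integers do not overflow, so the range checks are there only to mirror what a fixed-width implementation has to do before multiplying.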
2024-12-07 02:22:56 +11:00
Tommaso Teofili
af84c6142e
Backport Term query for ES|QL to 8.x (#117359) (#118135)
* Term query for ES|QL (#117359)

This commit adds a `term` function for ES|QL to run `TermQueries`.

For example:
FROM test | WHERE term(content, "dog")

(cherry picked from commit 91605860ee)

* Update docs/changelog/118135.yaml
2024-12-06 11:40:38 +01:00
Mark Tozzi
b931c7c798
ESQL Date Nanos Addition and Subtraction (#116839) (#117848)
Resolves #109995

This adds support and tests for addition and subtraction of date nanos with periods and durations. It does not include support for date_diff, which is a separate ticket (#109999). The bulk of the PR is testing, the actual date math is all handled by library functions.

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-12-05 12:39:36 +11:00
Jan Kuipers
6375bb02a6
Document ES|QL categorize limitations (#117892) (#117965)
* Document ES|QL categorize limitations

* Update x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/grouping/Categorize.java



---------

Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
2024-12-04 21:05:11 +11:00
Jan Kuipers
5b220ddc88
ES|QL categorize docs (#117827) (#117838)
* Move ES|QL categorize out of snapshot functions

* Categorize docs

* Add experimental + fix docs

* Add experimental + fix docs
2024-12-03 03:51:11 +11:00
Iván Cea Fontenla
d90b4c7a9a
Backport 9022ccc (#117699) 2024-11-28 22:48:14 +11:00
Aurélien FOUCRET
891d3e53f9
[8.x] ES|QL kql function. (#116764) (#117474)
* ES|QL kql function. (#116764)

* Fix test compile error in branch 8.x
2024-11-26 05:11:23 +11:00
Craig Taverner
5b21640b33
Make spatial search functions not preview (#117489) (#117498) 2024-11-26 03:30:20 +11:00
Larisa Motova
04849b0fd8
[ES|QL] Add a standard deviation function (#116531) (#117398)
Uses Welford's online algorithm, as well as the parallel version, to
calculate standard deviation.
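
For reference, a minimal Python sketch of Welford's online update and the parallel merge of two partial states follows. The class and method names are hypothetical, and returning the population (rather than sample) standard deviation is an assumption; the PR's own code may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class WelfordState:
    """Running count, mean, and sum of squared deviations from the mean (m2)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def add(self, value: float) -> None:
        # Online update: one pass, without keeping the values around.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def merge(self, other: "WelfordState") -> "WelfordState":
        # Parallel version: combine two partial states computed independently.
        if self.count == 0:
            return WelfordState(other.count, other.mean, other.m2)
        if other.count == 0:
            return WelfordState(self.count, self.mean, self.m2)
        count = self.count + other.count
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / count
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / count
        return WelfordState(count, mean, m2)

    def population_std_dev(self) -> float:
        # Population standard deviation (an assumption); NaN for no input.
        return math.sqrt(self.m2 / self.count) if self.count else float("nan")
```

Feeding `[2, 4, 4, 4]` into one state and `[5, 5, 7, 9]` into another, then merging, gives a standard deviation of 2.0, the same as a single pass over all eight values.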
2024-11-23 10:44:28 +11:00
Nik Everett
20e02fab75
ESQL: Add docs for MV_PERCENTILE (#117377) (#117381)
We built this a while back. Let's document it.
2024-11-23 07:06:31 +11:00
Nik Everett
d95c003e61
ESQL: Make WEIGHTED_AVG not preview (#117356) (#117362)
It's not PREVIEW.
2024-11-23 03:53:36 +11:00
Luigi Dell'Aquila
acdd6ec418
ES|QL: fix validation of SORT by aggregate functions (#117316) (#117326) 2024-11-22 23:17:16 +11:00
Carlos Delgado
5a9c05ea1d
ESQL - match operator included in non-snapshot builds (#116819) (#117224) 2024-11-21 09:20:55 +01:00
Mark Tozzi
cafa440771
[8.x] Esql Enable Date Nanos (#117080) (#117161)
* Esql Enable Date Nanos (#117080)

This enables date nanos support as tech preview. Basic operations, like reading values, binary comparisons, and functions that don't care about type should work, but some functions are not yet supported. Most notably, Bucket is not yet supported, although Date_Trunc is and can be used for grouping. See the docs for the full list of limitations.

relates to #109352

* Skip CATEGORIZE tests outside snapshot

---------

Co-authored-by: Nik Everett <nik9000@gmail.com>
2024-11-21 08:16:59 +11:00
Fang Xing
df1130f4b2
[ES|QL][DOCS] Add docs for date_period and time_duration (#116368) (#117021)
* add docs for date_period and time_duration
2024-11-20 00:14:46 +11:00
Bogdan Pintea
33dfe554e7
ESQL: Docs: COUNT: add an explanation to the use of the 3VL (#116684) (#117006)
Add an explanation of why `... OR NULL` is needed with `COUNT(...)`.
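
The three-valued logic (3VL) behind this can be sketched in a few lines of Python, with `None` standing in for NULL. The helper names are hypothetical, and the sketch assumes that a count over an expression includes every non-null value, `false` included.

```python
def or3(a, b):
    """Three-valued OR: True wins, otherwise NULL (None) wins over False."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def count_non_null(values):
    """Count the values that are not NULL; False still counts."""
    return sum(1 for v in values if v is not None)


ages = [10, 25, 40]
plain = [age > 20 for age in ages]              # [False, True, True]
with_null = [or3(age > 20, None) for age in ages]  # [None, True, True]
```

Here `count_non_null(plain)` is 3 because `False` is still a value, while `count_non_null(with_null)` is 2 because `False OR NULL` evaluates to NULL and drops out of the count.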

Fixes: #99954
2024-11-19 21:43:31 +11:00
Gal Lalouche
b3edb3a6a4
[ESQL] Update docs format (missing space before '=') (#116808) (#116816) 2024-11-15 02:04:30 +11:00
Gal Lalouche
dae79b5c22
[8.x] [ESQL] Add support BYTE_LENGTH scalar function (#116591) (#116731) 2024-11-14 14:40:42 +01:00
Tim Grein
b7951c5ce7
Add ES|QL bit_length function (#115792) (#116378) 2024-11-07 20:04:20 +11:00
Mark Tozzi
1224db91d5
[ESQL] clean up date trunc tests (#116111) (#116179)
While working on #110008 I discovered that the Date Trunc tests were only running in folding mode, because the interval types are marked as not representable. The correct way to test this is to set the forceLiteral flag for those fields, which will (as the name suggests) force them to be literals even in non-folding tests.

Doing that turned up errors in the evaluatorToString tests, which I fixed. There are two big changes here. First, the second parameter to the evaluator is a Rounding instance, not the actual interval. Since Rounding includes some information about the specific rounding in the toString results, I am just using a starts with matcher to validate the majority of the string, rather than trying to reconstruct the expected rounding string. Second, passing in a literal null for the interval parameter folds the whole expression to null, and thus a completely different toString. I added a clause in AnyNullIsNull to account for this.

While I was in there, I moved some specific test cases to a different file. I know moving code is something we're trying to minimize right now, but this seemed worth it. The tests in question do not depend on the parameters of the test case, but all methods in the class get run for every set of parameters. This was causing these tests to be run many times with the same values, which bloats our test run time and test count. Moving them to a distinct class means they'll only be executed once per test run. I feel like this benefit outweighs the cost of git history complexity.
2024-11-05 02:32:08 +11:00
Chris Hegarty
78fc557d3f
ES|QL Add full-text search to the functions docs page (#116024)
Now that the match and qstr functions are in Tech Preview, we should add them to the top-level functions doc page.

Co-authored-by: Craig Taverner <craig@amanzi.com>
2024-11-01 12:08:48 +00:00
Craig Taverner
3b3e7f7484
Don't return TEXT type for functions that take TEXT (#114334) (#115625)
Always return `KEYWORD` for functions that previously returned `TEXT`, because any change to the value, no matter how small, is enough to render meaningless the original analyzer associated with the `TEXT` field value. In principle, if the attribute is no longer the original `FieldAttribute`, it can no longer claim to have the type `TEXT`.

This has been done for all functions: conversion functions, aggregating functions, multi-value functions. Several already produced `KEYWORD` for `TEXT` input (e.g. ToString, FromBase64 and ToBase64, MvZip, ToLower, ToUpper, DateFormat, Concat, Left, Repeat, Replace, Right, Split, Substring), but many others incorrectly claimed to produce `TEXT`. This PR makes the rule strict, and changes the functions' unit tests so that no test may expect a function's output to be `TEXT`.

One side effect of this change is that methods that take multiple parameters that require all of them to have the same type, will now treat TEXT and KEYWORD the same. This was already the case for functions like `Concat`, but is now also the case for `Greatest`, `Least`, `Case`, `Coalesce` and `MvAppend`.

An associated change is that the type casting operator `::text` has been entirely removed. It used to map onto the `ToString` function which returned type KEYWORD, and so `::text` really produced a `KEYWORD`, which is a lie, or at least a `bug`, which is now fixed. Should we ever wish to actually produce real `TEXT`, we might love the fact that this operator has been freed up for future use (although it seems likely that function will require parameters to specify the analyzer, so might never be an operator again).

### Backwards compatibility issues:

This is a change that will fail BWC tests, since we have many tests that assert on TEXT output to functions. For this reason we needed to block two scenarios:

* We used the capability `functions_never_emit_text` to prevent 7 csv-spec tests and 2 yaml tests from being run against older versions that still emit text.
* We used `skipTest` to also block those two yaml tests from being run against the latest build, but using older yaml files downloaded (as far back as 8.14).

In all cases the change observed in these tests was simply the results columns no longer having `text` type, and instead being `keyword`.

---------

Co-authored-by: Luigi Dell'Aquila <luigi.dellaquila@gmail.com>
2024-10-25 20:12:02 +11:00
Luigi Dell'Aquila
5290630bd0
ES|QL: improve docs about escaping for GROK, DISSECT, LIKE, RLIKE (#115320) (#115493) 2024-10-24 19:14:57 +11:00
Nik Everett
f38f2301bc
ESQL: Skip unsupported grapheme cluster test (#115258)
This skips the test for reversing grapheme clusters if the node doesn't
support reversing grapheme clusters. Nodes that are using a jdk before
20 won't support reversing grapheme clusters because they don't have
https://bugs.openjdk.org/browse/JDK-8292387

This reworks `EsqlCapabilities` so we can easily register it only if
we're on jdk 20 or later:
```
FN_REVERSE_GRAPHEME_CLUSTERS(Runtime.version().feature() >= 20),
```

Closes #114537
Closes #114535
Closes #114536
Closes #114558
Closes #114559
Closes #114560
2024-10-21 20:06:56 +02:00