155 commits

Author SHA1 Message Date
Luigi Dell'Aquila
a5b1848c14
ES|QL: more tests for coalesce() function (#109032)
Adding more unit tests for `coalesce()` function, in particular adding
tests for `ip`, `date` and spatial data types.

This also generates the right signatures for Kibana.

Related to https://github.com/elastic/elasticsearch/issues/108982
2024-05-27 04:36:06 -04:00
Alexander Spies
16a5d248b7
ESQL: Clone ql for esql (#108773)
Part of https://github.com/elastic/elasticsearch/issues/106679

* Copy the `ql` project into a different project _just for esql_, call it `esql-core`.
* Make `esql` depend only on the latter.
* Fix `EsqlNodeSubclassTests`; I'm confused why this didn't bite us earlier.
* Update the warning regexes in some csv tests as the exceptions have other package names now.

**Note to reviewers:** Exclude the first commit when viewing the diff,
as that contains only the actual copying of `ql`. The remaining commits
are the actually meaningful ones. _The `build.gradle` files probably
require the most attention._
2024-05-22 04:35:17 -04:00
Iván Cea Fontenla
62b372b4dc
ESQL: CBRT function (#108574)
- Added the cube root function to ESQL (`CBRT(x)`). Nearly identical to SQRT, but without the negative numbers exception
- Added docs generation support for Windows line endings (CRLF): within the examples, it was writing the "\r" without the "\n" (which was being converted to "\\n"), along with some other inconsistencies
- Some updates to `package-info.java` documentation on how to create functions
- Fixes https://github.com/elastic/elasticsearch/issues/108675

Functions issue: https://github.com/elastic/elasticsearch/issues/98545
2024-05-15 16:50:15 +02:00
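Note on 62b372b4dc: the commit describes `CBRT` as nearly identical to `SQRT` except that negative inputs are valid. A minimal Python sketch of that difference follows. It is an illustration only, not the Elasticsearch implementation; the helper names `cbrt` and `sqrt_or_none` are hypothetical, and returning `None` for a negative square root is an assumption standing in for the "negative numbers exception" the commit mentions.

```python
import math


def cbrt(x: float) -> float:
    """Real cube root; defined for negative inputs as well."""
    # Take the root of the magnitude, then restore the sign of x.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def sqrt_or_none(x: float):
    """Square root that yields None for negative input (assumed behaviour)."""
    return None if x < 0 else math.sqrt(x)


if __name__ == "__main__":
    print(cbrt(-27.0))          # a value close to -3.0
    print(sqrt_or_none(-27.0))  # None: no real square root exists
```

The cube root needs no special case because every real number has exactly one real cube root, whereas a negative number has no real square root.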
Fang Xing
172c05918c
[DOCS] ES|QL implicit casting (#108618)
* implicit casting doc
2024-05-15 09:07:09 -04:00
Fang Xing
11de886346
[ES|QL] Add/Modify annotations for spatial and conditional functions for better doc generation (#107722)
* annotation for spatial functions and conditional functions
2024-05-10 14:49:25 -04:00
Luigi Dell'Aquila
fed808850d
ES|QL: Add unit tests for now() function (#108498)
2024-05-10 14:28:19 +02:00
Bogdan Pintea
de725aef80
Add docs clarifications on DATE_DIFF args (#108301)
This adds some clarifications on the time unit strings the function
takes as arguments, noting the differences between these and the time
span literals, as well as the abbreviations' source.
2024-05-07 12:59:01 +02:00
Bogdan Pintea
b26d7d3e14
Introduce an IP functions group (#108304)
This takes the CIDR_MATCH out of the operators group and adds it to a
new `IP functions` group.
The change also re-arranges the groups, grouping together the
type-specific functions and ordering them alphabetically.
2024-05-06 13:43:30 +02:00
Fang Xing
4daac77e3b
[ES|QL] Add/Modify annotations for operators for better doc generation (#108220)
* annotation for operators
2024-05-03 22:59:51 -04:00
Bogdan Pintea
5f4ef87c47
Fix docs generation of signatures for variadic functions (#107865)
This fixes the generation of the signatures for variadic functions,
except for those that take a list as last argument; i.e.  functions with
optional arguments (like ROUND) or functions with overloading-like
signatures (like BUCKET).
2024-05-03 15:37:22 +02:00
Fang Xing
7ae08306a0
mv functions (#107839)
Add annotations for MV functions for better doc generation.
2024-05-01 10:47:22 -04:00
Bogdan Pintea
4b5c5e2ded
Update BUCKET docs in source (#108005)
This applies changes proposed in review to the source, so that they're
synchronized with the generated output.
2024-04-29 14:27:20 +02:00
Nhat Nguyen
22aad7b201
Support metrics counter types in ESQL (#107877)
This commit adds support for numeric metrics counter fields in ES|QL. 
These counter types, including counter_long, counter_integer, and
counter_double, are different from their parent types. Users will have
limited interaction with these counter types, restricted to:

- Retrieving values without any processing
- Casting to their root type (e.g., to_long(a_long_counter))
- Using them in the metrics rate aggregation

These restrictions are intentional to prevent misuse. If users want to 
use them as numeric values, explicit casting to their root types is
required.
2024-04-26 12:15:48 -07:00
Bogdan Pintea
a21242054b
ESQL: Document BUCKET as a grouping function (#107864)
This adds the documentation for BUCKET as a grouping function and the
addition of the "direct" invocation mode providing a span (in addition
to the auto mode).
2024-04-25 12:38:12 -04:00
Bogdan Pintea
7af45cc52e
ESQL: Document the cast operator (::) (#107871)
This documents the cast operator, `::`.
2024-04-25 10:10:59 -04:00
Bogdan Pintea
31f2fb85df
Docs: move STARTS/ENDS_WITH under string functions in the docs (#107867)
This moves the STARTS_WITH and ENDS_WITH under the string functions
section (as they're not operators).
2024-04-25 09:41:11 -04:00
Bogdan Pintea
9482673fbe
Docs: move base64 functions under string functions (#107866)
This moves the TO_BASE64 and FROM_BASE64 from the type conversion
functions under string functions (they take a string as input and output
another string).
2024-04-25 13:57:45 +02:00
Fang Xing
ad15d50863
[ES|QL] more doc generation via annotations (#107541)
Annotations for math functions, datetime functions, string functions, type conversion functions.
2024-04-22 14:43:36 -04:00
Mark Tozzi
f620961812
[ESQL] Add in the autogenerated docs for a bunch of functions (#107633)
2024-04-18 14:09:30 -04:00
Bogdan Pintea
a2c2e8fe47
ESQL: extend BUCKET with spans. Turn it into a grouping function (#107272)
This extends `BUCKET` function to accept a two-parameters-only
invocation: the first parameter remains as is, while the second is a
span. It can be a numeric (floating point) span, if the first argument
is numeric, or a date period or time duration, if the first argument is
a date.

Also, the function can now be invoked with the alias BIN.

Additionally, the function has been turned into a grouping-only function
and thus can only be used within a `STATS` command.
2024-04-16 12:57:18 +02:00
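Note on a2c2e8fe47: the two-parameter `BUCKET(value, span)` form can be pictured as rounding each value down to the start of its span-sized bucket. The Python sketch below shows that idea for a numeric span and for a fixed time duration. It assumes buckets are aligned to multiples of the span counted from zero (or from the Unix epoch for dates); the actual Elasticsearch rounding, especially for calendar-aware date periods, may differ. The function names are hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def bucket_numeric(value: float, span: float) -> float:
    """Round value down to the nearest multiple of span."""
    return math.floor(value / span) * span


def bucket_date(value: datetime, span: timedelta) -> datetime:
    """Round a timezone-aware datetime down to a multiple of span since the epoch."""
    # timedelta // timedelta yields a whole number of spans (floor division).
    whole_spans = (value - EPOCH) // span
    return EPOCH + whole_spans * span


if __name__ == "__main__":
    print(bucket_numeric(7300.0, 5000.0))
    print(bucket_date(datetime(2024, 4, 16, 12, 57, tzinfo=timezone.utc),
                      timedelta(hours=1)))
```

Grouping rows by the returned bucket start is what makes the function usable as a grouping key in a `STATS ... BY` style aggregation.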
Fang Xing
353abef214
[ES|QL] Base64 decoding and encoding functions (#107390)
* add base64 functions
2024-04-15 18:39:26 -04:00
Nik Everett
aac17616a3
ESQL: Improve tests and docs for some functions (#107331)
This improves the tests and docs for a few functions, specifically `E`,
`FLOOR`, `PI`, `POW`, and `ROUND`. The examples and tested signatures
will get copied into the docs and kibana signatures.


Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-04-11 12:41:56 -04:00
Fang Xing
0075c1fb1e
[ES|QL] String literal implicit casting (#106932)
* string literal casting for scalar functions and arithmetic operations.
2024-04-10 21:20:12 -04:00
Craig Taverner
d915b964ba
Rename ST_CENTROID to ST_CENTROID_AGG (#107226)
* Rename ST_CENTROID to ST_CENTROID_AGG

In order to allow development of a scalar ST_CENTROID function.

* Fix table alignment
2024-04-10 17:56:45 +02:00
Liam Thompson
943885d0cd
[DOCS][ESQL] Render locate function docs (#107305)
2024-04-10 15:12:20 +02:00
Bogdan Pintea
8bcbc97128
Rename generated docs for (renamed) BUCKET func (#107299)
This checks in the generated-by-test doc files for newly renamed BUCKET
function.
2024-04-10 06:50:12 -04:00
Bogdan Pintea
d6f9d1e69e
ESQL: Rename AUTO_BUCKET to just BUCKET (#107197)
This renames the function AUTO_BUCKET to just BUCKET.
It also removes the experimental tagging of the function in the docs, making it generally available.
2024-04-10 12:21:08 +02:00
Nik Everett
96227a1970
ESQL: Generate kibana inline docs (#106782)
This takes a stab at generating the markdown files that Kibana uses for
its inline help. It doesn't include all of the examples because the
`@Example` annotation is not filled in - we're tracking that in
https://github.com/elastic/elasticsearch/issues/104247#issuecomment-2018944371

There are some links in the output and they are in markdown syntax. We
should figure out how to make them work for kibana.
2024-04-09 14:19:48 -04:00
Nik Everett
aba7566409
ESQL: Better tests to AUTO_BUCKET (#107228)
This improves the tests for AUTO_BUCKET marginally, specifically so that
it tests all valid combinations of arguments and generates a correct
types table. This'll combine nicely with #106782 to generate the
signatures that kibana needs for its editor.
2024-04-09 12:22:15 -04:00
Luigi Dell'Aquila
2588c72a52
ES|QL: Add unit tests and docs for DATE_TRUNC() (#107145)
2024-04-09 10:41:34 +02:00
Craig Taverner
a7b38394d9
ESQL: Support ST_DISJOINT (#107007)
* WIP Started developing ST_DISJOINT

Initially based on ST_INTERSECTS

* Fix functions list and add spatial point integration tests

* Update docs/changelog/107007.yaml

* More tests for shapes and cartesian-multigeoms

* Some more tests to highlight issues with DISJOINT on cartesian point indices

* Disable Lucene push-down for DISJOINT on cartesian point indices

* Added docs for ST_DISJOINT

* Support DISJOINT in the lucene-pushdown code for cartesian point indexes

* Re-enable push-to-source for DISJOINT on cartesian_point indices

* Fix docs example

* Try fix internal docs links which are not being rendered

* Fixed disjoint on empty geometry

* Added tests on empty linestring, and changed lucene push-down to exception

In lucene code only LineString can be empty, but in Elasticsearch even that is not allowed, resulting in parsing errors. So we cannot get to this code in the lucene push-down and now throw an error instead. The tests now assert on the warnings.

Note that for any predicate, DISJOINT and INTERSECTS alike, the predicate fails: the parsing error results in null, the function returns null, the predicate interprets this as false, and no documents match. This null-in-null-out rule means that DISJOINT and INTERSECTS give the same answer on invalid geometries.
2024-04-08 12:26:26 +02:00
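Note on a7b38394d9: the null-in-null-out rule described in that commit explains why two logically opposite predicates can both exclude the same row. The Python sketch below models it with a toy representation in which a geometry is a set of points and an unparseable geometry is `None`. All names are hypothetical, and the code is a sketch of the rule, not of the Elasticsearch or Lucene code.

```python
from typing import Optional


def st_intersects(a: Optional[set], b: Optional[set]) -> Optional[bool]:
    """True if the point sets share a point; None if either input is null."""
    if a is None or b is None:
        return None
    return not a.isdisjoint(b)


def st_disjoint(a: Optional[set], b: Optional[set]) -> Optional[bool]:
    """True if the point sets share no point; None if either input is null."""
    if a is None or b is None:
        return None
    return a.isdisjoint(b)


def where(rows, predicate):
    """Keep a row only when the predicate is exactly True; None counts as false."""
    return [row["id"] for row in rows if predicate(row) is True]


rows = [
    {"id": 1, "geom": {(0, 0)}},
    {"id": 2, "geom": None},      # invalid geometry that parsed to null
    {"id": 3, "geom": {(5, 5)}},
]
query = {(0, 0)}

if __name__ == "__main__":
    print(where(rows, lambda r: st_intersects(r["geom"], query)))
    print(where(rows, lambda r: st_disjoint(r["geom"], query)))
```

Row 2 appears in neither result, so on invalid geometries the two predicates agree even though they are complements on valid input.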
Tommaso Teofili
54eeb622d5
Add ES|QL Locate function (#106899)
* Add ES|QL Locate function
2024-04-05 15:29:54 +02:00
Ioana Tagirta
7b254218fb
Add ES|QL signum function (#106866)
* Add ES|QL signum function

* Update docs/changelog/106866.yaml

* Skip csv tests for versions older than 8.14

* Reference layout docs file and fix instructions for adding functions

* Break csv specs by param type

* More tests
2024-04-04 09:48:35 +02:00
Nik Everett
b97e2d61fb
ESQL: Fixup docs for LOG and LOG10 (#106963)
This merges all of the hand written docs for `LOG` and `LOG10` into the
annotations which updates the `META FUNCTIONS` - now it'll always be the
same as the docs. This also deletes the hand maintained docs and lets
the documentation generation process rebuild it.
2024-04-03 09:46:32 -04:00
Nik Everett
c74490c137
ESQL: Enable VALUES agg for datetime (#107016)
When I wrote the `VALUES` agg I didn't plug it in for `datetime` fields.
Oops. We just have to plug it in.
2024-04-03 07:42:40 -04:00
Craig Taverner
2380492fac
ESQL: Support ST_CONTAINS and ST_WITHIN (#106503)
* WIP Started adding ST_CONTAINS

* Add generated evaluators

* Reduced warnings and use correct evaluators

* Refactored tests to remove duplicate code, and fixed Contains/multi-components

* Gradle build disallows using getDeclaredField

* Fixed cases where rectangles cross the dateline

* Fixed meta function tests

* Added ST_WITHIN to support inverting ST_CONTAINS

If the ST_CONTAINS is called with the constant on the left, we either have to create a lot more Evaluators to cover that case, or we have to invert it to ST_WITHIN. This inversion was a much easier option.

* Simplify inversion logic

* Add comment on choice of surrogate approach

* Add unit tests and missing fold() function

* Simple code cleanup

* Add integration tests for literals

* Add more integration tests based on actual data

* Generated documentation files

* Add documentation

* Fixed failing function count test

* Add tests that push-to-source works for ST_CONTAINS and ST_WITHIN

* Test more combinations of WITHIN/CONTAINS and literal on right and left

This also verifies that the re-writing of CONTAINS to WITHIN or vice versa occurs when the literal is on the left.

* test that physical planning also handles doc-values from STATS

* Added more tests for WITHIN/CONTAINS together with CENTROID

This should test the doc-values for points.

* Add cartesian_point tests

* Add cartesian_shape tests

* Disable Lucene-push-down for CARTESIAN data

This is a limitation in Lucene, which we could address as a performance optimization in a future PR, but since it probably requires Lucene changes, it cannot be done in this work.

* Fix doc links

* Added test data and tests for cartesian multi-polygons

Testing INTERSECTS, CONTAINS and WITHIN with multi-polygon fields

* Use required features for spatial points, shapes and centroid

* 8.13.0 is not yet historical version

This needs to be reverted as soon as 8.13.0 is released

* Added st_intersects and st_contains_within 'features'

* Code review updates

* Re-enable lucene push-down

* Added more required_features

* Fix point contains non-point

* Fix point contains point

* Re-enable lucene push-down in tests too

Forgot to change the physical planner unit tests after re-enabling lucene push-down

* Generate automatic docs

* Use generated examples docs

* Generated examples use '-result' prefix (singular)

* Mark spatial functions as preview/experimental
2024-04-02 10:31:00 +02:00
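Note on 2380492fac: the commit avoids extra evaluators for "constant on the left" by rewriting `ST_CONTAINS(constant, field)` as `ST_WITHIN(field, constant)`. The Python sketch below shows why that rewrite preserves the result, using axis-aligned rectangles as a stand-in for real geometries. The `normalize` step and the tuple-based expression format are hypothetical, and the sketch ignores complications the commit mentions, such as rectangles crossing the dateline.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def contains(outer: Rect, inner: Rect) -> bool:
    """True if inner lies entirely inside outer."""
    return (outer.min_x <= inner.min_x and outer.min_y <= inner.min_y
            and outer.max_x >= inner.max_x and outer.max_y >= inner.max_y)


def within(inner: Rect, outer: Rect) -> bool:
    """WITHIN is CONTAINS with the arguments swapped."""
    return contains(outer, inner)


FLIP = {"contains": "within", "within": "contains"}


def normalize(op, left, right):
    """Move a constant from the left to the right, flipping the operator."""
    if left[0] == "const" and right[0] == "field":
        return FLIP[op], right, left
    return op, left, right


def evaluate(op, left, right, row):
    """Evaluate a (op, left, right) expression against one row."""
    def resolve(operand):
        return operand[1] if operand[0] == "const" else row[operand[1]]
    a, b = resolve(left), resolve(right)
    return contains(a, b) if op == "contains" else within(a, b)


big = Rect(0, 0, 10, 10)
original = ("contains", ("const", big), ("field", "shape"))
rewritten = normalize(*original)
```

After normalization only the "field on the left, constant on the right" shape needs an evaluator, which is the simplification the commit describes.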
Nik Everett
00b0c54a74
ESQL: Generate docs for the trig functions (#106891)
This updates the in-code docs on the trig functions to line up with the
docs, removes the docs, and uses the now mostly identical generated
docs. This means we only need to document these functions in one place -
right next to the code.
2024-03-29 12:24:31 -04:00
Ioana Tagirta
b85d4b1dbb
Fix typo in functions/README.md (#106870)
2024-03-28 14:02:47 +01:00
Luigi Dell'Aquila
3e406e2d57
ES|QL: Improve support for TEXT fields in functions (#106810)
Re-submitting https://github.com/elastic/elasticsearch/pull/106688 after
a revert due to a conflict after merge
2024-03-27 08:47:09 -04:00
Luigi Dell'Aquila
720188e95f
Revert "ES|QL: Improve support for TEXT fields in functions (#106688)"
This reverts commit 62e3e5fd1b.
2024-03-27 12:14:02 +01:00
Luigi Dell'Aquila
62e3e5fd1b
ES|QL: Improve support for TEXT fields in functions (#106688)
2024-03-27 12:08:10 +01:00
Nik Everett
d6d1edd529
ESQL: Fix typo in docs readme
s/and/are/
2024-03-25 09:22:20 -04:00
Nik Everett
35fcc9a29d
ESQL: Add README.md to docs (#106698)
This explains how to run the tests that build the docs. I tried to add
it in #106577 but the sync code deleted it. So I fixed that too.
2024-03-22 16:30:59 -04:00
Nik Everett
7c46c735e4
ESQL: Generate docs for ceil (#106616)
This replaces the hand maintained docs for `CEIL` with the docs
generated by the tests. There shouldn't be any diff in the generated
docs.
2024-03-21 13:01:15 -04:00
Nik Everett
fa00e6176f
ESQL: Values aggregation function (#106065)
This creates the `VALUES` aggregation function which buffers all field
values it receives and emits them as a multivalued field. It can use a
significant amount of memory and will circuit break if it uses too much
memory, but it's really useful for putting together self-join-like
behavior. It sort of functions as a stop-gap measure until we have more
self-join style things.

In the future we'll have spill-to-disk for aggregations and, likely,
some kind of self-join command for aggregations at least so this will be
able to grow beyond memory. But for now, memory it is.

Example:

```
  FROM employees
| EVAL first_letter = SUBSTRING(first_name, 0, 1)
| STATS first_name=VALUES(first_name) BY first_letter
| SORT first_letter
;

                                        first_name:keyword | first_letter:keyword
            [Anneke, Alejandro, Anoosh, Amabile, Arumugam] | A
[Bezalel, Berni, Bojan, Basil, Brendon, Berhard, Breannda] | B
                  [Chirstian, Cristinel, Claudi, Charlene] | C
                      [Duangkaew, Divier, Domenick, Danel] | D
```

I made this work for everything but `geo_point` and `cartesian_point`
because I'm not 100% sure how to integrate with those. We can grab those
in a follow up.

Closes #103600
2024-03-21 12:52:04 -04:00
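Note on fa00e6176f: the `VALUES` aggregation described in that commit buffers every value per group and emits each group's buffer as one multivalued result, failing once too much memory is used. The Python sketch below mimics that shape on a small in-memory table. It is not the Elasticsearch implementation: the `CircuitBreakingError` class, the `max_buffered` limit counted in values rather than bytes, and the choice to skip nulls are all assumptions made for illustration, and the sketch makes no attempt at deduplication or ordering guarantees.

```python
from collections import defaultdict


class CircuitBreakingError(RuntimeError):
    """Hypothetical stand-in for a memory circuit breaker tripping."""


def values_agg(rows, group_key, field, max_buffered=1000):
    """Collect every non-null value of `field` per group into a list."""
    groups = defaultdict(list)
    buffered = 0
    for row in rows:
        value = row.get(field)
        if value is None:  # assumption: nulls contribute nothing
            continue
        buffered += 1
        if buffered > max_buffered:
            raise CircuitBreakingError(f"more than {max_buffered} values buffered")
        groups[row[group_key]].append(value)
    return dict(groups)


names = ("Anneke", "Bezalel", "Alejandro", "Berni")
rows = [{"first_name": n, "first_letter": n[0]} for n in names]

if __name__ == "__main__":
    print(values_agg(rows, "first_letter", "first_name"))
```

Because the whole buffer lives in memory until the group is emitted, the limit check is the only protection against unbounded growth, which matches the trade-off the commit message describes.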
Nik Everett
34899069b6
ESQL: Generate a few more docs (#106577)
And improve the error message on csv test failures.
2024-03-21 10:51:35 -04:00
Nik Everett
a1305373f2
ESQL: Use generated docs for abs and acos (#106510)
2024-03-20 16:32:12 -04:00
Fang Xing
5d05d81854
[ES|QL] Remove variadic functions' optional args from the output of show functions (#106454)
* remove optional args from the output of show functions optionalArgs, argNames and argTypes for variadic functions
* consistent names for arguments
2024-03-20 10:15:49 -04:00
Nik Everett
1541da5e65
ESQL: Generate more docs (#106367)
This modifies the ESQL test infrastructure to generate more of the
documentation for functions. It generates the *Description* section, the
*Examples* section, and the *Parameters* section as separate files so we
can use them as needed. It also generates a `layout` file that's just
a guess as to how to render the whole thing. In some cases it'll work
and we can use that instead of hand maintaining a "top level"
description file for the function.

Most newly generated files are unused. We have to choose to pick them up
by replacing the sections we were manually maintaining with an include
of the generated section. Or by replacing the entire hand maintained
file with the generated top level file.

Relates to #104247
2024-03-19 15:40:13 -04:00
Craig Taverner
e14dd54ae9
Support ST_INTERSECTS between two geometry columns (#104907)
* Support ST_INTERSECTS between geometry column and other geometry or string

* Pushdown to lucene for ST_INTERSECTS on GEO_POINT

* Get geo_shape working in ST_INTERSECTS bypassing SingleValueQuery

* Initial work to support cartesian shape queries in ESQL

* Fixed CSV tests for combined ST_INTERSECTS and ST_CENTROID

* Fixed bug in point-in-shape query for CARTESIAN_POINT

* Added unit tests for SpatialIntersects and fixed a few bugs found

* Added comments to public ShapeQueryBuilder class

* Move calls to random() later to avoid security exception

* Refined type checking support in ST_INTERSECTS

Improved the combinations supported as preparation for removing the ugly try/catch way of detecting the difference between WKT and WKB in some code.

* Fixed bugs in incorrect use of doc-values in parameter type matching

Also made a few refinements, including removing one try/catch approach to differentiating between WKT and WKB.

* Removed second place where we used try/catch to differentiate WKT from WKB

This was a workaround for a mistake in the planning, where we incorrectly mapped incoming types to the wrong FieldEvaluators. We fixed that mistake in an earlier commit.

* Fixed flaky tests where GEO was treated as CARTESIAN

We assumed if the incoming types were constants, they had no CRS, even when they did, which was wrong. For shapes crossing the dateline this led to different (incorrect) behaviour.

* Fixed a flaky test by removing some point==point optimizations

* Moved spatial intersects to 'spatial' package

When we developed the ST_CENTROID work, this was requested, so let's do it here too.

* Use normal switch on enums

* Cleanup some static utility methods

Now all code paths that can convert a constant string to a geometry use the same code.

* Fixed bugs with non-quantized coordinates, and cleaned up code a little

* Fixed failing test after change to evaluator class names

* Refactored SpatialRelatesFunction into three files, and made evaluatorRules static

This was a general cleanup, making the code more organized, but did also achieve static evaluator rules so we don't re-create these on every query parsing.

* Fixed compile error after rebase

* Removed ConstantAndConstant support, using fold() correctly instead

* better error on circles

* Make sure compound predicates are supported in use-doc-values pushdown

* Testing ENRICH with ST_INTERSECTS

This required adding new data for an ENRICH index, and this data could be tested with a few other related tests, which were also added.

* Added missing mixed-cluster rules for testing only with 8.14

* Fixed some mixed-cluster issues where we failed to mark test for only 8.14

Also added an interesting polygon-polygon intersection case from real data.

* Fix flaky test where cartesian polygons were generated from geo

* Remove support for string literals in ST_INTERSECTS

* Fix failing tests after removing string support

* Removed unused code from previous string literal support (WKT parsing)

* Support case where both fields are points and doc-values

If we have an ST_INTERSECTS and an ST_CENTROID, the centroid asks to load the points as doc-values, and the ST_INTERSECTS needs to therefore support two doc-values points.

* Disallow more than one field from doc-values for ST_INTERSECTS

* Remove unused evaluator classes

* Add tests for multiple doc-values if not in same intersects

* Fix errors after rebase on main

* Fixed bug in missing support for spatial function expressions in EVAL

When a spatial aggregate expects doc-values, this was not being communicated to spatial functions in EVAL, only in WHERE.

* Reduce flaky tests when reading directly from enrich source indices

The test framework does not expect enrich source indices to be used directly in queries, leading to duplicated results on multi-node clusters, so we edit the queries to be less sensitive to this case.

* Fixed failing test

* Code style

* Fixed test file name and added function name annotation

* Added documentation for st_intersects

* Fixed failing show functions test

* Code review changes, notably simplifying the type resolution

* Fixed broken docs link
2024-03-19 17:58:37 +01:00