This adds support for running the bucket function over a date nanos field. Code-wise, this just delegates to DateTrunc, which already supports date nanos, so most of the PR is just tests and the auto-generated docs.
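For illustration, a grouping of this shape should now work over a `date_nanos` field (the index and field names here are hypothetical):

```esql
FROM sample_data_nanos
| STATS count = COUNT(*) BY hour = BUCKET(@timestamp, 1 hour)
```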
Resolves #118031
This PR adds support for ST_EXTENT_AGG aggregation, i.e., computing a bounding box over a set of points/shapes (Cartesian or geo). Note the difference between this aggregation and the already implemented scalar function ST_EXTENT.
This isn't a very efficient implementation, and future PRs will attempt to read these extents directly from the doc values.
We currently always use longitude wrapping, i.e., we may wrap around the dateline for a smaller bounding box. Future PRs will let the user control this behavior.
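As a sketch, with a hypothetical index `airports` holding a geo_point field `location`, the new aggregation can be invoked like this:

```esql
FROM airports
| STATS extent = ST_EXTENT_AGG(location)
```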
Fixes #104659.
Make the dependency checker for query plans take binary plans into account, and make sure that fields required from the left-hand side are actually obtained from there (and analogously for the right-hand side).
Resolves #116281
Introduces support for comparing millisecond dates with nanosecond dates, without the need for casting. Millisecond dates outside of the nanosecond date range are handled correctly.
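As a sketch (field names hypothetical), a millisecond `datetime` field can now be compared directly against a `date_nanos` field:

```esql
FROM sample_data
| WHERE millis_timestamp <= nanos_timestamp
```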
Resolves #109995
This adds support and tests for addition and subtraction of date nanos with periods and durations. It does not include support for date_diff, which is a separate ticket (#109999). The bulk of the PR is testing; the actual date math is all handled by library functions.
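For example (index and field names hypothetical), arithmetic like the following should now work on `date_nanos` values:

```esql
FROM sample_data_nanos
| EVAL tomorrow = nanos_timestamp + 1 day, an_hour_ago = nanos_timestamp - 1 hour
```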
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Re-implement `CATEGORIZE` in a way that works for multi-node clusters.
This requires a first pass in which the data is categorized on each data node; the categorizers from each data node are then merged on the coordinator node, and previously categorized rows are re-categorized.
BlockHashes, used in HashAggregations, already work in a very similar way. E.g. for queries like `... | STATS ... BY field1, field2` they map values for `field1` and `field2` to unique integer ids that are then passed to the actual aggregate functions to identify which "bucket" a row belongs to. When passed from the data nodes to the coordinator, the BlockHashes are also merged to obtain unique ids for every value in `field1, field2` that is seen on the coordinator (not only on the local data nodes).
Therefore, we re-implement `CATEGORIZE` as a special BlockHash.
To choose the correct BlockHash when a query plan is mapped to physical operations, the `AggregateExec` query plan node needs to know that we will be categorizing the field `message` in a query containing `... | STATS ... BY c = CATEGORIZE(message)`. For this reason, _we do not extract the expression_ `c = CATEGORIZE(message)` into an `EVAL` node, in contrast to e.g. `STATS ... BY b = BUCKET(field, 10)`. The expression `c = CATEGORIZE(message)` simply remains inside the `AggregateExec`'s groupings.
**Important limitation:** For now, a `STATS` command using `CATEGORIZE` can have only one grouping overall, and it must be the `CATEGORIZE` itself.
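As an illustration of the shape of query this now supports across multiple nodes (index and field names hypothetical):

```esql
FROM logs
| STATS count = COUNT(*) BY category = CATEGORIZE(message)
```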
This enables date nanos support as tech preview. Basic operations, like reading values, binary comparisons, and functions that don't care about type should work, but some functions are not yet supported. Most notably, Bucket is not yet supported, although Date_Trunc is and can be used for grouping. See the docs for the full list of limitations.
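For instance (names hypothetical), grouping a `date_nanos` field by time still works via `DATE_TRUNC`, even though `BUCKET` is not yet supported:

```esql
FROM sample_data_nanos
| STATS count = COUNT(*) BY day = DATE_TRUNC(1 day, nanos_timestamp)
```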
Relates to #109352
While working on #110008 I discovered that the Date Trunc tests were only running in folding mode, because the interval types are marked as not representable. The correct way to test this is to set the forceLiteral flag for those fields, which will (as the name suggests) force them to be literals even in non-folding tests.
Doing that turned up errors in the evaluatorToString tests, which I fixed. There are two big changes here. First, the second parameter to the evaluator is a Rounding instance, not the actual interval. Since Rounding includes some information about the specific rounding in the toString results, I am just using a starts-with matcher to validate the majority of the string, rather than trying to reconstruct the expected rounding string. Second, passing in a literal null for the interval parameter folds the whole expression to null, and thus produces a completely different toString. I added a clause in AnyNullIsNull to account for this.
While I was in there, I moved some specific test cases to a different file. I know moving code is something we're trying to minimize right now, but this seemed worth it. The tests in question do not depend on the parameters of the test case, but all methods in the class get run for every set of parameters. This was causing these tests to be run many times with the same values, which bloats our test run time and test count. Moving them to a distinct class means they'll only be executed once per test run. I feel like this benefit outweighs the cost of git history complexity.
Now that the match and qstr functions are in tech preview, we should add them to the top-level functions doc page.
Co-authored-by: Craig Taverner <craig@amanzi.com>
Always return `KEYWORD` for functions that previously returned `TEXT`, because any change to the value, no matter how small, is enough to render meaningless the original analyzer associated with the `TEXT` field value. In principle, if the attribute is no longer the original `FieldAttribute`, it can no longer claim to have the type `TEXT`.
This has been done for all functions: conversion functions, aggregating functions, multi-value functions. Several already produced `KEYWORD` for `TEXT` input (e.g. ToString, FromBase64 and ToBase64, MvZip, ToLower, ToUpper, DateFormat, Concat, Left, Repeat, Replace, Right, Split, Substring), but many others incorrectly claimed to produce `TEXT`. This PR makes that strict, and includes changes to the functions' unit tests so that the tests no longer expect any function's output to be `TEXT`.
One side effect of this change is that functions requiring all of their parameters to have the same type will now treat TEXT and KEYWORD the same. This was already the case for functions like `Concat`, but is now also the case for `Greatest`, `Least`, `Case`, `Coalesce` and `MvAppend`.
An associated change is that the type casting operator `::text` has been entirely removed. It used to map onto the `ToString` function, which returned type KEYWORD, and so `::text` really produced a `KEYWORD`, which was a lie, or at least a bug, and is now fixed. Should we ever wish to actually produce real `TEXT`, we might love the fact that this operator has been freed up for future use (although it seems likely that function will require parameters to specify the analyzer, so it might never be an operator again).
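To illustrate the new behavior (index and field names hypothetical): even when `title` is a `TEXT` field, the output column below is now typed `keyword`:

```esql
FROM books
| EVAL title_or_unknown = COALESCE(title, "unknown")
| KEEP title_or_unknown
```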
### Backwards compatibility issues:
This is a change that will fail BWC tests, since we have many tests that assert on `TEXT` output from functions. For this reason we needed to block two scenarios:
* We used the capability `functions_never_emit_text` to prevent 7 csv-spec tests and 2 yaml tests from being run against older versions that still emit text.
* We used `skipTest` to also block those two yaml tests from being run against the latest build, but using older yaml files downloaded (as far back as 8.14).
In all cases the change observed in these tests was simply that the result columns no longer have `text` type, and instead are `keyword`.
---------
Co-authored-by: Luigi Dell'Aquila <luigi.dellaquila@gmail.com>
`MV_SLICE` is useful, but loading values from Lucene frequently sorts them, so `MV_SLICE` is not as useful as you might think. It's mostly for use after, say, a `SPLIT`. This documents that and adds a link to the section on multivalues.
It also moves similar guidance into a separate paragraph in the docs for easier reading.
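A sketch of the intended use, where the ordering of the values is known because they come from `SPLIT` rather than from Lucene:

```esql
ROW csv = "a,b,c,d"
| EVAL parts = SPLIT(csv, ",")
| EVAL first_two = MV_SLICE(parts, 0, 1)
// first_two == ["a", "b"]
```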
While working on Date Nanos, I noticed that Least and Greatest didn't have support for datetime. This PR corrects that and adds tests for it.
It seems to me that `resolveType()` is doing the wrong thing for these functions, as it accepts types that then do not have evaluator mappings, but refactoring that seems out of scope right now.
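As a sketch (field names hypothetical) of what now resolves and evaluates correctly:

```esql
FROM events
| EVAL last_touched = GREATEST(created_at, updated_at)
```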
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Resolves #111842
This adds a conversion function that yields DATE_NANOS. Mostly this is straightforward.
It is worth noting that when converting a millisecond date into a nanosecond date, the conversion function truncates it to 0 nanoseconds (i.e. the first nanosecond of that millisecond). This is, of course, a bit of an assumption, but I don't have a better one we can make. I'd thought about adding a second, optional parameter to control this behavior, but it's important that TO_DATE_NANOS extend AbstractConvertFunction, which itself extends UnaryScalarFunction, so that it will work correctly with union types. Also, it's unlikely the user will have any better guess than we do for filling in the nanoseconds.
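For example, converting a millisecond date yields the first nanosecond of that millisecond (the rendered value in the comment follows from the truncation described above):

```esql
ROW millis = TO_DATETIME("2023-03-23T12:15:03.360")
| EVAL nanos = TO_DATE_NANOS(millis)
// nanos is 2023-03-23T12:15:03.360000000Z
```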
Making that assumption does, however, create some weirdness. Consider two comparisons:
`TO_DATETIME("2023-03-23T12:15:03.360103847") == TO_DATETIME("2023-03-23T12:15:03.360")` will return true, while `TO_DATE_NANOS("2023-03-23T12:15:03.360103847") == TO_DATE_NANOS("2023-03-23T12:15:03.360")` will return false. This is akin to casting between longs and doubles, where things may compare equal in one type that are not equal in the other. This seems fine, and I can't think of a better way to do it, but it's worth being aware of.
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This corrects/switches the "year" unit diffing from the current integer subtraction to a chrono subtraction. Consequently, two dates are now (at least) one year apart only if (at least) a full calendar year separates them. The previous implementation simply subtracted the year parts of the dates.
Note: this parts ways with ES SQL's implementation of the same function, which itself is aligned with MS SQL's implementation, which works equivalently to an integer subtraction.
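A concrete illustration, with two dates from adjacent calendar years that are less than a full year apart:

```esql
ROW d1 = TO_DATETIME("2023-12-31T00:00:00Z"), d2 = TO_DATETIME("2024-01-01T00:00:00Z")
| EVAL years = DATE_DIFF("year", d1, d2)
// previously 1 (integer subtraction of the year parts, 2024 - 2023); now 0
```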
Fixes #112482.