This adds tests, supported types, and a signature image for `to_string`
and `to_version`. It also fixes the resolution of functions whose names
contain an `_`.
Finally, it updates the docs for `ceil` to render the image more nicely.
Add the 'right' function, which extracts a substring beginning from its
right end (opposite function of 'left').
---------
Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
CI will skip building them. Lots of CI machines don't have font support
so they can't generate these. But all local machines have a GUI so they
can.
Also, super-lazily initialize the font so CI doesn't bump into it by
accident.
Closes #99018
@nik9000 I re-checked out the main branch and refactored the 'left'
function to cut the prefix string in place. However, I ran into a
problem: 'left' now fails the test case 'testEvaluateInManyThreads'. In
the multi-threaded case, in ` EvalOperator.ExpressionEvaluator eval =
evalSupplier.get(); for (int c = 0; c < count; c++) {
assertThat(toJavaObject(eval.eval(page), 0), testCase.getMatcher()); } `
the `toJavaObject` function returns a BytesRef with length=2 and content
[81,89]. However, the `assertThat` function in JUnit 4 receives a
BytesRef whose length is 10. Can you give me some clues? I can't find
which variable is being shared between threads.
Command to rerun the failed test case: `gradlew ':x-pack:plugin:esql:test'
--tests
"org.elasticsearch.xpack.esql.expression.function.scalar.string.LeftTests.testEvaluateInManyThreads
{TestCase=Left basic test}" -Dtests.seed=44459C172243712
-Dtests.locale=lv-LV -Dtests.timezone=Asia/Irkutsk -Druntime.java=20`
Add the unary scalar function CEIL.
Analogously to FLOOR, it rounds up its argument.
- Implement CEIL, add it to the function registry and make sure it is serializable.
- Add csv tests, unit tests and docs.
- Add additional csv tests with different data types and some edge cases for both CEIL and FLOOR
- Add unit tests and update docs for FLOOR.
Locks the railroad diagrams to always use the same font, `roboto mono`.
This makes sure that when we render the railroad diagrams we always size
them the same way. Everyone has a copy of roboto mono because gradle
resolves it as a dependency.
This generates a "railroad diagram" svg image that can be embedded into
the docs for any function to explain its syntax. It's basic, but it's
something we can iterate on.
It also generates a table of supported types from the list of types that
we test. It can be included in the docs for reference as well.
Here we add support for the following two ESQL functions:
* LTRIM: remove leading spaces from a string
* RTRIM: remove trailing spaces from a string
We also fix an issue with the handling of unicode white spaces. We
make use of unicode code points to identify unicode whitespace
characters instead of relying on ASCII codes.
Moreover, iterating bytes in a Unicode string needs to consider
that some Unicode characters are encoded using multiple bytes.
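
As a rough illustration of the two points above, here is a minimal sketch that trims whitespace from a UTF-8 byte array by decoding whole code points instead of testing single bytes. It is not the code from this change: the class and method names are hypothetical, it assumes the input is valid UTF-8, and it uses `Character.isWhitespace(int)` as the whitespace test, which may not match the exact set of characters the real functions treat as whitespace.

```java
import java.util.Arrays;

// Hypothetical sketch, not the ESQL implementation. Assumes valid UTF-8 input.
final class UnicodeTrimSketch {

    // Length in bytes of the UTF-8 sequence that starts with this lead byte.
    static int sequenceLength(byte lead) {
        int b = lead & 0xFF;
        if (b < 0x80) return 1;          // 0xxxxxxx
        if ((b >> 5) == 0b110) return 2; // 110xxxxx
        if ((b >> 4) == 0b1110) return 3; // 1110xxxx
        return 4;                        // 11110xxx
    }

    // Decode the code point stored in utf8[offset, offset + len).
    static int decode(byte[] utf8, int offset, int len) {
        int b0 = utf8[offset] & 0xFF;
        switch (len) {
            case 1:
                return b0;
            case 2:
                return ((b0 & 0x1F) << 6) | (utf8[offset + 1] & 0x3F);
            case 3:
                return ((b0 & 0x0F) << 12)
                    | ((utf8[offset + 1] & 0x3F) << 6)
                    | (utf8[offset + 2] & 0x3F);
            default:
                return ((b0 & 0x07) << 18)
                    | ((utf8[offset + 1] & 0x3F) << 12)
                    | ((utf8[offset + 2] & 0x3F) << 6)
                    | (utf8[offset + 3] & 0x3F);
        }
    }

    // Remove leading whitespace, advancing one whole code point at a time.
    static byte[] ltrim(byte[] utf8) {
        int start = 0;
        while (start < utf8.length) {
            int len = sequenceLength(utf8[start]);
            if (!Character.isWhitespace(decode(utf8, start, len))) break;
            start += len;
        }
        return Arrays.copyOfRange(utf8, start, utf8.length);
    }

    // Remove trailing whitespace. Walk back over continuation bytes
    // (10xxxxxx) to find the lead byte of the last code point.
    static byte[] rtrim(byte[] utf8) {
        int end = utf8.length;
        while (end > 0) {
            int start = end - 1;
            while (start > 0 && (utf8[start] & 0xC0) == 0x80) start--;
            if (!Character.isWhitespace(decode(utf8, start, end - start))) break;
            end = start;
        }
        return Arrays.copyOfRange(utf8, 0, end);
    }
}
```

With this approach a three-byte character such as U+2003 (EM SPACE) is removed as one unit, and none of its bytes can be mistaken for an ASCII character.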
* Sqrt function for ESQL
Introduces a unary scalar function for square root, which is a thin
wrapper over the `java.lang.Math` implementation.
* Fix area for ESQL integration changelog.
* Restore changelog.
* Restore area in changelog.
This adds the `to_degrees` and `to_radians` functions. It uses the
"convert" function framework because that just felt right - these
convert between radians and degrees after all.
This adds support for numeric fields to `auto_bucket` and adds a new
`floor` function to round numbers down to the nearest integer. That
function is exposed because it's probably useful. I added it in this PR
because `auto_bucket` uses it as an implementation detail as well.
To implement this we:
* Cast both arguments to double
* Perform integer and long validation on the double results before casting back to integer or long
* Perform a special case validation for exponent==1
* Any validation failures result in ArithmeticException, which is caught and added to warnings
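
The steps above could look roughly like the sketch below for the `long` case. It is only an illustration under assumptions: the mention of an exponent suggests a power function, so that is what is modelled here; the class name, the `warnings` list, returning `null` on failure, and treating exponent==1 as "return the base unchanged to avoid the lossy round trip through double" are all guesses, not something stated above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the listed steps for long arguments.
final class PowSketch {

    // Assumed stand-in for wherever warnings are really collected.
    static final List<String> warnings = new ArrayList<>();

    // Computes base^exponent, throwing if the result does not fit in a long.
    static long powChecked(long base, long exponent) {
        if (exponent == 1) {
            // Assumed meaning of the special case: a long above 2^53 would
            // lose precision when cast to double and back.
            return base;
        }
        // Cast both arguments to double and do the math there.
        double result = Math.pow((double) base, (double) exponent);
        // Validate on the double before casting back. 0x1p63 is 2^63,
        // the first value above Long.MAX_VALUE.
        if (Double.isNaN(result) || result >= 0x1p63 || result < -0x1p63) {
            throw new ArithmeticException("long overflow");
        }
        return (long) result;
    }

    // Catches the validation failure, records a warning, and returns null.
    static Long pow(long base, long exponent) {
        try {
            return powChecked(base, exponent);
        } catch (ArithmeticException e) {
            warnings.add(e.getMessage());
            return null;
        }
    }
}
```

An `int` variant would follow the same shape with the bounds check done against the `int` range instead.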
Introduces a unary scalar function for base 10 log, which is a thin
wrapper over the `java.lang.Math` implementation.
---------
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
This adds support for the `unsigned_long` type.
The type can now be used with the defined math functions, both scalar
and multivalued, as well as with arithmetic and binary comparison
operations.
The `to_unsigned_long()` conversion function is also added.
This implements the `MV_DEDUPE` function that removes duplicates from
multivalued fields. It wasn't strictly in our list of things we need in
the first release, but I'm grabbing this now because I realized I needed
very similar infrastructure when I was trying to build grouping by
multivalued fields. In fact, I realized that I could use our
stringtemplate code generation to generate most of the complex parts.
This generates the actual body of `MV_DEDUPE`'s implementation and the
body of the `Block` accepting `BlockHash` implementations. It'll be
useful in the final step for grouping by multivalued fields.
I also got pretty curious about whether the `O(n^2)` or `O(n*log(n))`
algorithm for deduplication is faster. I'd been assuming that for all
reasonably sized inputs the `O(n^2)` bubble-sort-looking selection
algorithm was faster. So I measured it. And it's mostly true - even for
`BytesRef` if you have a dozen entries the selection algorithm is
faster. Lower overhead and stuff. Anyway, to measure it I had to
implement the copy-and-sort `O(n*log(n))` algorithm. So while I was
there I plugged it in and selected it in cases where the number of
inputs is large and the selection algorithm is likely to be slower.
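
To make the two strategies concrete, here is a minimal hand-written sketch over `long` values. It is not the generated code: the class name and the cutoff of 16 are made up for illustration, and the real threshold was chosen by measurement.

```java
import java.util.Arrays;

// Hypothetical sketch of the two deduplication strategies.
final class DedupeSketch {

    // Assumed cutoff; the real value is not stated here.
    static final int SORT_THRESHOLD = 16;

    // O(n^2): compare each value against the ones already kept.
    // Low overhead, keeps first-occurrence order.
    static long[] dedupeSelection(long[] values) {
        long[] out = new long[values.length];
        int size = 0;
        outer:
        for (long v : values) {
            for (int i = 0; i < size; i++) {
                if (out[i] == v) continue outer; // already seen
            }
            out[size++] = v;
        }
        return Arrays.copyOf(out, size);
    }

    // O(n*log(n)): copy, sort, then drop adjacent duplicates.
    // The result comes back sorted.
    static long[] dedupeCopyAndSort(long[] values) {
        if (values.length == 0) return new long[0];
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int size = 1;
        for (int i = 1; i < sorted.length; i++) {
            if (sorted[i] != sorted[size - 1]) {
                sorted[size++] = sorted[i];
            }
        }
        return Arrays.copyOf(sorted, size);
    }

    // Pick the strategy based on the number of inputs.
    static long[] dedupe(long[] values) {
        return values.length < SORT_THRESHOLD
            ? dedupeSelection(values)
            : dedupeCopyAndSort(values);
    }
}
```

Note that in this sketch the two strategies return the surviving values in different orders, so a caller can't rely on the order of the result.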
Adds an `mv_join` function that joins together multivalue string fields.
You can combine this with our fancy new `to_string` to join together any
multivalued fields into a string.
This adds a `mv_median` function that converts a multivalued field into
a single valued field by picking the median. If there are an even number
of values we return the average of the middle two numbers. If the input
type is `int` or `long` then the average rounds *down*.
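
A minimal sketch of that rule for `long` and `double` values follows. The class name is hypothetical, and it reads "rounds down" as rounding toward negative infinity, which is an assumption; the `long` average is computed with shifts so that it cannot overflow.

```java
import java.util.Arrays;

// Hypothetical sketch of the median rule described above.
final class MedianSketch {

    // Median of a non-empty array of longs. For an even count the average
    // of the two middle values is rounded down (assumed: toward negative
    // infinity).
    static long median(long[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        long a = sorted[mid - 1];
        long b = sorted[mid];
        // floor((a + b) / 2) without overflowing on a + b.
        return (a >> 1) + (b >> 1) + (a & b & 1);
    }

    // Median of a non-empty array of doubles; no rounding needed.
    static double median(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("no values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        if (sorted.length % 2 == 1) {
            return sorted[mid];
        }
        return (sorted[mid - 1] + sorted[mid]) / 2;
    }
}
```

For example, the `long` inputs `[1, 2, 3, 4]` give 2 rather than 2.5, while the `double` inputs `[1.0, 2.0]` give 1.5.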