Here we add support for the following two ESQL functions:
* LTRIM: remove leading spaces from a string
* RTRIM: remove trailing spaces from a string
We also fix an issue with the handling of unicode white spaces. We
make use of unicode code points to identify unicode whitespace
characters instead of relying on ASCII codes.
Moreover, iterating bytes in a Unicode string needs to consider
that some Unicode characters are encoded using multiple bytes.
* Sqrt function for ESQL
Introduces a unary scalar function for square root, which is a thin
wrapper over the Java.Math implementation.
* Fix area for ESQL integration changelog.
* Restore changelog.
* Restore area in changelog.
This adds the `to_degrees` and `to_radians` functions. It uses the
"convert" function framework because that just felt right - these
convert between radians and degrees after all.
This adds support for numeric fields to `auto_bucket` and adds a new
`floor` function to round numeric down to the nearest integer. That
function is exposed because it's probably useful. I added it in this PR
because `auto_bucket` uses it as an implementation detail as well.
To implement this we:
* Cast both arguments to double
* Perform integer and long validation on the double results before casting back to integer or long
* Perform a special case validation for exponent==1
* Any validation failures result in ArithmeticException, which is caught and added to warnings
Introduces a unary scalar function for base 10 log, which is a thin
wrapper over the Java.Math implementation
---------
Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>
This adds support for the `unsigned_long` type.
The type can be now used with the defined math function, both scalar and
MV'ed, arithmetic and binary comparison ones.
The `to_unsigned_long()` conversion function is also added.
This implements the `MV_DEDUPE` function that removes duplicates from
multivalues fields. It wasn't strictly in our list of things we need in
the first release, but I'm grabbing this now because I realized I needed
very similar infrastructure when I was trying to build grouping by
multivalued fields. In fact, I realized that I could use our
stringtemplate code generation to generate most of the complex parts.
This generates the actual body of `MV_DEDUPE`'s implementation and the
body of the `Block` accepting `BlockHash` implementations. It'll be
useful in the final step for grouping by multivalued fields.
I also got pretty curious about whether the `O(n^2)` or `O(n*log(n))`
algorithm for deduplication is faster. I'd been assuming that for all
reasonable sized inputs the `O(n^2)` bubble sort looking selection
algorithm was faster. So I measured it. And it's mostly true - even for
`BytesRef` if you have a dozen entries the selection algorithm is
faster. Lower overhead and stuff. Anyway, to measure it I had to
implement the copy-and-sort `O(n*log(n))` algorithm. So while I was
there I plugged it in and selected it in cases where the number of
inputs is large and the selection alogorithm is likely to be slower.
Adds an `mv_join` function that joins together multivalue string fields.
You can combine this with out fancy new `to_string` to join together any
multivalued fields into a string.
This adds a `mv_median` function that converts a multivalued field into
a single valued field by picking the median. If there are an even number
of values we return the average of the middle two numbers. If the input
type is `int` or `long` then the average rounds *down*.