Commit graph

26 commits

Author SHA1 Message Date
Tobias Stadler
e3deacf547
[DOCS] Fix typos (#83895) 2022-02-15 12:42:17 -05:00
Bogdan Pintea
ca76121ac0
List unsigned_long type as supported SQL type (#83480)
Add unsigned_long as an (experimentally) supported SQL type and its
limitations.
2022-02-04 12:14:56 +01:00
James Rodewig
f56a0f4b66
[DOCS] Remove testenv annotations from doc snippet tests (#80023)
Removes `testenv` annotations and related code. These annotations originally let you skip x-pack snippet tests in the docs. However, that's no longer possible.

Relates to #79309, #31619
2021-11-05 18:38:50 -04:00
Andrei Stefan
bf1b7a36b5
SQL: Adapt the limitations page to the new "fields" API usage (#69616)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>
2021-03-02 17:19:32 +02:00
Bogdan Pintea
638402c387
Abort sorting in case of local agg sort queue overflow (#65687)
In case the local agg sorter queue gets full and no limit has been provided,
the local sorter will now erroneously call the failure callback for every
single row in the original rowset that's left over the local queue limit
(instead for just the first one).  The failure response is dispatched in any
case, so this is relatively harmless.  The sorter continues iterating on the
original response fetching subsequent pages. In case of correct Elasticsearch
behaviour, this is also harmless, it'll just trigger a number of internal
exceptions. However, in case of a pagination defect in Elasticsearch (like
GH#65685, where the same search_after is returned), this will result in an
effective spin loop, potentially rendering eventually the node unresponsive.

This PR simply breaks both the inner loop iterating over the current unsorted
rowset, as well as the outer one, iterating over the left pages.

It also fixes an outdated documentation limitation.
2020-12-03 19:19:15 +01:00
James Rodewig
2774cd6938
[DOCS] Swap [float] for [discrete] (#60124)
Changes instances of `[float]` in our docs for `[discrete]`.

Asciidoctor prefers the `[discrete]` tag for floating headings:
https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks
2020-07-23 11:48:22 -04:00
Marios Trivyzas
506d1beea7
SQL: Implement scripting inside aggs (#55241)
Implement the use of scalar functions inside aggregate functions.
This allows for complex expressions inside aggregations, with or without
GROUBY as well as with or without a HAVING clause. e.g.:

```
SELECT MAX(CASE WHEN a IS NULL then -1 ELSE abs(a * 10) + 1 END) AS max, b
FROM test
GROUP BY b
HAVING MAX(CASE WHEN a IS NULL then -1 ELSE abs(a * 10) + 1 END) > 5
```

Scalar functions are still not allowed for `KURTOSIS` and `SKEWNESS` as
this is currently not implemented on the ElasticSearch side.

Fixes: #29980
Fixes: #36865
Fixes: #37271
2020-04-17 11:22:06 +02:00
Marios Trivyzas
b847742b23
SQL: [Docs] Fix typo
Add missing closing "`".

Follows: 78a1185549
2020-02-12 21:48:36 +01:00
Marios Trivyzas
78a1185549
SQL: [Docs] Add limitation for sorting on aggs (#52210)
Add a section to point out that when ordering by an aggregate
only plain aggregate functions are allowed, no scalars/operators
can be used on top of them.

Fixes: #52204
2020-02-12 12:54:42 +01:00
Marios Trivyzas
f41efd6753
SQL: Fix ORDER BY YEAR() function (#51562)
Previously, if YEAR() was used as and ORDER BY argument without being
wrapped with another scalar (e.g. YEAR(birth_date) + 10), no script
ordering was used but instead the underlying field (e.g. birth_date)
was used instead as a performance optimisation. This works correctly if
YEAR() is the only ORDER BY arg but if further args are used as tie
breakers for the ordering wrong results are produced. This is because
2 rows with the different birth_date but on the same year are not tied
as the underlying ordering is on birth_date and not on the
YEAR(birth_date), and the following ORDER BY args are ignored.

Remove this optimisation for YEAR() to avoid incorrect results in
such cases.

As a consequence another bug is revealed: scalar functions on top
of nested fields produce scripted sorting/filtering which is not yet
supported. In such cases no error was thrown but instead all values for
such nested fields were null and were passed to the script implementing
the sorting/filtering, producing incorrect results.

Detect such cases and throw a validation exception.

Fixes: #51224
2020-01-30 14:48:34 +01:00
Bogdan Pintea
a55b36065e
SQL:Docs: add the PIVOT clause to SELECT section (#49129)
The PR adds the documentation on the PIVOT clause.
2019-11-20 16:53:16 +01:00
Andrei Stefan
8bf8a055e3
Switch from using docvalue_fields to extracting values from _source (#44062)
* Switch from using docvalue_fields to extracting values from _source
where applicable. Doing this means parsing the _source and handling the
numbers parsing just like Elasticsearch is doing it when it's indexing
a document.
* This also introduces a minor limitation: aliases type of fields that
are NOT part of a tree of sub-fields will not be able to be retrieved
anymore. field_caps API doesn't shed any light into a field being an
alias or not and at _source parsing time there is no way to know if a
root field is an alias or not. Fields of the type "a.b.c.alias" can be
extracted from docvalue_fields, only if the field they point to can be
extracted from docvalue_fields. Also, not all fields in a hierarchy of
fields can be evaluated to being an alias.
2019-07-24 13:59:02 +03:00
Marios Trivyzas
079e012fde
SQL: Increase hard limit for sorting on aggregates (#43220)
To be consistent with the `search.max_buckets` default setting,
set the hard limit of the PriorityQueue used for in memory sorting,
when sorting on an aggregate function, to 10000.

Fixes: #43168
2019-06-14 13:26:18 +02:00
Igor Motov
0b94416cc1
SQL: Add initial geo support (#42031)
Adds an initial limited implementations of geo features to SQL. This implementation is based on the [OpenGIS® Implementation Standard for Geographic information - Simple feature access](http://www.opengeospatial.org/standards/sfs), which is the current standard for GIS system implementation. This effort is concentrate on SQL option AKA ISO 19125-2. 

## Queries that are supported as a result of this initial implementation

###  Metadata commands

- `DESCRIBE table`  - returns the correct column types `GEOMETRY` for geo shapes and geo points.
- `SHOW FUNCTIONS` - returns a list that includes supported `ST_` functions
- `SYS TYPES` and `SYS COLUMNS` display correct types `GEO_SHAPE` and `GEO_POINT` for geo shapes and geo points accordingly. 

### Returning geoshapes and geopoints from elasticsearch

- `SELECT geom FROM table` - returns the geoshapes and geo_points as libs/geo objects in JDBC or as WKT strings in console.
- `SELECT ST_AsWKT(geom) FROM table;` and `SELECT ST_AsText(geom) FROM table;`- returns the geoshapes ang geopoints in their WKT representation;

### Using geopoints to elasticsearch

- The following functions will be supported for geopoints in queries, sorting and aggregations: `ST_GeomFromText`, `ST_X`, `ST_Y`, `ST_Z`, `ST_GeometryType`, and `ST_Distance`. In most cases when used in queries, sorting and aggregations, these function are translated into script. These functions can be used in the SELECT clause for both geopoints and geoshapes. 
- `SELECT * FROM table WHERE ST_Distance(ST_GeomFromText(POINT(1 2), point) < 10;` - returns all records for which `point` is located within 10m from the `POINT(1 2)`. In this case the WHERE clause is translated into a range query.

## Limitations:

Geoshapes cannot be used in queries, sorting and aggregations as part of this initial effort. In order to fully take advantage of geoshapes we would need to have access to geoshape doc values, which is coming in #37206. `ST_Z` cannot be used on geopoints in queries, sorting and aggregations since we don't store altitude in geo_point doc values.

Relates to #29872
2019-05-13 22:17:10 -04:00
Marios Trivyzas
091d43c494
SQL: Remove CircuitBreaker from parser (#41835)
The CircuitBreaker was introduced as means of preventing a
`StackOverflowException` during the build of the AST by the parser.

The ANTLR4 grammar causes a weird behaviour for a Parser Listener.
The `enterEveryRule()` method is often called with a different parsing
context than the respective `exitEveryRule()`. This makes it difficult
to keep track of the tree's depth, and a custom Map was used as an
attempt of matching the contextes as they are encounter during `enter`
and during `exit` of the rules.

This approach had 2 important drawbacks:
1. It's hard to maintain this custom Map as the grammar changes.
2. The CircuitBreaker could often lead to false positives which caused
valid queries to return an Exception and prevent them from executing.

So, this removes completely the CircuitBreaker which is replaced be
a simple handling of the `StackOverflowException`

Fixes: #41471
2019-05-07 19:06:06 -04:00
James Rodewig
adf67053f4
[DOCS] Add anchors for Asciidoctor migration (#41648) 2019-04-30 10:19:09 -04:00
Marios Trivyzas
046ccd4cf0
SQL: Introduce SQL TIME data type (#39802)
Support ANSI SQL's TIME type by introductin a runtime-only
ES SQL time type.

Closes: #38174
2019-04-01 23:30:39 +02:00
Costin Leau
d0f60b4425
SQL: Spec tests now use classpath discovery (#40388)
To avoid having to specify each spec by hand (which can miss specs to be
added), the test infrastructure now performs classpath discovery so that
each spec added, is automatically considered.

Relates #40358
2019-03-25 15:22:59 +02:00
Costin Leau
2b35157196
SQL: Add multi_value_field_leniency inside FieldHitExtractor (#40113)
For cases where fields can have multi values, allow the behavior to be
customized through a dedicated configuration field.
By default this will be enabled on the drivers so that existing datasets
work instead of throwing an exception.
For regular SQL usage, the behavior is false so that the user is aware
of the underlying data.

Fix #39700
2019-03-18 14:56:00 +02:00
Andrei Stefan
9e0df64b2d
SQL: ignore UNSUPPORTED fields for JDBC and ODBC modes in 'SYS COLUMNS' (#39518)
* SYS COLUMNS will skip UNSUPPORTED field types in ODBC and JDBC, as well.
NESTED and OBJECT types were already skipped in ODBC mode, now they are
skipped in JDBC mode, as well.
2019-03-01 15:23:15 +02:00
Costin Leau
fb6e7a30c9
SQL: remove beta marker from documentation (#38661) 2019-02-10 00:09:03 +02:00
Costin Leau
783c9ed372
SQL: Allow sorting of groups by aggregates (#38042)
Introduce client-side sorting of groups based on aggregate
functions. To allow this, the Analyzer has been extended to push down
to underlying Aggregate, aggregate function and the Querier has been
extended to identify the case and consume the results in order and sort
them based on the given columns.
The underlying QueryContainer has been slightly modified to allow a view
of the underlying values being extracted as the columns used for sorting
might not be requested by the user.

The PR also adds minor tweaks, mainly related to tree output.

Close #35118
2019-02-02 01:38:25 +02:00
Marios Trivyzas
19dccf8f3e
SQL: [Docs] Add limitation for aggregate functions on scalars (#38186)
Currently aggregate functions can operate only directly on fields.
They cannot be used on top of scalar functions as painless scripting
is currently not supported.
2019-02-01 16:13:51 +02:00
Marios Trivyzas
4710a7472f
SQL: Implement FIRST/LAST aggregate functions (#37936)
FIRST and LAST can be used with one argument and work similarly to MIN
and MAX but they are implemented using a Top Hits aggregation and
therefore can also operate on keyword fields. When a second argument is
provided then they return the first/last value of the first arg when its
values are ordered ascending/descending (respectively) by the values of
the second argument. Currently because of the usage of a Top Hits
aggregation FIRST and LAST cannot be used in the HAVING clause of a
GROUP BY query to filter on the results of the aggregation.

Closes: #35639
2019-01-31 16:33:05 +02:00
Andrei Stefan
39a072389c
SQL: add sub-selects to the Limitations page (#37012) 2019-01-07 10:08:51 +02:00
Andrei Stefan
09fa827adc
SQL: documentation improvements and updates (#36918)
* Added Limitations page
* Made the aggregations page follow the common template for functions
* Modified all tables to have the first row's cells content centered
* Polishing in other various sections
2018-12-21 23:25:54 +02:00