Fixes https://github.com/elastic/elasticsearch/issues/107733
Numeric overflow can produce warnings containing either `InvalidArgumentException` or
`QlIllegalArgumentException`, depending on the version of the node where
the query is executed.
* Make COUNT(constant) consistent.
* Add MEDIAN(const) and COUNT_DISTINCT(const).
* Fix wrong stats pushdown when multiple COUNT aggs are in the same STATS.
* Add a test class.
* Finish a test.
* Test timeout from params.
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This fixes queries that reuse, in the STATS aggregations part, expressions with
`BUCKET` declared in the STATS grouping part.
Ex: `| STATS BUCKET(salary, 1000.) + 1 BY BUCKET(salary, 1000.)`
This was failing because the agg BUCKET's `salary` reference is no longer
available in the synthetic EVAL generated on top of the aggregation, which
evaluates the "aggs" expression (the addition in the example above).
The hot threads API does not support a `?master_timeout` parameter, and
the `?timeout` parameter is not an ack timeout and defaults to an
infinite wait. This commit fixes the incorrect docs.
* Fix a test where pre-8.13 nodes in mixed cluster tests were sent a language version (added in 8.13.3).
* Skip a test for a fix introduced in 8.13.2 on older clusters.
This is mostly a refactoring but also a minor bug-fix.
The bug-fix avoids potentially many unnecessary realm cache invalidations when
the state of the .security index changes.
Backport of #107360
Fixes a bug in the async-search status endpoint where a user with monitor privileges
was not able to access the status endpoint when setting the keep_alive of the async search.
This will check and fail if certain functions would generate a result
exceeding a certain fixed byte size. This prevents a single operation/query
from failing the entire VM.
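A minimal sketch of such a guard, assuming a hypothetical limit and method name (the real checks live in the affected functions):
```java
// Illustrative fixed limit; the actual value is defined by the implementation.
static final long MAX_RESULT_BYTES = 1L << 20;

// Fail the query up front rather than attempting an allocation large enough
// to exhaust the heap and take down the whole node.
static void ensureResultFits(long estimatedBytes) {
    if (estimatedBytes > MAX_RESULT_BYTES) {
        throw new IllegalArgumentException(
            "result of [" + estimatedBytes + "] bytes would exceed the fixed limit of [" + MAX_RESULT_BYTES + "]"
        );
    }
}
```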
This page was split up in #104614 but the `ReferenceDocs` symbol still links
to the top-level page rather than the correct subpage. This fixes
the link.
Since #104614 the top-level repo troubleshooting page is just a short
paragraph which talks about "this page" but in fact refers to
information spread across a number of subsequent pages. It's not obvious
to the reader that they need to use the navigation menu to get to the
information they seek. Moreover we link to this page from an exception
message today so there's a reasonable chance that users will find it
when trying to troubleshoot a genuine problem.
This commit rewords things slightly and adds links to the subsequent
pages to the body of the page to avoid this confusion.
During the fetch phase, there's a number of stored fields that are requested explicitly or loaded by default. That information is included in `StoredFieldsSpec` that each fetch sub phase exposes.
We attempt to provide stored fields that are already loaded to the fields lookup that scripts as well as value fetchers use to load field values (via `SearchLookup`). This is done in `PreloadedFieldLookupProvider`. The current logic makes values available for fields that have been found, so that scripts or value fetchers that request them don't load them again ad-hoc. Stored fields that don't have a value for a specific doc, though, are treated like any other field that was not requested and are loaded again, even though no value will be found, which causes overhead.
This change makes available to `PreloadedFieldLookupProvider` the list of required stored fields, so that it can better distinguish between fields that we already attempted to load (although we may not have found a value for them) and those that need to be loaded ad-hoc (for instance because a script is requesting them for the first time).
This is an existing issue that has become evident as we moved fetching of metadata fields to `FetchFieldsPhase`, which relies on value fetchers, and hence on `SearchLookup`. We end up attempting to load the default metadata fields (`_ignored` and `_routing`) twice when they are not present in a document, which makes us call `LeafReader#storedFields` additional times for the same document, providing a `SingleFieldVisitor` that will never find a value.
Another existing issue that this PR fixes is for the `FetchFieldsPhase` to extend the `StoredFieldsSpec` that it exposes to include the metadata fields that the phase is now responsible for loading. That results in `_ignored` being included in the output of the debug stored fields section when profiling is enabled. The fact that it was previously missing is an existing bug (it was missing in `StoredFieldLoader#fieldsToLoad`).
Yet another existing issue that this PR fixes is that `_id` has until now always been loaded on demand when requested via fetch fields or a script. That is because it is not part of the preloaded stored fields that the fetch phase passes over to the `PreloadedFieldLookupProvider`. That causes overhead, as the field has already been loaded and should not be loaded again when explicitly requested.
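A minimal sketch of the distinction this introduces, with hypothetical names (the real logic sits in `PreloadedFieldLookupProvider`):
```java
import java.util.List;
import java.util.Map;
import java.util.Set;

class PreloadedLookupSketch {
    // If the field was among the requested stored fields, the fetch phase has
    // already attempted to load it: absence means the doc has no value, so
    // return empty instead of calling LeafReader#storedFields again.
    List<Object> values(String field, Set<String> requestedStoredFields, Map<String, List<Object>> preloaded) {
        if (requestedStoredFields.contains(field)) {
            return preloaded.getOrDefault(field, List.of());
        }
        return loadOnDemand(field); // genuinely unrequested: ad-hoc load
    }

    List<Object> loadOnDemand(String field) {
        return List.of(); // stand-in for the existing ad-hoc stored-field load
    }
}
```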
For really large values, rounding error is enough to push the
reconstructed value for synthetic source into infinity. The existing code
didn't take this into account. This PR adds a check to detect infinity and
simply return it as is in synthetic source.
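A minimal sketch of the check, with assumed method and parameter names:
```java
// Reconstruct the value for synthetic source; for very large inputs the
// division below can overflow to infinity, which is now returned as-is.
static double reconstruct(long encoded, double scalingFactor) {
    double value = encoded / scalingFactor;
    if (Double.isInfinite(value)) {
        return value; // rounding error pushed the value past the double range
    }
    // ... existing adjustment logic for finite values ...
    return value;
}
```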
Closes #107101.
Sometimes, CombineProjections does not correctly update an aggregation's groupings when combining with a preceding projection.
Fix this by resolving any aliases used in the groupings and de-duplicating them.
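A minimal sketch of the grouping fix; the helper and its shape are hypothetical, with `AttributeMap` standing in for the projection's alias map:
```java
import java.util.ArrayList;
import java.util.List;

import org.elasticsearch.xpack.ql.expression.AttributeMap;
import org.elasticsearch.xpack.ql.expression.Expression;

// Resolve each grouping through the child projection's aliases, then drop
// any duplicates that the resolution may introduce.
static List<Expression> combinedGroupings(List<Expression> groupings, AttributeMap<Expression> aliases) {
    List<Expression> resolved = new ArrayList<>();
    for (Expression grouping : groupings) {
        Expression replaced = aliases.resolve(grouping, grouping); // fall back to the grouping itself
        if (resolved.contains(replaced) == false) {
            resolved.add(replaced);
        }
    }
    return resolved;
}
```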
---------
Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>
* Remove `es-test-dir` book-scoped variable
* Remove `plugins-examples-dir` book-scoped variable
* Remove `:dependencies-dir:` and `:xes-repo-dir:` book-scoped variables
- In `index.asciidoc`, two variables (`:dependencies-dir:` and `:xes-repo-dir:`) were removed.
- In `sql/index.asciidoc`, the `:sql-tests:` path was updated to the fuller path.
- In `esql/index.asciidoc`, the `:esql-tests:` path was likewise updated.
* Replace `es-repo-dir` with `es-ref-dir`
* Move `:include-xpack: true` to the few files that use it, and remove it from index.asciidoc
Sometimes an exception is wrapped multiple times, and then logs like these are emitted:
```
org.elasticsearch.transport.RemoteTransportException: [es-es-index-64c4d7dcd-4d7qb][10.2.58.152:9300][cluster:admin/persistent/start]
Caused by: org.elasticsearch.transport.RemoteTransportException: [es-es-index-64c4d7dcd-j7s7v][10.2.9.216:9300][cluster:admin/persistent/start]
Caused by: org.elasticsearch.ResourceAlreadyExistsException: task with id {geoip-downloader} already exist
at org.elasticsearch.persistent.PersistentTasksClusterService$1.execute(PersistentTasksClusterService.java:120)
at org.elasticsearch.cluster.service.MasterService$UnbatchedExecutor.execute(MasterService.java:550)
at org.elasticsearch.cluster.service.MasterService.innerExecuteTasks(MasterService.java:1039)
at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:1004)
at org.elasticsearch.cluster.service.MasterService.executeAndPublishBatch(MasterService.java:232)
at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.lambda$run$2(MasterService.java:1645)
at org.elasticsearch.action.ActionListener.run(ActionListener.java:356)
at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.run(MasterService.java:1642)
at org.elasticsearch.cluster.service.MasterService$5.lambda$doRun$0(MasterService.java:1237)
at org.elasticsearch.action.ActionListener.run(ActionListener.java:356)
at org.elasticsearch.cluster.service.MasterService$5.doRun(MasterService.java:1216)
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:984)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.lang.Thread.run(Thread.java:1583)
```
In this case the real cause is `ResourceAlreadyExistsException`, which shouldn't be logged as an error. Adjusted the exception cause checking to take into account that an exception may be wrapped twice by a `RemoteTransportException`.
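A minimal sketch of the adjusted check; the helper below is illustrative rather than the actual code:
```java
import org.elasticsearch.ResourceAlreadyExistsException;
import org.elasticsearch.transport.RemoteTransportException;

class CauseCheckSketch {
    // Unwrap however many RemoteTransportException layers are present before
    // deciding how the failure should be logged.
    static Throwable unwrapRemote(Throwable t) {
        while (t instanceof RemoteTransportException && t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }

    // An expected failure like "task already exists" should not be an ERROR log.
    static boolean isExpected(Throwable t) {
        return unwrapRemote(t) instanceof ResourceAlreadyExistsException;
    }
}
```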
The S3 SDK permits changing the maximum number of concurrent connections
that it will open, but today there's no way to adjust this setting
within Elasticsearch. This commit adds a setting for this parameter.
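A sketch of what registering such a setting can look like; the key name and default below are assumptions:
```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

// A per-client affix setting, so each configured S3 client can be tuned
// independently, e.g. s3.client.default.max_connections
static final Setting.AffixSetting<Integer> MAX_CONNECTIONS_SETTING = Setting.affixKeySetting(
    "s3.client.",
    "max_connections",
    key -> Setting.intSetting(key, 50, 1, Property.NodeScope)
);
```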
I have a couple of heap dumps that show the lock wrapper alone wastes O(10M)
of heap for these things. Also, I suspect the indirection costs
non-trivial performance here in some cases. => let's spend a couple more
lines of code to save that overhead
* Allow rejected executions when filling up a thread pool queue
* Move test to integration tests
* Avoid setting maxConcurrentShardRequests to 1
* Test all index descriptors defined in the Kibana plugin
I've been tracing the problems with these tests (as the [first attempt I
made](https://github.com/elastic/elasticsearch/pull/107066) was
unrelated to the actual bug).
I discovered that the actual problem was that the `BytesRefHash` for
`terms` in the `IpScriptFieldTermsQuery` was not finding terms that were
actually there.
The seed that was used to reproduce this failure was triggering multiple
slices for performing the search. As `BytesRefHash` is not a thread-safe
class, that made me think about some kind of synchronization issue with
the underlying `BytesRefHash` structure for the
`IpScriptFieldTermsQuery`.
Adding a `synchronized` block around access to `terms` removed the
problem. I've run the tests for more than 90k iterations and have been
unable to reproduce the failure since.
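A minimal sketch of the workaround; the field and method names are illustrative:
```java
import org.apache.lucene.util.BytesRef;
import org.elasticsearch.common.util.BytesRefHash;

// BytesRefHash is not thread-safe, and the query may be evaluated by several
// search slices concurrently, so access to the shared hash is serialized.
boolean containsTerm(BytesRefHash terms, BytesRef candidate) {
    synchronized (terms) {
        return terms.find(candidate) >= 0;
    }
}
```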
Closes #106900
Looking through some real-world heap dumps, there are at times a lot of
instances of these things that have a 1k buffer allocated but are only a
couple of bytes in length. We can save tens of MB in the examined cases by
just sizing the buffer more intelligently here.
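A sketch of the idea; the heuristic and bounds are assumptions:
```java
// Size the initial buffer from a hint instead of a fixed 1k allocation;
// most observed instances only ever hold a few bytes.
static byte[] initialBuffer(int sizeHint) {
    return new byte[Math.max(16, Math.min(sizeHint, 1024))];
}
```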
Fixed the conflicts, and re-submitting. Please see #105217 for full details, history, and discussion. I'll use the commit message from that PR as well.
Continuing my work from #104490, this PR moves the parameter compatibility checking for Equals into the type resolution check. This is a somewhat bigger change than for Add, as there was no ES|QL base class for binary comparison operators before this. I've added EsqlBinaryComparison as that base class, and migrated all of the binary comparisons to be based off of that (except for NullEquals, see note below).
In order to maintain compatibility with the current behavior, I've kept it so that unsigned longs are only interoperable with other unsigned longs. We've talked a lot about changing that, and I consider this work a prerequisite for doing so.
I've also added a bunch of test cases to Equals and NotEquals, which should have the side effect of filling out the type support table in the equals docs. As noted in the comments, I'll have follow up PRs for the other binary comparisons to add tests, but this PR is already too long.
Note about NullEquals: There is an ES|QL NullEquals class, which inherits from the QL version, but I don't think it works. I didn't see any tests or docs for it, and trying it out in the demo instance gave me a syntax error. I think we need to delve into what's going on there, but this PR isn't the right place for it.
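A minimal sketch of what such a resolution-time check can look like; the method shape is illustrative, not the actual `EsqlBinaryComparison` code:
```java
import org.elasticsearch.xpack.ql.expression.Expression.TypeResolution;
import org.elasticsearch.xpack.ql.type.DataType;
import org.elasticsearch.xpack.ql.type.DataTypes;

// Keep unsigned_long comparable only with unsigned_long, and report the
// mismatch during type resolution rather than at evaluation time.
static TypeResolution resolveCompatibility(DataType left, DataType right) {
    boolean leftUnsigned = left == DataTypes.UNSIGNED_LONG;
    boolean rightUnsigned = right == DataTypes.UNSIGNED_LONG;
    if (leftUnsigned != rightUnsigned) {
        return new TypeResolution("unsigned_long can only be compared to unsigned_long");
    }
    return TypeResolution.TYPE_RESOLVED;
}
```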
This reverts commit 225edaf607.
Tests are failing because we are computing the doc count error for the terms aggregation incorrectly.
The only difference from the previous versions is that we now consider the number of empty aggregations
instead of the total number of aggregations when computing this value. Making this
change makes the tests happy.
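A minimal sketch of the adjusted computation; the names are illustrative, not the real reduction code:
```java
// The error term is now scaled by the number of aggregations that came back
// empty rather than by the total number of aggregations being reduced.
static long docCountError(int emptyAggs, long perAggregationError) {
    return (long) emptyAggs * perAggregationError;
}
```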