This PR addresses issues around aggregation cancellation, mentioned in https://github.com/elastic/elasticsearch/issues/108701 and other places. In brief, while an aggregation is collecting, we respect cancellation via the mechanisms in the searcher that poison cancelled queries. But once the aggregation finishes collection, it no longer needs to interact with the searcher, so we cannot rely on the searcher for cancellation checking. In particular, deeply nested aggregations can spend a long time constructing the results tree.
Checking for cancellation is a trade-off, as the check itself is somewhat expensive (it involves a volatile read). We want to check often enough that cancelled queries don't take up resources for a long time, but not so frequently that it slows down most aggregation queries. Our first attempt at this is to check once when we go to build sub-aggregations, as the worst cases we've seen involve building deep sub-aggregation trees. Checking at sub-aggregation construction time also provides a conveniently centralized method call to add the check to.
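A minimal sketch of the idea, not the actual Elasticsearch code: `SketchAggregator`, its `isCancelled` supplier and the `buildSubAggregations` signature below are hypothetical names chosen for illustration. It shows one cancellation check per level of the tree, placed in the single method that every sub-aggregation build passes through.

```java
import java.util.concurrent.CancellationException;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for an aggregator; not the Elasticsearch class.
final class SketchAggregator {
    // Assumed to be backed by a volatile flag, so each call has a cost.
    private final BooleanSupplier isCancelled;
    private final SketchAggregator[] subAggregators;

    SketchAggregator(BooleanSupplier isCancelled, SketchAggregator... subAggregators) {
        this.isCancelled = isCancelled;
        this.subAggregators = subAggregators;
    }

    // Single choke point: the check runs once per aggregator in the tree,
    // not once per bucket or per document.
    int buildSubAggregations() {
        if (isCancelled.getAsBoolean()) {
            throw new CancellationException("task cancelled while building aggregation results");
        }
        int built = 1; // count this node
        for (SketchAggregator sub : subAggregators) {
            built += sub.buildSubAggregations();
        }
        return built;
    }
}
```

Because the check sits in the recursive build call, a deep tree is interrupted at the next level it descends into, while a shallow aggregation pays for only a handful of reads.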
---------
Conflicts:
server/src/main/java/org/elasticsearch/search/aggregations/bucket/BucketsAggregator.java
test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
* [8.x] ESQL: use field_caps native nested fields filtering (#117201) (#117375) (#121645)
* Just filter the nested fields natively with field_caps support
(cherry picked from commit 73381dbeb1)
* Add import
If the `MasterService` needs to log a create-snapshot task description
then it will call `CreateSnapshotTask#toString`, which today calls
`RepositoryData#toString` which is not overridden so ends up calling
`RepositoryData#hashCode`. This can be extraordinarily expensive in a
large repository. Worse, if there are masses of create-snapshot tasks to
execute then it'll do this repeatedly, because each one only ends up
yielding a short hex string so we don't reach the description length
limit very easily.
With this commit we provide a more efficient implementation of
`CreateSnapshotTask#toString` and also override
`RepositoryData#toString` to protect against some other caller running
into the same issue.
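A minimal sketch of the failure mode and the fix, using a hypothetical `SketchRepositoryData` class rather than the real one. The fields shown and the exact text printed are assumptions for illustration only. Without an override, `Object#toString` calls `hashCode()`, which here walks the whole map; the override prints only cheap summary values.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a large metadata object; not the real RepositoryData.
final class SketchRepositoryData {
    private final long generation;
    private final Map<String, String> snapshots;

    SketchRepositoryData(long generation, Map<String, String> snapshots) {
        this.generation = generation;
        this.snapshots = snapshots;
    }

    // Cost grows with the number of entries, so it is expensive on a large repository.
    @Override
    public int hashCode() {
        return Objects.hash(generation, snapshots);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof SketchRepositoryData other
            && generation == other.generation
            && snapshots.equals(other.snapshots);
    }

    // Cheap description: never touches hashCode() or iterates the map.
    @Override
    public String toString() {
        return "SketchRepositoryData{generation=" + generation + ", snapshotCount=" + snapshots.size() + "}";
    }
}
```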
Add `security_solution-*-*` to the list of index names to avoid pattern collisions.
(cherry picked from commit 0638d3977a)
Co-authored-by: Smriti <152067238+smriti0321@users.noreply.github.com>
* Improve memory aspects of enrich cache (#120256)
This commit reduces the occupied heap space of the enrich cache and
corrects inaccuracies in tracking the occupied heap space (for cache
size limitation purposes).
---------
Co-authored-by: Joe Gallo <joegallo@gmail.com>
* Fix compilation
---------
Co-authored-by: Joe Gallo <joegallo@gmail.com>
Make it explicit that Elasticsearch expects disks to have the same capacity across all nodes in the same data tier.
(cherry picked from commit 3ebc1f48aa)
Co-authored-by: Ievgen Degtiarenko <ievgen.degtiarenko@elastic.co>
We create temporary files that might not get closed if an exception happens right after they are created. This commit makes sure all
errors are handled properly and that the files are closed and deleted.
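A minimal sketch of the general pattern, assuming a hypothetical `TempFileSketch.writeTemp` helper (not the code changed by this commit). The stream is closed by try-with-resources on every path, and the file is deleted unless the write completed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration only.
final class TempFileSketch {
    interface Writer {
        void write(OutputStream out) throws IOException;
    }

    static Path writeTemp(Path dir, Writer writer) throws IOException {
        Path tmp = Files.createTempFile(dir, "sketch", ".tmp");
        boolean success = false;
        try {
            // try-with-resources closes the stream even if write() throws.
            try (OutputStream out = Files.newOutputStream(tmp)) {
                writer.write(out);
            }
            success = true; // only reached after a clean write and close
            return tmp;
        } finally {
            if (success == false) {
                // Do not leave a half-written file behind.
                Files.deleteIfExists(tmp);
            }
        }
    }
}
```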
# Conflicts:
# muted-tests.yml
Fixes two bugs in _resolve/cluster.
First, the code that detects older cluster versions and falls back to the _resolve/index
endpoint was using an outdated string match for error detection. That has been adjusted.
Second, upon security exceptions, the _resolve/cluster endpoint was marking the clusters as connected: true,
under the assumption that all security exceptions related to cross cluster calls and remote index access were
coming from the remote cluster, but that is not always the case. Some cross-cluster security violations can
be detected on the local querying cluster after issuing the remoteClient.execute call but before the transport
layer actually sends the request remotely. So we now mark the connected status as false for all ElasticsearchSecurityException cases. End user docs have been updated with this information.