162 commits

Author SHA1 Message Date
Rene Groeschke
4d17b2193a
Update Gradle wrapper to 8.12 (#118683) (#119357)
This updates the gradle wrapper to 8.12

We addressed the deprecation warnings caused by the update, including:

- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions

(cherry picked from commit ba61f8c7f7)

# Conflicts:
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/distribution/DockerCloudElasticsearchDistributionType.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/distribution/DockerUbiElasticsearchDistributionType.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/Fixture.java
#	plugins/repository-hdfs/hadoop-client-api/build.gradle
#	server/src/main/java/org/elasticsearch/inference/ChunkingOptions.java
#	x-pack/plugin/kql/build.gradle
#	x-pack/plugin/migrate/build.gradle
#	x-pack/plugin/security/qa/security-basic/build.gradle
2024-12-31 08:37:28 +01:00
Rene Groeschke
581b9ab7c0
[8.16] [Gradle] Remove static use of BuildParams (#115122) (#117434)
* [Gradle] Remove static use of BuildParams (#115122)

Static fields don't do well in Gradle with configuration cache enabled.

- Use buildParams extension in build scripts
- Keep BuildParams.ci for now for easy serverless migration
- Tweak testing doc

(cherry picked from commit 13c8aaeffa)

# Conflicts:
#	TESTING.asciidoc
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/InternalDistributionBwcSetupPlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/RestTestBasePlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/compat/compat/AbstractYamlRestCompatTestPlugin.java
#	build.gradle
#	modules/ingest-geoip/qa/full-cluster-restart/build.gradle
#	qa/mixed-cluster/build.gradle
#	x-pack/plugin/ent-search/qa/full-cluster-restart/build.gradle
#	x-pack/plugin/eql/qa/rest/build.gradle
#	x-pack/plugin/fleet/qa/rest/build.gradle
#	x-pack/plugin/kql/build.gradle
#	x-pack/plugin/mapper-unsigned-long/build.gradle
#	x-pack/plugin/ml/qa/multi-cluster-tests-with-security/build.gradle
#	x-pack/plugin/security/qa/multi-cluster/build.gradle
#	x-pack/plugin/sql/qa/jdbc/build.gradle
#	x-pack/plugin/transform/qa/multi-cluster-tests-with-security/build.gradle

* Fix merge

* [Build] Fix fips testing after buildparams rework (#116934)

* More Cleanup

* [Build] Fix checkstyle exclusions on windows (#115185)

* More merge fixes

* Delete x-pack/plugin/kql/build.gradle
2024-11-27 12:34:32 +01:00
Lorenzo Dematté
3bf0197950
Fix testApmIntegration histogram assertions (#115907) 2024-10-30 21:13:28 +11:00
Nik Everett
15308027b4
ESQL: Retry test on 403 (#114450) (#114650)
Retry the async test when you get a 403 - that could be because security
has not yet booted. We should have permission to fetch everything.
2024-10-12 07:04:07 +11:00
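
The retry described in "ESQL: Retry test on 403" above can be pictured with a minimal sketch. This is not the test code from the commit: `RetryOn403Sketch`, `fetchWithRetry`, the attempt limit and the sleep interval are all hypothetical names and values chosen for illustration, and the "server" is simulated with a counter.

```java
import java.util.function.IntSupplier;

// Hypothetical helper, not the actual Elasticsearch test code.
final class RetryOn403Sketch {
    /** Repeats the request while it answers 403, up to maxAttempts calls in total. */
    static int fetchWithRetry(IntSupplier request, int maxAttempts, long sleepMillis) {
        int status = request.getAsInt();
        for (int attempt = 1; attempt < maxAttempts && status == 403; attempt++) {
            try {
                Thread.sleep(sleepMillis); // give security time to finish booting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting to retry", e);
            }
            status = request.getAsInt();
        }
        return status;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated server: answers 403 for the first two calls, then 200.
        IntSupplier request = () -> ++calls[0] <= 2 ? 403 : 200;
        int status = fetchWithRetry(request, 5, 10);
        System.out.println(status + " after " + calls[0] + " calls");
    }
}
```

A bounded attempt count keeps a genuinely missing permission from turning into an endless loop: if every call returns 403, the last 403 is returned to the caller.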
Craig Taverner
c7da1633b1
Enable pushing Sort/Filter by ReferenceAttribute down to Lucene, and thereby optimize Sort by ST_DISTANCE (#112938) (#114604)
The ST_DISTANCE function added in #108764 was optimized for Lucene pushdown in a series of follow-up PRs, but this did not include sorting by distance. This is now resolved for two key scenarios, both known to be valued by users:

* Sorting by distance:
    `FROM index | EVAL distance=ST_DISTANCE(field, literal) | SORT distance`
* Sorting and filtering by distance:
    `FROM index | EVAL distance=ST_DISTANCE(field, literal) | WHERE distance < literal | SORT distance`

The key changes required to make this work:
* Add the appropriate sort->_geo_distance sort type to the EsQueryExec
* Enhance PushTopNToSource to understand how to push down the sort even when there is an EVAL in between the FROM and the SORT (between the TopNExec and the EsQueryExec in the physical plan).
* Enhance PushFiltersToSource to understand how to push down the filter even when there is an EVAL in between the FROM and the WHERE (between the Filter and the EsQueryExec in the physical plan).

A useful bonus feature of this additional EVAL intelligence is that other, non-spatial cases are now also pushed down. In particular, EVALs that are simple aliases are considered and pushed down, for both filtering and sorting.

Local benchmark results are very approximate, but they show massive improvements for distanceSort and distanceFilterSort, which correspond to the two cases listed above.

| Benchmark | Query DSL | ESQL before this PR | ESQL after this PR | Comments |
|---|---|---|---|---|
| distanceFilter | 10 | 5 | 5 | Optimized in #109972 |
| distanceEvalFilter | 10 | 10000 | 1500 | Still slow due to unnecessary EVAL |
| distanceSort | 150 | 12000 | 160 | |
| distanceFilterSort | 20 | 10000 | 24 | |

NOTE: This enables pushing down sorting by any ReferenceAttribute that either refers to a sortable FieldAttribute, or to an StDistance function that itself refers to a suitable FieldAttribute of geo_point type.

---------

Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
2024-10-12 00:25:00 +11:00
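
The NOTE in the ST_DISTANCE commit above states when a sort on a ReferenceAttribute can be pushed down. A minimal sketch of that decision follows. It is not the PushTopNToSource implementation: the `Expr`, `Field`, `Ref` and `StDistance` types, the alias map and the returned description strings are hypothetical stand-ins invented for illustration.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical model of the rule in the NOTE; not Elasticsearch's planner classes.
final class SortPushdownSketch {
    interface Expr {}
    record Field(String name, boolean sortable, boolean geoPoint) implements Expr {}
    record Ref(String alias) implements Expr {}
    record StDistance(Expr origin, String literalPoint) implements Expr {}

    /** Follows EVAL aliases until the expression is no longer a known reference. */
    static Expr resolve(Expr expr, Map<String, Expr> evalAliases) {
        int hops = 0;
        while (expr instanceof Ref ref && evalAliases.containsKey(ref.alias()) && hops++ < evalAliases.size()) {
            expr = evalAliases.get(ref.alias());
        }
        return expr;
    }

    /** Describes the source-side sort if the key is pushable, otherwise returns empty. */
    static Optional<String> pushableSort(Expr sortKey, Map<String, Expr> evalAliases) {
        Expr resolved = resolve(sortKey, evalAliases);
        if (resolved instanceof Field field && field.sortable()) {
            return Optional.of("field sort on " + field.name());
        }
        if (resolved instanceof StDistance distance
            && resolve(distance.origin(), evalAliases) instanceof Field geoField
            && geoField.geoPoint()) {
            return Optional.of("_geo_distance sort on " + geoField.name() + " from " + distance.literalPoint());
        }
        return Optional.empty(); // anything else stays in the compute engine
    }

    public static void main(String[] args) {
        Field location = new Field("location", false, true);
        // Models: EVAL distance = ST_DISTANCE(location, <literal>) | SORT distance
        Map<String, Expr> aliases = Map.of("distance", new StDistance(location, "POINT(12.5 55.6)"));
        System.out.println(pushableSort(new Ref("distance"), aliases));
    }
}
```

The same alias resolution covers the non-spatial bonus case mentioned in the commit: a reference that resolves to a plain sortable field takes the first branch.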
Nik Everett
c7473ad8cc
ESQL: Delay construction of warnings (#114368) (#114459)
Delay construction of `Warnings` until they are needed to save memory
when evaluating many many many expressions. Most expressions won't use
warnings at all and there isn't any need to make registering warnings
super duper fast. So let's make the construction lazy to save a little
memory. It's like 200 bytes per expression which isn't much, but it's
possible to have thousands of expressions in a single query. Abusive,
but possible.

This also consolidates all `Warnings` usages to a single `Warnings`
class. We had two. We don't need two.
2024-10-10 10:13:44 +11:00
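
The lazy construction described in "ESQL: Delay construction of warnings" above follows a common pattern, sketched here under assumptions. The `Warnings` and `Evaluator` classes below are simplified stand-ins written for this example, not the real ESQL classes, and the instance counter exists only to make the effect visible.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-ins; the real ESQL Warnings class is different.
final class LazyWarningsSketch {
    static final class Warnings {
        static int constructed = 0; // demo-only counter of instances built
        final List<String> messages = new ArrayList<>();

        Warnings() {
            constructed++;
        }

        void register(String message) {
            messages.add(message);
        }
    }

    static final class Evaluator {
        private Warnings warnings; // stays null until the first warning is needed

        private Warnings warnings() {
            if (warnings == null) {
                warnings = new Warnings(); // pay the allocation only on the slow path
            }
            return warnings;
        }

        /** Returns null and registers a warning on division by zero. */
        Integer divide(int a, int b) {
            if (b == 0) {
                warnings().register("division by zero, returning null");
                return null;
            }
            return a / b;
        }
    }

    public static void main(String[] args) {
        Evaluator evaluator = new Evaluator();
        evaluator.divide(6, 3);
        System.out.println("after clean evaluation: " + Warnings.constructed);
        evaluator.divide(1, 0);
        System.out.println("after first warning: " + Warnings.constructed);
    }
}
```

The null check is not thread-safe; this sketch assumes each evaluator is used by one thread at a time. The trade-off matches the commit message: registering a warning becomes slightly slower, while expressions that never warn allocate nothing.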
Nik Everett
09a50e504d
ESQL: Weaken test assertion (#114336) (#114351)
Weaken the assertion when testing breakers: it's ok to break while
building a block in addition to topn.
2024-10-10 08:13:40 +11:00
Nik Everett
69a23d41e4
ESQL: Reenable part of heap attack test (#114252) (#114255)
This reenables a test and adds more debugging to another one. We'll use
this to collect more information the next time it fails.
2024-10-08 09:17:53 +11:00
Michael Peterson
7d02f5cb71
Collect and display execution metadata for ES|QL cross cluster searches (#112595) (#113820)
Enhance ES|QL responses to include information about `took` time (search latency), shards, and
clusters against which the query was executed.

The goal of this PR is to begin to provide parity between the metadata displayed for 
cross-cluster searches in _search and ES|QL.

This PR adds the following features:
- add overall `took` time to all ES|QL query responses. To emphasize, "all" here
means async search, sync search, local-only and cross-cluster searches, so it goes
beyond just CCS.
- add `_clusters` metadata to the final response for cross-cluster searches, for both
async and sync search (see example below)
- track and report counts of skipped shards from the can_match (SearchShards API)
phase of ES|QL processing
- mark clusters as skipped if they cannot be connected to (during the field-caps
phase of processing)

Out of scope for this PR:
- honoring the `skip_unavailable` cluster setting
- showing `_clusters` metadata in the async response **while** the search is still running
- showing any shard failure messages (since any shard search failures in ES|QL are
automatically fatal and `_clusters/details` is not shown in 4xx/5xx error responses). Note that
this also means that the `failed` shard count is always 0 in the ES|QL `_clusters` section.

Things changed with respect to behavior in `_search`:
- the `timed_out` field in `_clusters/details/mycluster` was removed in the ESQL
response, since ESQL does not support timeouts. It could be added back later
if/when ESQL supports timeouts.
- the `failures` array in `_clusters/details/mycluster/_shards` was removed in the ESQL
response, since any shard failure causes the whole query to fail.

Example output from ES|QL CCS:

```es
POST /_query
{
  "query": "from blogs,remote2:bl*,remote1:blogs|\nkeep authors.first_name,publish_date|\n limit 5"
}
```

```json
{
  "took": 49,
  "columns": [
    {
      "name": "authors.first_name",
      "type": "text"
    },
    {
      "name": "publish_date",
      "type": "date"
    }
  ],
  "values": [
    [
      "Tammy",
      "2009-11-04T04:08:07.000Z"
    ],
    [
      "Theresa",
      "2019-05-10T21:22:32.000Z"
    ],
    [
      "Jason",
      "2021-11-23T00:57:30.000Z"
    ],
    [
      "Craig",
      "2019-12-14T21:24:29.000Z"
    ],
    [
      "Alexandra",
      "2013-02-15T18:13:24.000Z"
    ]
  ],
  "_clusters": {
    "total": 3,
    "successful": 2,
    "running": 0,
    "skipped": 1,
    "partial": 0,
    "failed": 0,
    "details": {
      "(local)": {
        "status": "successful",
        "indices": "blogs",
        "took": 43,
        "_shards": {
          "total": 13,
          "successful": 13,
          "skipped": 0,
          "failed": 0
        }
      },
      "remote2": {
        "status": "skipped",  // remote2 was offline when this query was run
        "indices": "remote2:bl*",
        "took": 0,
        "_shards": {
          "total": 0,
          "successful": 0,
          "skipped": 0,
          "failed": 0
        }
      },
      "remote1": {
        "status": "successful",
        "indices": "remote1:blogs",
        "took": 47,
        "_shards": {
          "total": 13,
          "successful": 13,
          "skipped": 0,
          "failed": 0
        }
      }
    }
  }
}
```

Fixes https://github.com/elastic/elasticsearch/issues/112402 and https://github.com/elastic/elasticsearch/issues/110935
2024-10-01 08:02:58 +10:00
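
In the example response of the cross-cluster metadata commit above, the top-level `_clusters` counters are consistent with a simple tally of the per-cluster `status` values. The sketch below assumes exactly that relationship, which the example suggests but the commit message does not spell out. `ClusterCountsSketch` and `tally` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical tally, assuming top-level counters mirror the per-cluster statuses.
final class ClusterCountsSketch {
    /** Builds the counters in the same order as the example response. */
    static Map<String, Integer> tally(Map<String, String> statusByCluster) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("total", statusByCluster.size());
        for (String status : new String[] {"successful", "running", "skipped", "partial", "failed"}) {
            counts.put(status, 0); // every counter is present even when it is zero
        }
        for (String status : statusByCluster.values()) {
            counts.merge(status, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, String> statuses = new LinkedHashMap<>();
        statuses.put("(local)", "successful");
        statuses.put("remote2", "skipped"); // offline when the query ran
        statuses.put("remote1", "successful");
        System.out.println(tally(statuses));
    }
}
```

With the three clusters from the example this yields a total of 3, with 2 successful and 1 skipped, matching the counters shown in the response.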
Mark Vieira
0279c0a909
Add AGPLv3 as a supported license 2024-09-13 14:30:33 -07:00
Lorenzo Dematté
68a18305ef
Add more logging to MetricsApmIT to identify why test fails (#111360) 2024-07-30 08:21:46 +02:00
Nik Everett
6db07907e1
ESQL: Allow suppressed exception in HeapAttack (#111173)
We're ok with suppressed exceptions in this test.

Closes #111128
2024-07-23 06:23:48 +10:00
Nik Everett
0710a4495f
ESQL: Check for circuit breaker in async (#111070)
This adds a test to make sure that circuit breaker exceptions thrown
from the async ESQL task are properly recorded to the index.
2024-07-19 07:57:16 -04:00
Simon Cooper
c752fbab35
Don't run the JVM crash test on windows (#110194) 2024-06-26 17:05:16 +01:00
Simon Cooper
390f91505e
Add a test plugin with a handler to crash the JVM (#109930) 2024-06-21 08:25:12 +01:00
Benjamin Trent
08298dcd69 Merge remote-tracking branch 'upstream/main' into lucene_snapshot_9_11 2024-06-12 08:05:36 -04:00
David Turner
b99b5d5f25
Remove unused seek-tracking plugin (#109600)
This was used for some performance investigations but is not currently
needed, and would need updating in order to complete #100878. Instead,
this commit removes it.
2024-06-12 06:40:00 +01:00
Benjamin Trent
9cd123d6cc Merge remote-tracking branch 'upstream/main' into lucene_snapshot_9_11 2024-06-02 16:46:19 -04:00
David Turner
2c0ad093ef
Fix up misc master-node timeouts in xpack/plugin (#109232)
More simple cases of #107984.
2024-05-31 15:54:11 +01:00
elasticsearchmachine
a40ead0f09 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-27 10:02:23 +00:00
Nik Everett
23efbf0646
ESQL: Stop sending version in tests (#108961)
Now that `version` is no longer required anywhere we can stop sending it
in all of our tests.

Closes #108957
2024-05-23 14:32:13 -04:00
elasticsearchmachine
46a9663957 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-14 10:01:36 +00:00
Nik Everett
6884002aab
ESQL: Disable heap attack eval (#108581)
It started failing again today.

Relates https://github.com/elastic/elasticsearch-serverless/issues/1874
2024-05-13 14:46:54 -04:00
elasticsearchmachine
a2c947eea1 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-09 10:02:02 +00:00
Luigi Dell'Aquila
c9b8d7239f
ES|QL: account for page overhead when calculating memory used by blocks (#108347) 2024-05-09 08:56:07 +02:00
elasticsearchmachine
6f46ee51c8 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-03 10:01:31 +00:00
Luigi Dell'Aquila
b277d5b033
ES|QL: limit query depth to 500 levels (#108089) 2024-05-02 17:42:41 +02:00
elasticsearchmachine
2f6283d181 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-01 10:02:01 +00:00
Luigi Dell'Aquila
58e199c840
ES|QL: Mute HeapAttackIT.testTooManyEval (#108105)
Muting because of https://github.com/elastic/elasticsearch/issues/108104
2024-04-30 13:02:50 -04:00
elasticsearchmachine
daa63d20ec Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-17 10:01:46 +00:00
Alexander Spies
1e4d4da483
ESQL: Make esql version required in REST requests (#107433)
* Enable corresponding validation in EsqlQueryRequest.
* Add the ESQL version to requests to /_query in integration tests.
* In mixed cluster tests for versions prior to 8.13.3, impersonate an 8.13
   client and do not send any version.

---------

Co-authored-by: Nik Everett <nik9000@gmail.com>
2024-04-16 17:06:09 +02:00
elasticsearchmachine
ccf04492e7 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-13 10:01:44 +00:00
Luigi Dell'Aquila
6f17d03e10
ESQL: Reduce size of HeapAttackIT.testSortByManyLongsSuccess (#107400)
Reducing the number of fields to avoid failures on CI environments where
we have much less heap (<300M for the circuit breaker).
2024-04-12 12:42:40 -04:00
ChrisHegarty
ac69c0c6cb Merge remote-tracking branch 'upstream/main' into lucene_snapshot 2024-04-12 12:07:11 +01:00
Moritz Mack
1f5e04b721
Migrate YAML REST tests to synthetic cluster feature check (#107068)
To simplify the migration away from version-based skip checks in YAML specs,
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.

New test specs for 8.14 or later are expected to use respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if sufficient.
2024-04-11 18:22:38 +02:00
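
One way to read the synthetic `gte_vX.Y.Z` feature from the YAML REST test migration commit above is as a version comparison hidden behind a feature name. The sketch below encodes that reading as an assumption: a cluster "has" `gte_vX.Y.Z` when its version is at or after X.Y.Z, and names beyond 8.14.0 are rejected. `SyntheticFeatureSketch` and `clusterHasFeature` are hypothetical and are not the test framework's API. Versions are assumed to be plain `X.Y.Z` strings without qualifiers.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical reading of the synthetic version feature; not the framework's code.
final class SyntheticFeatureSketch {
    private static final Pattern GTE = Pattern.compile("gte_v(\\d+\\.\\d+\\.\\d+)");
    private static final int[] LAST_SYNTHETIC_VERSION = {8, 14, 0};

    /** Parses "X.Y.Z" into its three numeric components. */
    static int[] parse(String version) {
        return Arrays.stream(version.split("\\.")).mapToInt(Integer::parseInt).toArray();
    }

    static boolean clusterHasFeature(String feature, String clusterVersion) {
        Matcher matcher = GTE.matcher(feature);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("not a synthetic version feature: " + feature);
        }
        int[] wanted = parse(matcher.group(1));
        if (Arrays.compare(wanted, LAST_SYNTHETIC_VERSION) > 0) {
            // Newer specs are expected to name a real cluster feature instead.
            throw new IllegalArgumentException("no synthetic feature after 8.14.0: " + feature);
        }
        // Lexicographic comparison of equal-length arrays orders versions correctly.
        return Arrays.compare(parse(clusterVersion), wanted) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(clusterHasFeature("gte_v8.12.0", "8.13.2"));
        System.out.println(clusterHasFeature("gte_v8.14.0", "8.13.2"));
    }
}
```

Rejecting names past the cutoff, rather than silently returning false, surfaces specs that should have been written against a real cluster feature.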
Benjamin Trent
0ca6e8d2ba
Test mute for issue #106683 (#106687)
relates: https://github.com/elastic/elasticsearch/issues/106683
2024-03-22 12:31:32 -04:00
Luigi Dell'Aquila
9c8c31952e
Refactor ESQL HeapAttackIT to run on Serverless (#106586) 2024-03-21 17:53:29 +01:00
Nhat Nguyen
d66c7d4bc8
Resume driver when failing to fetch pages (#106392)
I investigated a heap attack test failure and found that an ESQL request
was stuck. This occurred through the following sequence:

1. The ExchangeSource on the coordinator was blocked on reading because 
there were no available pages.

2. Meanwhile, the ExchangeSink on the data node had pages ready for 
fetching.

3. When an exchange request tried to fetch pages, it failed due to a
CircuitBreakingException. Despite the failure, no cancellation was
triggered because the status of the ExchangeSource on the coordinator
remained unchanged.

To fix this issue, this PR introduces two changes:

- Resumes the ExchangeSourceOperator and Driver on the coordinator,
eventually allowing the coordinator to trigger cancellation of the
request when failing to fetch pages.

- Ensures that an exchange sink on the data nodes fails when a data node
request is cancelled. This callback was inadvertently omitted when
introducing the node-level reduction in "Run empty reduction node level
on data nodes" (#106204).

I plan to spend some time to harden the exchange and compute service.

Closes #106262
2024-03-18 09:32:31 -07:00
Nhat Nguyen
c3bc2712de
Mute HeapAttackIT (#106263)
Started failing after we introduced the node-level reduction.

Tracked at #106262 Relates #106204
2024-03-12 15:20:16 -04:00
Nhat Nguyen
d6f91d9b21
Reenable heap attack tests (#105939)
I have fixed the recent failures of this suite. We should re-enable
this module and keep an eye on it.
2024-03-06 07:46:54 -08:00
Ryan Ernst
b4b32aa53a
Fix spotless - silly imports 2024-02-26 17:17:32 +01:00
Ryan Ernst
acfb500402
Mute HeapAttackIT
See https://github.com/elastic/elasticsearch/issues/105814
2024-02-26 16:59:57 +01:00
Nhat Nguyen
4857118a9e
Run heap attack tests with two nodes (#105110)
We should run heap-attack tests on multiple nodes to ensure that we 
avoid causing OOM during the serialization/deserialization of exchange
responses. I've merged the required changes and run thousands of
iterations of these tests without seeing any failures.
2024-02-14 15:24:49 -08:00
Nhat Nguyen
86c1fa2a6c
Avoid convert to string when parse resp in heap attack (#105109)
We've seen cases of OOM errors in the test runner process, which occur 
when we convert a response to a JSON string and then parse it. We can
directly parse from its input stream to avoid these OOM errors.
2024-02-05 07:16:25 -08:00
Ryan Ernst
b67f5a6b57
Make cluster feature predicate available to plugins (#105022)
A predicate to check whether the cluster supports a feature is available
to rest handlers defined in server. This commit adds that predicate to
plugins defining rest handlers as well.
2024-02-01 09:11:18 -08:00
Moritz Mack
dbf59c5414
Update/Cleanup references to old tracing.apm.* legacy settings in favor of the telemetry.* settings (#104917) 2024-01-31 09:20:05 +01:00
David Turner
1116889819
Remove unused arg from ActionType ctor (#104650)
`ActionType` represents an action which runs on the local node, so there's
no need for implementations to define a `Reader<Response>`. This commit
removes the unused constructor argument.
2024-01-25 03:28:32 -05:00
Bogdan Pintea
0aeb9beb7e
Lower the number of evaluations in testTooManyEval() (#104697)
This reverts the increase in #104521, which causes the parser to
sometimes stack overflow.

Fixes #104694.
2024-01-24 08:50:02 -05:00
Iraklis Psaroudakis
9aefad826e
Mute testTooManyEval (#104695)
Relates #104694
2024-01-24 06:46:38 -05:00
Nhat Nguyen
668ae505ca
Fix request timeout in HeapAttack tests (#104336)
I noticed we're using 5 minutes for both the query timeout and triggering
the out-of-memory action in heap attack tests. This means that while we're
generating the heap dump, some ESQL tasks might get canceled because
the connection was dropped. This PR increases the query timeout to
6 minutes instead.
2024-01-22 11:27:39 -08:00