Commit graph

5763 commits

Author SHA1 Message Date
Nick Tindall
28dd8e1bae
Make GCS HttpHandler more compliant (#126007)
- Fixed bug where 416 was being erroneously returned for zero-length blobs even with no Range header
- Fixed bug where partial upload wouldn't be completed if the last PUT included no data
- Return 206 (partial content) status when a Range header is specified
- Return an ETag on object get (BlobReadChannel uses this to ensure we fail when the blob is updated between successive chunks being fetched)
- The 416 on zero-length blobs was one of(?) the causes of #125668
2025-04-02 13:05:23 +11:00
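The status-code rules listed in #126007 can be sketched as a small decision function. This is only an illustration, not the handler's actual code: the class name `RangeStatusSketch`, the method `statusFor`, and the choice to model the Range header as an optional start offset are all assumptions made for the example.

```java
import java.util.OptionalLong;

/** Hypothetical helper; it does not mirror the real GCS HttpHandler. */
final class RangeStatusSketch {

    /**
     * @param blobLength length of the stored blob in bytes
     * @param rangeStart first requested byte, or empty when no Range header was sent
     * @return the HTTP status a compliant handler would be expected to use
     */
    static int statusFor(long blobLength, OptionalLong rangeStart) {
        if (!rangeStart.isPresent()) {
            // No Range header: plain 200, even for a zero-length blob.
            return 200;
        }
        if (rangeStart.getAsLong() >= blobLength) {
            // The requested range starts at or beyond the end of the blob.
            return 416;
        }
        // A satisfiable Range header yields partial content.
        return 206;
    }
}
```

With this sketch, a zero-length blob fetched without a Range header gives 200, while the same blob fetched with a range starting at byte 0 gives 416.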
Oleksandr Kolomiiets
f3ccde6959
Use FallbackSyntheticSourceBlockLoader for point and geo_point (#125816) 2025-04-01 12:55:18 -07:00
David Turner
0d64aab4cc
Clean up request parsing in S3HttpHandler (#126034)
The `METHOD /path/components?and=query` string representation of a
request is becoming increasingly difficult to parse, with slight
variations in parsing between the implementation in `S3HttpHandler` and
the various other implementations. This commit gets rid of the
string-concatenate-and-split behaviour in favour of a proper object that
has predicates for testing all the different kinds of request that might
be made against S3.
2025-04-02 05:49:50 +11:00
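The idea in #126034 of replacing string concatenation and splitting with a parsed request object might look roughly like the sketch below. The class `S3RequestSketch` and its predicate names are hypothetical and are not taken from `S3HttpHandler`; the query-parameter conventions (`?uploads` to start a multipart upload, `partNumber` and `uploadId` for an uploaded part) follow the public S3 REST API. URL decoding is deliberately left out to keep the example short.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parsed-request type; the real class may differ. */
final class S3RequestSketch {
    final String method;
    final String path;
    final Map<String, String> query;

    private S3RequestSketch(String method, String path, Map<String, String> query) {
        this.method = method;
        this.path = path;
        this.query = query;
    }

    /** Parses a method plus a raw URI such as "/bucket/key?uploads" (no URL decoding). */
    static S3RequestSketch parse(String method, String rawUri) {
        int q = rawUri.indexOf('?');
        String path = q < 0 ? rawUri : rawUri.substring(0, q);
        Map<String, String> query = new LinkedHashMap<>();
        if (q >= 0) {
            for (String pair : rawUri.substring(q + 1).split("&")) {
                if (pair.isEmpty()) {
                    continue;
                }
                int eq = pair.indexOf('=');
                if (eq < 0) {
                    query.put(pair, ""); // flag-style parameter, e.g. "uploads"
                } else {
                    query.put(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        return new S3RequestSketch(method, path, query);
    }

    /** POST /key?uploads starts a multipart upload. */
    boolean isInitiateMultipartUploadRequest() {
        return "POST".equals(method) && query.containsKey("uploads");
    }

    /** PUT /key?partNumber=N&uploadId=ID uploads one part. */
    boolean isUploadPartRequest() {
        return "PUT".equals(method) && query.containsKey("uploadId") && query.containsKey("partNumber");
    }
}
```

Keeping each kind of request behind a named predicate means callers never depend on the order of query parameters or on the exact string form of the request.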
Jordan Powers
71e74bdd66
Store arrays offsets for scaled float fields natively with synthetic source (#125793)
This patch builds on the work in #113757, #122999, #124594, #125529, and 
#125709 to natively store array offsets for scaled float fields instead of
falling back to ignored source when synthetic_source_keep: arrays.
2025-03-28 20:26:29 +01:00
Mary Gouseti
1943844d5a
Effort to fix testDataStreamLifecycleDownsampleRollingRestart #123769 (#125478) 2025-03-28 15:26:09 +02:00
David Turner
2d4fb76267
Improve randomIdentifier usage in AWS tests (#125775)
Adds prefixes to various randomly-generated values to make it easier to
pin down where they're coming from in debugging sessions. Also forces
the STS expiry time to be rendered in UTC.
2025-03-28 18:33:05 +11:00
Yang Wang
3568ab8eac
Migrate RepositoriesMetadata to ProjectCustom (#125398)
This PR migrates RepositoriesMetadata from Metadata#ClusterCustom to
Metadata#ProjectCustom and handles wire BWC.

Resolves: ES-10477
2025-03-28 17:53:17 +11:00
Nick Tindall
a25677371a
Revert "Upgrade to latest GCS SDK (#124062)" (#125748)
This reverts commit 073ca0e888.
2025-03-28 17:49:30 +11:00
Rene Groeschke
b476ee19b5
Try fixing mutedTest.yml file not found (#125763) 2025-03-27 13:31:52 +01:00
David Turner
36c14bf3a5
Validate region/service in DynamicAwsCredentials (#125671)
Following on from #125559, we can validate the region and service name
in tests that use `DynamicAwsCredentials` too.
2025-03-27 06:14:40 +00:00
Mark Vieira
cb44d7a727
Bump test cluster startup timeout back to 5 minutes 2025-03-26 16:26:20 -07:00
Mark Vieira
e149a3e10d
Convert :test projects to new testing framework (#125724) 2025-03-26 16:11:50 -07:00
Jordan Powers
689eaf20f4
Store arrays offsets for unsigned long fields natively with synthetic source (#125709)
This patch builds on the work in #113757, #122999, #124594, and #125529 to
natively store array offsets for unsigned long fields instead of falling
back to ignored source when synthetic_source_keep: arrays.
2025-03-27 00:59:24 +02:00
David Turner
60bd75d71f
Generalize S3HttpHandler request matching (#125670)
The pattern-matching in `S3HttpHandler` is overly specific and carefully
crafted to handle the exact requests that the AWS SDK v1 makes. It turns
out that the AWS SDK v2 makes requests that are slightly different. This
commit generalizes the pattern-matching to handle both SDKs.
2025-03-26 21:41:01 +00:00
Mark Vieira
930b4ab995
Convert remaining plugin projects to new test clusters framework (#125626) 2025-03-26 13:44:07 -07:00
Mark Vieira
0388a5980c
Migrate legacy QA projects to new test clusters framework (#125545) 2025-03-26 10:05:56 -07:00
Mark Vieira
d72d81a0eb
Convert remaining module projects to new test clusters framework (#125613) 2025-03-26 08:42:55 -07:00
David Turner
40095992c2
Add more addTemporaryStateListener utils (#125648)
We often call `addTemporaryStateListener` with the `ClusterService` of a
random node, or the currently elected master. This commit adds utilities
for this common pattern.
2025-03-26 21:15:18 +11:00
Rene Groeschke
4de4ec1d4c
Resolve bwc dependencies for packer cache (#125625) 2025-03-26 09:59:37 +01:00
Nick Tindall
073ca0e888
Upgrade to latest GCS SDK (#124062)
Upgrades google cloud SDK used by repository-gcp to com.google.cloud:google-cloud-storage-bom:2.50.0

Closes: ES-9287
2025-03-26 11:08:14 +11:00
Mark Vieira
65751062f7
Re-enable VerifyVersionConstantsIT (#125605) 2025-03-25 12:16:53 -07:00
David Turner
8d649f2f07
Validate AWS signer region and service in tests (#125559)
Extends the predicate in `AwsCredentialsUtils` to verify that we are
using a proper AWS v4 signature complete with the correct region and
service, rather than just looking for the access key as a substring.
2025-03-26 02:53:21 +11:00
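A minimal sketch of the stricter check described in #125559: instead of looking for the access key as a substring, parse the credential scope of an AWS Signature Version 4 `Authorization` header and compare each component. The class `SigV4ScopeSketch` and the method `isSignedFor` are hypothetical names and do not reflect the predicate in `AwsCredentialsUtils`; the header layout (`Credential=<key>/<date>/<region>/<service>/aws4_request`) is the standard SigV4 format.

```java
/** Hypothetical checker for the credential scope of a SigV4 Authorization header. */
final class SigV4ScopeSketch {
    private static final String SCHEME = "AWS4-HMAC-SHA256 ";
    private static final String CREDENTIAL = "Credential=";

    static boolean isSignedFor(String authorizationHeader, String accessKey, String region, String service) {
        if (authorizationHeader == null || !authorizationHeader.startsWith(SCHEME)) {
            return false; // not a v4 signature at all
        }
        for (String part : authorizationHeader.substring(SCHEME.length()).split(",")) {
            String trimmed = part.trim();
            if (trimmed.startsWith(CREDENTIAL)) {
                // Scope layout: accessKey / date / region / service / aws4_request
                String[] scope = trimmed.substring(CREDENTIAL.length()).split("/");
                return scope.length == 5
                    && scope[0].equals(accessKey)
                    && scope[2].equals(region)
                    && scope[3].equals(service)
                    && scope[4].equals("aws4_request");
            }
        }
        return false; // no Credential component found
    }
}
```

This sketch only inspects the scope; it does not recompute or verify the signature itself.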
Ievgen Degtiarenko
11fed4502c
Improve StatementParserTests error message (#125568) 2025-03-25 14:23:18 +01:00
Niels Bauman
542a3b65a9
Fix data stream retrieval in DataStreamLifecycleServiceIT (#125195)
These tests had the potential to fail when two consecutive GET data
streams requests would hit two different nodes, where one node already
had the cluster state that contained the new backing index and the other
node didn't yet.

Caused by #122852

Fixes #124846
Fixes #124950
Fixes #124999
2025-03-24 17:43:09 +02:00
Armin Braun
50437e79d3
Cleanup missing use of StandardCharsets (#125424)
Random annoyance that I figured I'd just fix globally:
We can do a bit of a cleaner job when doing byte <-> string conversion here and there.
2025-03-21 20:10:15 +01:00
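For the byte <-> string conversions mentioned in #125424, the cleaner form is presumably the `StandardCharsets` constants rather than charset names passed as strings. A small sketch, with `CharsetSketch` as a made-up holder class:

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing charset-constant based conversions. */
final class CharsetSketch {

    /** Encodes without a charset-name lookup and without a checked exception. */
    static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes with an explicit charset instead of the platform default. */
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Compared with `s.getBytes("UTF-8")`, this avoids handling `UnsupportedEncodingException` and cannot be broken by a mistyped charset name.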
Nik Everett
e897a1422f
Aggs: Let terms run in global ords mode no match (#124782)
Allows the `terms` agg to run with global ords if the top level query
matches no docs *but* the agg is configured to collect buckets with 0
docs.
2025-03-21 13:00:25 -04:00
Armin Braun
9c8750bc8c
Stop retaining transport responses past serialization (#125163)
Remove the `OutboundMessage` class that needlessly holds on to the response instances after they are no longer needed. Inlining the logic should save considerable heap under pressure and enable further optimisations.
2025-03-21 13:08:54 +01:00
Mary Gouseti
2c377f9c85
Unify template builders for data stream options, failure store and data stream lifecycle (#125293) 2025-03-21 10:03:27 +02:00
Yang Wang
7a0a399055
[Test] Reconcile TestProjectResolvers (#124988)
This PR updates the different methods in TestProjectResolvers so that
their names are more accurate and their behaviours are closer to what is expected.

For example, in MP-1749, we differentiate between single-project and
single-project-only resolvers. The latter should not support multi-project.
2025-03-21 11:43:05 +11:00
David Turner
f04761c31a
Remove redundant response parameter to onResponseSent() (#125326)
Nobody uses this parameter (except some tests that simply verify the
otherwise-unused plumbing is connected). This commit removes it.

Relates #125163
2025-03-21 04:50:08 +11:00
Gal Lalouche
54240d3854
Refactor IndexFieldCapabilities creation by adding a proper builder object (#125219)
Reduce boilerplate associated with creating `IndexFieldCapabilities`
instances. Since it's a class with a large number of fields, it makes
sense to define a builder object, as that can also help with all the
Boolean and null blindness going on. As with `FieldCapabilitiesBuilder`,
this is only used in tests, to avoid any potential performance hit in
production code.
2025-03-20 11:02:35 +02:00
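The builder approach motivated in #125219 can be illustrated with a simplified stand-in. `FieldCapsSketch` and its fields are hypothetical and much smaller than the real `IndexFieldCapabilities`; the point is only to show how named setters replace a long positional list of Booleans and nulls.

```java
/** Hypothetical value class with a test-oriented builder. */
final class FieldCapsSketch {
    final String name;
    final String type;
    final boolean searchable;
    final boolean aggregatable;
    final boolean dimension;

    private FieldCapsSketch(Builder b) {
        this.name = b.name;
        this.type = b.type;
        this.searchable = b.searchable;
        this.aggregatable = b.aggregatable;
        this.dimension = b.dimension;
    }

    static Builder builder(String name, String type) {
        return new Builder(name, type);
    }

    static final class Builder {
        private final String name;
        private final String type;
        // Defaults chosen for this sketch only.
        private boolean searchable = true;
        private boolean aggregatable = true;
        private boolean dimension = false;

        private Builder(String name, String type) {
            this.name = name;
            this.type = type;
        }

        Builder searchable(boolean value) {
            this.searchable = value;
            return this;
        }

        Builder aggregatable(boolean value) {
            this.aggregatable = value;
            return this;
        }

        Builder dimension(boolean value) {
            this.dimension = value;
            return this;
        }

        FieldCapsSketch build() {
            return new FieldCapsSketch(this);
        }
    }
}
```

A call such as `FieldCapsSketch.builder("host", "keyword").dimension(true).build()` states which flag is being set, whereas a positional constructor call would read as an opaque run of `true, true, true`.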
Jordan Powers
376abfece9
Natively store synthetic source array offsets for numeric fields (#124594)
This patch builds on the work in #122999 and #113757 to natively store
array offsets for numeric fields instead of falling back to ignored source
when `source_keep_mode: arrays`.
2025-03-19 18:44:46 -07:00
Yang Wang
a1b0ed104b
[Test] Allow configuring configDir for the Java test cluster (#125094)
For creating and deleting projects in multi-project tests, we need to
create and delete settings and secrets files on the fly. This PR adds
this feature to the Java test cluster with an option to specify the
config directory.
2025-03-20 11:21:39 +11:00
Pete Gillin
50e689493c
Calculate recent write load in indexing stats (#124652)
This uses the recently-added `ExponentiallyWeightedMovingRate` class
to calculate a write load which favours more recent load and includes
this alongside the existing unweighted all-time write load in
`IndexingStats.Stats`.

As of this change, the new load metric is not used anywhere, although
it can be retrieved with the index stats or node stats APIs.
2025-03-18 21:23:20 +02:00
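To give a feel for what "favours more recent load" in #124652 can mean, here is a generic exponentially decayed accumulator. It is not the formula used by `ExponentiallyWeightedMovingRate`, whose details are not given in the commit message; the class `DecayingSumSketch`, the half-life parameter, and the abstract time units are all assumptions made for this illustration.

```java
/** Hypothetical exponentially decayed sum: old increments count for less over time. */
final class DecayingSumSketch {
    private final double lambda; // decay constant per time unit
    private double weightedSum;
    private long lastTime;

    /** @param halfLife time after which an increment's weight has halved */
    DecayingSumSketch(double halfLife, long startTime) {
        this.lambda = Math.log(2) / halfLife;
        this.lastTime = startTime;
    }

    /** Records an increment (e.g. time spent indexing) observed at the given time. */
    void addIncrement(double increment, long time) {
        decayTo(time);
        weightedSum += increment;
    }

    /** Returns the decayed sum as of the given time. */
    double weightedSumAt(long time) {
        decayTo(time);
        return weightedSum;
    }

    private void decayTo(long time) {
        if (time > lastTime) {
            // Everything recorded so far loses weight exponentially with elapsed time.
            weightedSum *= Math.exp(-lambda * (time - lastTime));
            lastTime = time;
        }
    }
}
```

With a half-life of 10 time units, an increment of 8 recorded at time 0 contributes about 4 at time 10, so a newer increment of the same size outweighs it.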
Albert Zaharovits
fa46b873be
Threadpool merge scheduler (#120869)
This adds a new merge scheduler implementation that uses a (new)
dedicated thread pool to run the merges. This way the number of
concurrent merges is limited to the number of threads in the pool
(i.e. the number of allocated processors to the ES JVM).

It implements dynamic IO throttling (the same target IO rate for all
merges, roughly, with caveats) that's adjusted based on the number
of currently active (queued + running) merges.
Smaller merges are always preferred to larger ones, irrespective of
the index shard that they're coming from.
The implementation also supports the per-shard "max thread count"
and "max merge count" settings, the latter being used today for indexing throttling.
Note that IO throttling, max merge count, and max thread count work similarly,
but not identically, to their siblings in the ConcurrentMergeScheduler.

The per-shard merge statistics are not affected, and the thread-pool statistics should
reflect the merge ones (i.e. the completed thread pool stats reflects the total
number of merges, across shards, per node).
2025-03-18 19:32:49 +02:00
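The "smaller merges are always preferred" rule from #120869 can be sketched with a priority queue ordered by estimated merge size and shared across shards. This is an assumption-laden illustration: `MergeQueueSketch`, `MergeTask`, and `estimatedBytes` are hypothetical names, and IO throttling as well as the per-shard limits are not modelled.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

/** Hypothetical node-wide queue that hands out the smallest pending merge first. */
final class MergeQueueSketch {

    static final class MergeTask {
        final String shard;
        final long estimatedBytes;

        MergeTask(String shard, long estimatedBytes) {
            this.shard = shard;
            this.estimatedBytes = estimatedBytes;
        }
    }

    // Ordered by size only, irrespective of which shard the merge belongs to.
    private final PriorityBlockingQueue<MergeTask> queue =
        new PriorityBlockingQueue<>(11, Comparator.comparingLong((MergeTask t) -> t.estimatedBytes));

    void submit(String shard, long estimatedBytes) {
        queue.add(new MergeTask(shard, estimatedBytes));
    }

    /** Returns the smallest queued merge, or null if nothing is queued. */
    MergeTask pollNext() {
        return queue.poll();
    }
}
```

A pool of worker threads pulling from such a queue would bound the number of concurrent merges by the pool size, which matches the limit described in the commit message.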
Luigi Dell'Aquila
41510fc846
Fix ES|QL query log file suffix in LogType (#125124) 2025-03-18 19:16:16 +02:00
Oleksandr Kolomiiets
033d28e792
Use FallbackSyntheticSourceBlockLoader for shape and geo_shape (#124927) 2025-03-18 08:49:08 -07:00
Luigi Dell'Aquila
f3ed9b3a2d
ES|QL query log (#124094) 2025-03-18 16:31:55 +01:00
Mary Gouseti
ce04da7dea
Refactor data stream lifecycle to use the template paradigm (#124593) 2025-03-18 13:24:06 +02:00
David Turner
a2d98e44a1
Upgrade discovery-ec2 to AWS SDK v2 (#122062) 2025-03-18 19:38:16 +11:00
Rene Groeschke
ae569def9c
[Build] Require reason for usesDefaultDistribution (#124707)
This makes using usesDefaultDistribution in our test setup more explicit by requiring a reason why it's needed.
This is helpful as part of revisiting the need for all those usages in our code base.
2025-03-17 08:25:39 +01:00
Armin Braun
4c1c51e870
Remove remoteAddress field from TransportResponse (#120016)
This field is only used (by security) for requests, having it in responses is redundant.
Also, we have a couple of responses that are singletons/quasi-enums where setting the value
needlessly might introduce some strange contention even though it's a plain store.

This isn't just a cosmetic change. It makes it clear at compile time that each response instance
is exclusively defined by the bytes that it is read from. This makes it easier to reason about the
validity of suggested optimizations like https://github.com/elastic/elasticsearch/pull/120010
2025-03-16 19:54:29 +01:00
Nhat Nguyen
6b6fc8028d
Include failures in partial response (#124929)
This change includes failures when ESQL returns partial results. It also 
carries failures between cluster requests.

Relates #122802
2025-03-16 11:44:06 -07:00
Moritz Mack
36874e8663
Prevent work starvation bug if using scaling EsThreadPoolExecutor with core pool size = 0 (#124732)
When `ExecutorScalingQueue` rejects work to make the worker pool scale up while already being at max pool size (and a new worker consequently cannot be added), available workers might time out at just about the same time as the task is force queued by `ForceQueuePolicy`. This has caused starvation of work as observed for `masterService#updateTask` in #124667, where max pool size 1 is used. This configuration is most likely to expose the bug.

This PR changes `EsExecutors.newScaling` to not use `ExecutorScalingQueue` if max pool size is 1 (and core pool size is 0). A regular `LinkedTransferQueue` works perfectly fine in this case.

If max pool size > 1, a probing approach is used to ensure the worker pool is adequately scaled to at least 1 worker after force queueing work in `ForceQueuePolicy`.

Fixes #124667
Relates to #18613
2025-03-16 17:42:46 +01:00
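The max-pool-size-1 case from #124732 can be approximated with a plain JDK `ThreadPoolExecutor`. This sketch uses only standard `java.util.concurrent` classes and leaves out everything `EsExecutors.newScaling` adds on top; `SingleWorkerPoolSketch` and its methods are hypothetical names. With an unbounded `LinkedTransferQueue`, `offer` always succeeds, and the JDK executor starts a worker whenever a task is queued while no worker is running, so no rejection and force-queue step is involved.

```java
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a scaling pool with core size 0 and max size 1. */
final class SingleWorkerPoolSketch {

    static ThreadPoolExecutor newPool(long keepAliveMillis) {
        // core = 0, max = 1: the single worker is created on demand and may time out when idle.
        return new ThreadPoolExecutor(0, 1, keepAliveMillis, TimeUnit.MILLISECONDS, new LinkedTransferQueue<>());
    }

    /** Submits one task to a fresh pool and waits for its result. */
    static int runOne() {
        ThreadPoolExecutor pool = newPool(50);
        try {
            return pool.submit(() -> 42).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException("task did not complete", e);
        } finally {
            pool.shutdown();
        }
    }
}
```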
Ievgen Degtiarenko
35ecbf6e87
Include node thread name in IT tests logs (#124761) 2025-03-14 10:30:19 +01:00
Mariusz Józala
b427a2bf4e
[Tests] Limit IOUtilTests on Windows (#124716)
Windows does not support read-only directories in which files cannot be
stored, which makes this test irrelevant on that OS.
2025-03-13 21:59:23 +11:00
Mariusz Józala
4ff1aade13
[Tests] Fix copying files for test cluster (#124628)
When a file with `.attach_pid` in its name was present in the distribution
and then deleted, the resulting exception could stop the copying/linking
of files without any sign of an issue. The files were then missing from the
cluster used in the test, sometimes causing tests to fail (depending on
which files hadn't been copied).

When using `Files.walk` it is impossible to catch the IOException and
continue walking through files conditionally. It has been replaced with a
FileVisitor implementation so that the walk can continue if the exception
is caused by files left temporarily by the JVM that are no longer available.
2025-03-12 16:09:55 +01:00
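The `FileVisitor` approach described in #124628 can be sketched with `Files.walkFileTree` and an overridden `visitFileFailed`. The class `TolerantWalkSketch` is hypothetical, and the exact condition used by the real fix is not given in the commit message; here a vanished file whose name contains `.attach_pid` is skipped and any other failure is rethrown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical walker that tolerates temporary JVM files disappearing mid-walk. */
final class TolerantWalkSketch {

    static List<Path> listFiles(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                files.add(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) throws IOException {
                Path name = file.getFileName();
                if (exc instanceof NoSuchFileException && name != null && name.toString().contains(".attach_pid")) {
                    // Assumed condition: the file was removed after being listed, so skip it.
                    return FileVisitResult.CONTINUE;
                }
                throw exc; // any other failure still aborts the walk
            }
        });
        return files;
    }

    /** Small usage example: builds a temporary tree with two files and counts them. */
    static int demo() {
        try {
            Path root = Files.createTempDirectory("walk-sketch");
            Files.createFile(root.resolve("a.txt"));
            Path sub = Files.createDirectory(root.resolve("sub"));
            Files.createFile(sub.resolve("b.txt"));
            return listFiles(root).size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Unlike a `Files.walk` stream, the visitor gets a per-file callback on failure, which is what allows the walk to continue conditionally.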
Nik Everett
50aaa1c2a6
ESQL: Pragma to load from stored fields (#122891)
This creates a `pragma` you can use to request that fields load from a
stored field rather than doc values. It implements that pragma for
`keyword` and number fields.

We expect that, for some disk configuration and some number of fields,
it's faster to load those fields from _source or stored fields than
it is to use doc values. Our default is doc values and on my laptop it's
*always* faster to use doc values. But we don't ship my laptop to every
cluster.

This will let us experiment and debug slow queries by trying to load
fields a different way.

You access this pragma with:
```
curl -HContent-Type:application/json -XPOST localhost:9200/_query?pretty -d '{
    "query": "FROM foo",
    "pragma": {
        "field_extract_preference": "STORED"
    }
}'
```

On a release build you'll need to add `"accept_pragma_risks": true`.
2025-03-12 09:40:42 -04:00
Yang Wang
207c2df14c
[Test] Move helper methods for multi-project rest test (#124285)
This PR moves the helper methods up to the base ESRestTestCase class so
that they can be reused by other subclasses, e.g. the ones on the
serverless side.

Relates: ES-10292
2025-03-11 12:53:10 +11:00
Luca Cavanna
def4c890bc
Fix concurrency issue in ScriptSortBuilder (#123757)
Inter-segment concurrency is disabled whenever sort by field, including script sorting, is used in a search request.

The reason why sort by field does not use concurrency is that there are some performance implications, given that the hit queue in Lucene is built per slice and the different search threads don't share information about the documents they have already visited etc.

The reason why script sort has concurrency disabled is that the script sorting implementation is not thread safe. This commit addresses this concurrency issue and re-enables search concurrency for search requests that use script sorting. In addition, missing tests are added to cover sort scripts that rely on _score being available and top_hits aggregation with a scripted sort clause.
2025-03-10 21:10:53 +01:00