8826 commits

Author SHA1 Message Date
Mary Gouseti
488951edf3
Data stream lifecycle does not record error in failure store rollover (#126229)
**Issue** The data stream lifecycle does not correctly register rollover
errors for the failure store.

**Observed behaviour** When data stream lifecycle encounters a rollover
error it records it unless it sees that the current write index of this
data stream doesn't match the source index of the request. However, the
write index check does not use the failure write index but the write
backing index, so the failure gets ignored.

**Desired behaviour** When data stream lifecycle encounters a rollover
error it will check the relevant write index before it determines if it
should be recorded or not.
2025-04-04 03:44:09 +11:00
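The desired behaviour in #126229 comes down to comparing the rollover's source index against the write index of the component that was actually rolled over. A minimal sketch of that decision follows; `DataStreamView`, `RolloverErrorSketch`, `shouldRecordRolloverError`, and the index names are hypothetical and are not Elasticsearch's actual classes or naming.

```java
// Hypothetical, simplified model; NOT Elasticsearch's real API.
record DataStreamView(String backingWriteIndex, String failureWriteIndex) {
    // Pick the write index of the component the rollover targeted.
    String writeIndexFor(boolean targetsFailureStore) {
        return targetsFailureStore ? failureWriteIndex : backingWriteIndex;
    }
}

final class RolloverErrorSketch {
    // Record the error only if the source index of the failed rollover
    // is still the write index of the relevant component.
    static boolean shouldRecordRolloverError(DataStreamView ds, String sourceIndex, boolean targetsFailureStore) {
        return sourceIndex.equals(ds.writeIndexFor(targetsFailureStore));
    }

    public static void main(String[] args) {
        DataStreamView ds = new DataStreamView(".ds-logs-000002", ".fs-logs-000002");
        // Compared against the failure write index: the error is recorded.
        System.out.println(shouldRecordRolloverError(ds, ".fs-logs-000002", true));
        // Compared against the backing write index (the old behaviour): the error is dropped.
        System.out.println(shouldRecordRolloverError(ds, ".fs-logs-000002", false));
    }
}
```

The second call mirrors the observed bug: a failure store rollover error never matches the backing write index, so it is silently ignored.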
David Turner
69f9914403
Migrate tests away from S3 SDK MD5DigestCalculatingInputStream (#126099)
`S3BlobContainerRetriesTests` uses `MD5DigestCalculatingInputStream`
from the AWS v1 SDK to compute an MD5 checksum, but this feature is not
available in the v2 SDK. With this commit we remove this dependency and
compute the MD5 checksums directly instead.
2025-04-03 14:11:00 +01:00
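Computing an MD5 checksum directly, as #126099 describes, needs nothing beyond the JDK. The sketch below is one way to do it and is not taken from the test itself; the class name `Md5Sketch` and the choice of hex output are assumptions.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper; the real test may structure this differently.
final class Md5Sketch {
    // Returns the MD5 digest of the given bytes as a lowercase hex string.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("MD5");
            return HexFormat.of().formatHex(digest.digest(data));
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5, so this should not happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

`HexFormat` requires Java 17 or later; for streamed content, `java.security.DigestInputStream` can wrap the input instead of buffering the whole blob.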
Mary Gouseti
95257bbf07
Make data stream options multi-project aware (#126141) 2025-04-03 14:33:40 +03:00
Mary Gouseti
25050495b9
Data stream options convert to javaRestTests to yamlRestTests. (#126037)
In this PR we introduce the data stream API in the `es-rest-api` using
the feature flag feature. This enables us to use `yamlRestTests`
instead of `javaRestTests`.
2025-04-03 01:32:54 +11:00
David Turner
15899afd26
Remove testWriteBlobWithExceptionThrownAtClosingTime (#126096)
Reverts the test added in #123505 - this is not behaviour on which we
rely any more, and it does not apply with SDKv2 anyway.
2025-04-02 09:43:04 +01:00
Nick Tindall
58c8f4abae
Upgrade to latest GCS SDK (#126087)
Upgrades google cloud SDK used by repository-gcs to com.google.cloud:google-cloud-storage-bom:2.50.0

Closes: ES-9287
2025-04-02 15:41:50 +11:00
Nick Tindall
28dd8e1bae
Make GCS HttpHandler more compliant (#126007)
- Fixed bug where 416 was being erroneously returned for zero-length blobs even with no Range header
- Fixed bug where partial upload wouldn't be completed if the last PUT included no data
- Return 206 (partial content) status when a Range header is specified
- Return an ETag on object get (BlobReadChannel uses this to ensure we fail when the blob is updated between successive chunks being fetched)
- The 416 on zero-length blobs was one of(?) the causes of #125668
2025-04-02 13:05:23 +11:00
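The status-code rules listed in #126007 can be summarised as a small decision function. This is a sketch under assumptions, not the handler's actual code: `RangeStatusSketch` and `statusFor` are hypothetical names, and treating a range that starts at or beyond the blob length as unsatisfiable follows general HTTP range semantics rather than anything stated in the commit.

```java
import java.util.OptionalLong;

// Hypothetical helper; NOT the code of the real GCS test fixture.
final class RangeStatusSketch {
    // rangeStart is empty when the request carries no Range header.
    static int statusFor(long blobLength, OptionalLong rangeStart) {
        if (rangeStart.isEmpty()) {
            return 200; // no Range header: full content, even for zero-length blobs
        }
        if (rangeStart.getAsLong() >= blobLength) {
            return 416; // assumed: range starts past the end, so it cannot be satisfied
        }
        return 206; // Range header present and satisfiable: partial content
    }
}
```

Under these rules a zero-length blob fetched without a Range header yields 200, which is the case the first bullet of the commit fixes.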
David Turner
0d64aab4cc
Clean up request parsing in S3HttpHandler (#126034)
The `METHOD /path/components?and=query` string representation of a
request is becoming increasingly difficult to parse, with slight
variations in parsing between the implementation in `S3HttpHandler` and
the various other implementations. This commit gets rid of the
string-concatenate-and-split behaviour in favour of a proper object that
has predicates for testing all the different kinds of request that might
be made against S3.
2025-04-02 05:49:50 +11:00
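To illustrate the idea in #126034 of replacing string matching with an object that exposes predicates, here is a minimal sketch. `ParsedRequest` and its method names are hypothetical and do not reflect the actual type in `S3HttpHandler`; the query parameters used (`uploads`, `uploadId`, `partNumber`) are the ones the S3 multipart upload API defines. Query values are not URL-decoded here, which a real implementation would need to do.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical request model; NOT the class introduced by the commit.
record ParsedRequest(String method, String path, Map<String, String> query) {

    // rawQuery may be null when the request has no query string.
    static ParsedRequest parse(String method, String path, String rawQuery) {
        Map<String, String> query = new HashMap<>();
        if (rawQuery != null && !rawQuery.isEmpty()) {
            for (String pair : rawQuery.split("&")) {
                String[] kv = pair.split("=", 2);
                // A bare key such as "uploads" maps to the empty string.
                query.put(kv[0], kv.length == 2 ? kv[1] : "");
            }
        }
        return new ParsedRequest(method, path, Map.copyOf(query));
    }

    boolean isInitiateMultipartUpload() {
        return "POST".equals(method) && query.containsKey("uploads");
    }

    boolean isUploadPart() {
        return "PUT".equals(method) && query.containsKey("uploadId") && query.containsKey("partNumber");
    }

    boolean isCompleteMultipartUpload() {
        return "POST".equals(method) && query.containsKey("uploadId") && !query.containsKey("uploads");
    }
}
```

Parsing happens once, and every caller asks the same predicates, so the handler implementations cannot drift apart in how they interpret a request.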
Jack Conradson
24e4887748
Remove extraneous Painless code (#126057)
This removes some remnants left over from using StringBuilder
as part of String concatenation. Since we no longer support JDK 8,
this code can be safely removed.
2025-04-01 11:41:54 -07:00
Keith Massey
7a9edb5d95
Adding a cleanup method to EnterpriseGeoIpDownloaderIT (#125958) 2025-03-31 14:28:14 -05:00
Sam Xiao
bddc14c232
Add multi-project support for health indicator shards_availability (#125512) 2025-03-31 11:12:52 -04:00
Keith Massey
939dc8bb8e
Re-enabling EnterpriseGeoIpDownloaderIT with verbose logging (#125884) 2025-03-31 09:47:53 -05:00
Niels Bauman
a8f5db2604
Make data stream lifecycle project-aware (#125476)
Now that all actions that DLM depends on are project-aware, we can make DLM itself project-aware.
There still exists only one instance of `DataStreamLifecycleService`, it just loops over all the projects - which matches the approach we've taken for similar scenarios thus far.
2025-03-31 14:52:43 +01:00
David Turner
6048d26990
Rename IgnoreNoResponseMetricsCollector (#125934)
Originally this metrics collector was just there to ignore API calls
that didn't make it all the way to S3, but (a) it doesn't really do that
because it also apparently ignores 4xx responses and (b) it also does a
bunch of other metrics collection too. `IgnoreNoResponseMetricsCollector`
is definitely the wrong name these days so this commit renames it to
something more general.
2025-03-31 14:32:38 +01:00
Jordan Powers
71e74bdd66
Store arrays offsets for scaled float fields natively with synthetic source (#125793)
This patch builds on the work in #113757, #122999, #124594, #125529, and 
#125709 to natively store array offsets for scaled float fields instead of
falling back to ignored source when synthetic_source_keep: arrays.
2025-03-28 20:26:29 +01:00
Nick Tindall
a25677371a
Revert "Upgrade to latest GCS SDK (#124062)" (#125748)
This reverts commit 073ca0e888.
2025-03-28 17:49:30 +11:00
Joe Gallo
d12a662a98
Use a custom cache record for EnterpriseResponse (#125809) 2025-03-27 17:15:52 -04:00
Joe Gallo
37f6ebe560
Use a custom cache record for CityResponse (#125806) 2025-03-27 16:07:40 -04:00
Carlos Delgado
968bddc462
Non existing synonyms sets do not fail shard recovery (#125659) 2025-03-27 18:04:20 +02:00
Pete Gillin
66432fb886
ES-10037 Track the peak indexing load for each shard (#125521)
This tracks the highest value seen for the recent write load metric
any time the stats for a shard are computed, exposes this value
alongside the recent value, and persists it in index metadata
alongside it too.

The new test in `IndexShardTests` is designed to more thoroughly test
the recent write load metric previously added, as well as to test the
peak metric being added here.

ES-10037 #comment Added peak load metric in https://github.com/elastic/elasticsearch/pull/125521
2025-03-27 12:03:39 +02:00
David Turner
36c14bf3a5
Validate region/service in DynamicAwsCredentials (#125671)
Following on from #125559, we can validate the region and service name
in tests that use `DynamicAwsCredentials` too.
2025-03-27 06:14:40 +00:00
Mark Vieira
5cfe3cba9d
Convert ingest-geoip file based update tests to new testing framework (#125632) 2025-03-26 10:42:45 -07:00
Joe Gallo
8857ebf95e
Refactor geoip cache for MaxMind databases (#125527) 2025-03-26 12:52:57 -04:00
Mark Vieira
d72d81a0eb
Convert remaining module projects to new test clusters framework (#125613) 2025-03-26 08:42:55 -07:00
Mary Gouseti
6503c1b94b
[Failure Store] Conceptually introduce the failure store lifecycle (#125258)
* Specify index component when retrieving lifecycle

* Add getters for the failure lifecycle

* Conceptually introduce the failure store lifecycle (even though for now it's the same)
2025-03-26 13:21:48 +02:00
Niels Bauman
8b691db436
Fix data stream retrieval in ExplainDataStreamLifecycleIT (#125611)
These tests had the potential to fail when two consecutive GET data
streams requests would hit two different nodes, where one node already
had the cluster state that contained the new backing index and the other
node didn't yet.

Caused by #122852

Fixes #124882
Fixes #124885
2025-03-26 10:33:33 +00:00
Nick Tindall
073ca0e888
Upgrade to latest GCS SDK (#124062)
Upgrades google cloud SDK used by repository-gcs to com.google.cloud:google-cloud-storage-bom:2.50.0

Closes: ES-9287
2025-03-26 11:08:14 +11:00
David Turner
8d649f2f07
Validate AWS signer region and service in tests (#125559)
Extends the predicate in `AwsCredentialsUtils` to verify that we are
using a proper AWS v4 signature complete with the correct region and
service, rather than just looking for the access key as a substring.
2025-03-26 02:53:21 +11:00
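The stricter check described in #125559 can be pictured as parsing the credential scope out of a Signature Version 4 `Authorization` header instead of searching for the access key anywhere in the string. The sketch below is an assumption about how such a predicate might look, not the code in `AwsCredentialsUtils`; `SigV4ScopeSketch` and `isValidAuthorization` are hypothetical names. It does not verify the signature itself.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical predicate; NOT the implementation in AwsCredentialsUtils.
final class SigV4ScopeSketch {
    // SigV4 credential scope: <access key>/<yyyyMMdd>/<region>/<service>/aws4_request
    private static final Pattern CREDENTIAL = Pattern.compile(
        "^AWS4-HMAC-SHA256 Credential=([^/,]+)/(\\d{8})/([^/,]+)/([^/,]+)/aws4_request,"
    );

    static boolean isValidAuthorization(String header, String accessKey, String region, String service) {
        if (header == null) {
            return false;
        }
        Matcher m = CREDENTIAL.matcher(header);
        return m.find()
            && m.group(1).equals(accessKey)
            && m.group(3).equals(region)
            && m.group(4).equals(service);
    }
}
```

A header that merely contains the access key as a substring, or that is signed for a different region or service, fails this check.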
Niels Bauman
542a3b65a9
Fix data stream retrieval in DataStreamLifecycleServiceIT (#125195)
These tests had the potential to fail when two consecutive GET data
streams requests would hit two different nodes, where one node already
had the cluster state that contained the new backing index and the other
node didn't yet.

Caused by #122852

Fixes #124846
Fixes #124950
Fixes #124999
2025-03-24 17:43:09 +02:00
Niels Bauman
f7d7ce7ccc
Run TransportGetDataStreamOptionsAction on local node (#125213)
This action solely needs the cluster state, it can run on any node.
Additionally, it needs to be cancellable to avoid doing unnecessary work
after a client failure or timeout.

Relates #101805
2025-03-22 16:18:28 +02:00
Niels Bauman
bbc47d9cad
Run TransportGetDataStreamLifecycleAction on local node (#125214)
This action solely needs the cluster state, it can run on any node.
Additionally, it needs to be cancellable to avoid doing unnecessary work
after a client failure or timeout.

Relates #101805
2025-03-22 13:00:47 +02:00
Armin Braun
50437e79d3
Cleanup missing use of StandardCharsets (#125424)
Random annoyance that I figured I'd just fix globally:
We can do a bit of a cleaner job when doing byte <-> string conversion here and there.
2025-03-21 20:10:15 +01:00
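As an illustration of the kind of cleanup #125424 refers to (the commit does not list its individual changes, so this is only an assumed example), byte and string conversion with `StandardCharsets` looks like this; `CharsetSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical example of the pattern; not code from the commit.
final class CharsetSketch {
    // Using the Charset constant avoids the checked UnsupportedEncodingException
    // thrown by the overloads that take a charset name such as "UTF-8",
    // and avoids depending on the platform default charset.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```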
David Turner
4ce1d9ce21
Cosmetic fixes to repository-s3 (#125397)
Relates to the AWS SDK v2 upgrade; this commit just pulls out some bits that
can go in first.
2025-03-21 13:31:48 +00:00
Mary Gouseti
2c377f9c85
Unify template builders for data stream options, failure store and data stream lifecycle (#125293) 2025-03-21 10:03:27 +02:00
Yang Wang
7a0a399055
[Test] Reconcile TestProjectResolvers (#124988)
This PR updates the different methods in TestProjectResolvers so that
their names are more accurate and their behaviours are closer to what is expected.

For example, in MP-1749, we differentiate between single-project and
single-project only resolvers. The latter should not support multi-project.
2025-03-21 11:43:05 +11:00
Joe Gallo
e210ea87d6
Add an ignoreMissing parameter to IngestDocument's removeField method (#125232) 2025-03-19 16:55:13 -04:00
Niels Bauman
8e64f50d66
Make DLM stats and DLM error store project-aware (#124810)
This is part of the work to make DLM project-aware.

These two features were pretty tightly coupled, so I saved some effort
by combining them in one PR.
2025-03-19 12:39:28 +02:00
Mikhail Berezovskiy
d9e751602d
gcp retry-test setup fix (#125181) 2025-03-18 23:23:50 -07:00
Joe Gallo
0efa6f89f6
Add some utility functions for handling Maxmind geoip results (#125153) 2025-03-18 17:05:45 -04:00
Pete Gillin
50e689493c
Calculate recent write load in indexing stats (#124652)
This uses the recently-added `ExponentiallyWeightedMovingRate` class
to calculate a write load which favours more recent load and include
this alongside the existing unweighted all-time write load in
`IndexingStats.Stats`.

As of this change, the new load metric is not used anywhere, although
it can be retrieved with the index stats or node stats APIs.
2025-03-18 21:23:20 +02:00
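To give a feel for a write load "which favours more recent load" as in #124652, here is a simplified sketch of an exponentially weighted rate. It is not the `ExponentiallyWeightedMovingRate` class from Elasticsearch, whose details may differ; `WeightedRateSketch`, its methods, and the half-life parameterisation are assumptions. Each increment is discounted by `exp(-lambda * age)`, and the sum is divided by the equally discounted length of the observation window. Timestamps are assumed to be non-decreasing.

```java
// Hypothetical, simplified sketch; NOT Elasticsearch's ExponentiallyWeightedMovingRate.
final class WeightedRateSketch {
    private final double lambda;   // decay constant derived from the half-life
    private final long startTime;  // start of the observation window
    private double weightedSum;    // discounted sum of increments, as of lastTime
    private long lastTime;

    WeightedRateSketch(double halfLife, long startTime) {
        this.lambda = Math.log(2) / halfLife;
        this.startTime = startTime;
        this.lastTime = startTime;
    }

    // Decay what we have so far to the new timestamp, then add the increment at full weight.
    void addIncrement(double increment, long time) {
        weightedSum = weightedSum * Math.exp(-lambda * (time - lastTime)) + increment;
        lastTime = time;
    }

    // Discounted sum divided by the discounted window length (the integral of the weights).
    double getRate(long time) {
        if (time <= startTime) {
            return 0.0;
        }
        double decayedSum = weightedSum * Math.exp(-lambda * (time - lastTime));
        double weightedDuration = (1.0 - Math.exp(-lambda * (time - startTime))) / lambda;
        return decayedSum / weightedDuration;
    }
}
```

With this weighting, the same amount of work counts for more when it happened recently than when it happened long ago, which is the contrast with the unweighted all-time write load mentioned in the commit.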
Lorenzo Dematté
a4d7297944
Permanently switch from SecurityManager to Entitlements (#124865) (#125117)
The JDK team has completely disabled the Java SecurityManager from Java 24. Elasticsearch has always used the Java SecurityManager as an additional protection mechanism; in order to retain this second line of defense, the Elasticsearch Core/Infra team has been working on the Entitlements project.

Similar to SecurityManager, Entitlements only allow calling specific methods in the JDK when the caller has a matching policy attached. In other words, if some code (in the main Elasticsearch codebase, in a plugin/module, or in a script) attempts to perform a "privileged" operation and it is not entitled to do so, a NotEntitledException will be thrown.

This PR includes the minimal set of changes to always use Entitlements, regardless of system properties or Java version.

Relates to ES-10921
2025-03-18 18:38:45 +02:00
Oleksandr Kolomiiets
033d28e792
Use FallbackSyntheticSourceBlockLoader for shape and geo_shape (#124927) 2025-03-18 08:49:08 -07:00
Patrick Doyle
fd51f44e32
Silence known entitlement warnings (#124883) 2025-03-18 16:52:12 +02:00
Mary Gouseti
ce04da7dea
Refactor data stream lifecycle to use the template paradigm (#124593) 2025-03-18 13:24:06 +02:00
Jack Conradson
053938a3d4
Add manage_threads entitlement for reactor.core (#125037)
This adds the manage_threads entitlement for reactor.core as part of the 
azure-repository module. It looks like this is a requirement for offloading 
azure blob store work.
2025-03-17 10:15:10 -07:00
Ignacio Vera
aba54e8af8
Don't generate stacktrace in TaskCancelledException (#125002) 2025-03-17 15:59:08 +01:00
Rene Groeschke
ae569def9c
[Build] Require reason for usesDefaultDistribution (#124707)
This makes using usesDefaultDistribution in our test setup more explicit by requiring a reason why it's needed.
This is helpful as part of revisiting the need for all those usages in our code base.
2025-03-17 08:25:39 +01:00
Armin Braun
4c1c51e870
Remove remoteAddress field from TransportResponse (#120016)
This field is only used (by security) for requests, having it in responses is redundant.
Also, we have a couple of responses that are singletons/quasi-enums where setting the value
needlessly might introduce some strange contention even though it's a plain store.

This isn't just a cosmetic change. It makes it clear at compile time that each response instance
is exclusively defined by the bytes that it is read from. This makes it easier to reason about the
validity of suggested optimizations like https://github.com/elastic/elasticsearch/pull/120010
2025-03-16 19:54:29 +01:00
Ryan Ernst
3c129e7fce
Re-enable analysis stemmer test (#124961)
This test was disabled until exclusive entitlements were added.

closes #119130
2025-03-17 02:57:36 +11:00
John Verwolf
cb3c35783b
Bug Fix: System Data Streams Should Be Restorable (#124651)
This PR adds a new MetadataDeleteDataStreamService that allows us to delete system data streams prior to a restore operation. This fixes a bug where system data streams were previously un-restorable.
2025-03-14 08:00:44 -07:00