elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-28 17:34:17 -04:00

Author	SHA1	Message	Date
Ryan Ernst	05d18a2981	Remove SecurityManager code from ingest attachment (#127291 ) Now that SecurityManager is gone, there is no longer a need for a specialized access control context for interacting with tika.	2025-04-24 06:22:10 -07:00
David Turner	b028c0af56	Upgrade `repository-s3` to AWS SDK v2 (#126843 ) Closes #120993	2025-04-24 21:21:03 +10:00
Keith Massey	ee2d2f313d	Adding settings to data streams (#126947 )	2025-04-23 13:27:40 -05:00
David Turner	85a87e71d6	Add end-to-end bulk splitting test (#127237 ) Today we do not have a test that verifies the Netty HTTP pipeline interacts properly with the incremental bulk handling service and splits requests when the watermark is hit. This commit adds such a test.	2025-04-24 01:42:38 +10:00
Oleksandr Kolomiiets	5e2b199b94	[TEST] Move test data generation out of logsdb namespace (#119994 )	2025-04-23 08:29:32 -07:00
Mary Gouseti	db2992f0f8	[Failure Store] Expose failure store lifecycle information via the `GET` data stream API (#126668 ) To retrieve the effective configuration you need to use the `GET` data streams API. For example, if a data stream has empty data stream options, it might still have failure store enabled from a cluster setting. The failure store is managed by default with a lifecycle with infinite (for now) retention, so the response will look like this: ``` GET _data_stream/* { "data_streams": [ { "name": "my-data-stream", "timestamp_field": { "name": "@timestamp" }, ..... "failure_store": { "enabled": true, "lifecycle": { "enabled": true }, "rollover_on_write": false, "indices": [ { "index_name": ".fs-my-data-stream-2099.03.08-000003", "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw", "managed_by": "Data stream lifecycle" } ] } },... ] ``` In case there is a failure index managed by ILM, the failure index info will be displayed as follows. ``` { "index_name": ".fs-my-data-stream-2099.03.08-000002", "index_uuid": "PA_JquKGSiKcAKBA8DJ5gw", "prefer_ilm": true, "ilm_policy": "my-lifecycle-policy", "managed_by": "Index Lifecycle Management" } ```	2025-04-23 23:44:46 +10:00
Niels Bauman	4207cee3eb	Rename data stream transport actions (#127222 ) The new action names are more consistent with the rest of the codebase.	2025-04-23 12:40:38 +02:00
Mary Gouseti	b9917086e1	Create dedicated factory methods for data lifecycle (#126487 ) The class `DataStreamLifecycle` is currently capturing the lifecycle configuration that currently manages all data stream indices, but soon enough it will be split into two variants, the data and the failures lifecycle. Some pre-work has been done already but as we are progressing in our POC, we see that it will be really useful if the `DataStreamLifecycle` is "aware" of the target index component. This will allow us to correctly apply global retention or to throw an error if a downsampling configuration is provided to a failure lifecycle. In this PR, we perform a small refactoring to reduce the noise in https://github.com/elastic/elasticsearch/pull/125658. Here we introduce the following: - A factory method that creates a data lifecycle, for now it's trivial but it will be more useful soon. - We rename the "empty" builder to explicitly mention the index component it refers to.	2025-04-23 20:00:25 +10:00
David Turner	21813604b4	Skip listing MPUs if TTL set to -1 (#127166 ) Recent versions of MinIO will sometimes leak multi-part uploads under concurrent load, leaving them in the `ListMultipartUploads` output even though they cannot be aborted. Today this causes repository analysis to fail since compare-and-exchange operations will not even start if there are any pre-existing uploads. This commit makes it possible to skip this pre-flight check (and accept the performance consequences) by adjusting the relevant settings. Workaround for minio/minio#21189 Closes #122670	2025-04-23 06:33:40 +01:00
Armin Braun	cd609533bf	Fix duplicate strings in SearchHit serialization (#127180 ) The map key is always the field name. We exploited this fact in the get results but not in search hits, leading to a lot of duplicate strings in many heap dumps. We could do much better here since the names generally come out of a known limited set of names, but as a first step let's at least align the get- and search-responses and save a non-trivial amount of bytes in a number of use-cases. Plus, having a single string instance is faster on lookup etc. and also saves on CPU.	2025-04-22 22:43:27 +02:00
David Turner	f9d813a443	Improve `Netty4IncrementalRequestHandlingIT` (#127111 ) * Verifies that each call to `Netty4HttpRequestBodyStream#next` yields exactly one chunk (or the stream is closed) since the `IncrementalBulkService` relies on this property. * Replaces several busy-waits with ones that block on a future for faster test execution. * Replaces several hard-coded constants with randomized values to clarify that the precise value does not matter to the test. * Reduces the use of unnecessary abbreviations in names. * Reduces the use of global static state in favour of node-local components.	2025-04-22 21:37:32 +10:00
James Baiera	7b89f4d4a6	Add ability to redirect ingestion failures on data streams to a failure store (#126973 ) Removes the feature flags and guards that prevent the new failure store functionality from operating in production runtimes.	2025-04-18 16:33:03 -04:00
James Baiera	d928d1a418	Add node feature for failure store, refactor capability names (#126885 ) Adds a node feature that is conditionally added to the cluster state if the failure store feature flag is enabled. Requires all nodes in the cluster to have the node feature present in order to redirect failed documents to the failure store from the ingest node or from shard level bulk failures.	2025-04-18 13:42:48 -04:00
Armin Braun	f461f90d48	Remove redundant marker interfaces that extend Bucket (#127038 ) No need to have these marker interfaces around when we're not using them anywhere; all they actually do is hide a lot of code duplication. Removing them sets up the possible removal of hundreds of lines of downstream code, it seems.	2025-04-18 18:26:39 +02:00
Joe Gallo	b46bee4e47	Correctly handle non-integers in nested paths in the remove processor (#127006 )	2025-04-18 11:46:54 -04:00
Lorenzo Dematté	69f6520b0c	[Entitlements] Validation checks on paths (#126852 ) With this PR we restrict the paths we allow access to, forbidding plugins to specify/request entitlements for reading or writing to specific protected directories. I added this validation to EntitlementInitialization, as I wanted to fail fast and this is the earliest occurrence where we have all we need: PathLookup to resolve relative paths, policies (for plugins, server, agents) and the Paths for the specific directories we want to protect. Relates to ES-10918	2025-04-18 15:36:07 +02:00
elasticsearchmachine	36af046441	Merge patch/serverless-fix into main	2025-04-18 04:30:44 +00:00
Brian Seeders	2a243d8492	Revert #126441 Add flow-control and remove auto-read in netty4 HTTP pipeline (#127030 ) * Revert "Release buffers in netty test (#126744)" This reverts commit `f9f3defe92`. * Revert "Add flow-control and remove auto-read in netty4 HTTP pipeline (#126441)" This reverts commit `c8805b85d2`.	2025-04-17 12:37:26 -07:00
Nick Tindall	d378185054	Fix GCS tests broken by idempotency token (#126972 )	2025-04-17 04:42:32 +02:00
Nick Tindall	17c6e10846	GCS: Use idempotency token to identify requests (#126887 )	2025-04-16 15:56:47 +10:00
Mikhail Berezovskiy	5a7a425bd0	Refactor GCS fixture multipart parser (#125828 )	2025-04-15 10:09:53 -07:00
David Turner	aa40147142	Add integ tests for `ftp://` URL repository (#126757 ) We document support for snapshot repositories using `ftp://` URLs but it seems this functionality has not worked for many years because of security-manager restrictions, although nobody noticed because it was not covered by any tests. The migration to the Entitlements framework means that this functionality now works again, so this commit adds tests to make sure we do not break it again in future.	2025-04-15 12:57:00 +01:00
Ryan Ernst	83ce15ae06	Make TransportRequest an interface (#126733 ) In order to support a future TransportRequest variant that accepts the response type, TransportRequest needs to be an interface. This commit adds AbstractTransportRequest as a concrete implementation and makes TransportRequest a simple interface that joins together the parent interfaces from TransportMessage. Note that this was done entirely in IntelliJ using structural find and replace.	2025-04-14 14:22:28 -07:00
Mikhail Berezovskiy	f9f3defe92	Release buffers in netty test (#126744 )	2025-04-14 13:09:12 -07:00
Brendan Cully	d02b65308e	S3BlobContainer: Revert broadened exception handler (#126731 ) Catching Exception instead of AmazonClientException in copyBlob and executeMultipart led to failures in S3RepositoryAnalysisRestIT due to the injected exceptions getting wrapped in IOExceptions that prevented them from being caught and handled in BlobAnalyzeAction. Closes #126576	2025-04-14 19:20:11 +02:00
Ignacio Vera	ffdfcec334	Upgrade to Lucene 10.2.0 (#126594 ) This commit upgrades Elasticsearch to Lucene 10.2.0.	2025-04-14 13:50:52 +02:00
Mikhail Berezovskiy	c8805b85d2	Add flow-control and remove auto-read in netty4 HTTP pipeline (#126441 )	2025-04-11 14:54:22 -07:00
Jack Conradson	c1ecafad6a	Fix painless return type cast for list shortcut (#126724 ) This fixes an issue where if a Painless getter method return type didn't match a Java getter method return type we add a cast. Currently this is adding an extraneous cast. Closes: #70682	2025-04-11 13:50:19 -07:00
Martijn van Groningen	6012590929	Improve resiliency of UpdateTimeSeriesRangeService (#126637 ) If updating the `index.time_series.end_time` fails for one data stream, then UpdateTimeSeriesRangeService should continue updating this setting for other data streams. The following error was observed in the wild: ``` [2025-04-07T08:50:39,698][WARN ][o.e.d.UpdateTimeSeriesRangeService] [node-01] failed to update tsdb data stream end times java.lang.IllegalArgumentException: [index.time_series.end_time] requires [index.mode=time_series] at org.elasticsearch.index.IndexSettings$1.validate(IndexSettings.java:636) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.index.IndexSettings$1.validate(IndexSettings.java:619) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.common.settings.Setting.get(Setting.java:563) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.common.settings.Setting.get(Setting.java:535) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.datastreams.UpdateTimeSeriesRangeService.updateTimeSeriesTemporalRange(UpdateTimeSeriesRangeService.java:111) ~[?:?] at org.elasticsearch.datastreams.UpdateTimeSeriesRangeService$UpdateTimeSeriesExecutor.execute(UpdateTimeSeriesRangeService.java:210) ~[?:?] at org.elasticsearch.cluster.service.MasterService.innerExecuteTasks(MasterService.java:1075) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:1038) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.cluster.service.MasterService.executeAndPublishBatch(MasterService.java:245) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.lambda$run$2(MasterService.java:1691) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.action.ActionListener.run(ActionListener.java:452) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.run(MasterService.java:1688) ~[elasticsearch-8.17.3.jar:?] 
at org.elasticsearch.cluster.service.MasterService$5.lambda$doRun$0(MasterService.java:1283) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.action.ActionListener.run(ActionListener.java:452) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.cluster.service.MasterService$5.doRun(MasterService.java:1262) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023) ~[elasticsearch-8.17.3.jar:?] at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27) ~[elasticsearch-8.17.3.jar:?] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?] at java.lang.Thread.run(Thread.java:1575) ~[?:?] ``` As a result, the `index.time_series.end_time` index setting was not updated for any data stream. This then caused data loss, as metrics couldn't be indexed because no suitable backing index could be resolved: ``` the document timestamp [2025-03-26T15:26:10.000Z] is outside of ranges of currently writable indices [[2025-01-31T07:22:43.000Z,2025-02-15T07:24:06.000Z][2025-02-15T07:24:06.000Z,2025-03-02T07:34:07.000Z][2025-03-02T07:34:07.000Z,2025-03-10T12:45:37.000Z][2025-03-10T12:45:37.000Z,2025-03-10T14:30:37.000Z][2025-03-10T14:30:37.000Z,2025-03-25T12:50:40.000Z][2025-03-25T12:50:40.000Z,2025-03-25T14:35:40.000Z ```	2025-04-11 12:58:10 +02:00
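The resiliency idea in 6012590929 (one failing data stream should not block updates for the others) can be pictured with a minimal sketch. This is not the `UpdateTimeSeriesRangeService` code: the class `IsolatedUpdateSketch`, the method `updateAll`, and the string-valued "end times" are all hypothetical, chosen only to show per-item failure isolation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; names and types are assumptions, not Elasticsearch APIs.
class IsolatedUpdateSketch {

    // Applies `update` to every entry. A failure for one entry is recorded in `failed`
    // and the old value is kept, so the remaining entries are still updated.
    static Map<String, String> updateAll(Map<String, String> endTimes,
                                         UnaryOperator<String> update,
                                         List<String> failed) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : endTimes.entrySet()) {
            try {
                result.put(entry.getKey(), update.apply(entry.getValue()));
            } catch (RuntimeException e) {
                failed.add(entry.getKey());
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }
}
```

The try/catch sits inside the loop rather than around it, which is the whole point: an exception thrown for one entry no longer aborts the batch.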
Armin Braun	dd1db5031e	Move calls to FeatureFlag.enabled to class-load time (#125885 ) I noticed that we tend to create the flag instance and call this method everywhere. This doesn't compile the same way as a real boolean constant unless you're running with `-XX:+TrustFinalNonStaticFields`. For most of the code spots changed here that's irrelevant but at least the usage in the mapper parsing code is a little hot and gets a small speedup from this potentially. Also we're simply wasting some bytes for the static footprint of ES by using the `FeatureFlag` indirection instead of just a boolean.	2025-04-11 01:46:28 +02:00
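As a rough illustration of the pattern described in dd1db5031e, the sketch below evaluates a flag once, when the class is initialized, and stores the result in a `static final boolean`. `HypotheticalFlag` and the `sketch.flag.` system property prefix are invented for this example and are not the Elasticsearch `FeatureFlag` class or its configuration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the Elasticsearch FeatureFlag implementation.
class FlagSketch {

    // Stand-in for a feature-flag lookup that counts how often it is evaluated.
    static final class HypotheticalFlag {
        static final AtomicInteger LOOKUPS = new AtomicInteger();
        private final String name;

        HypotheticalFlag(String name) {
            this.name = name;
        }

        boolean isEnabled() {
            LOOKUPS.incrementAndGet();
            return Boolean.parseBoolean(System.getProperty("sketch.flag." + name, "false"));
        }
    }

    // Evaluated exactly once, during class initialization. A static final primitive
    // is something the JIT can generally treat as a constant.
    static final boolean MY_FEATURE_ENABLED = new HypotheticalFlag("my_feature").isEnabled();

    // Hot-path code reads the cached boolean instead of calling isEnabled() again.
    static int parse(int value) {
        return MY_FEATURE_ENABLED ? value * 2 : value;
    }

    public static void main(String[] args) {
        System.out.println(parse(3));
        System.out.println(HypotheticalFlag.LOOKUPS.get());
    }
}
```

However many times `parse` is called, the lookup counter stays at one, since the flag is read only during class initialization.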
David Turner	b10b35fccd	Fix `S3RepositoryAnalysisRestIT` (#126593 ) - Translate a 404 during a multipart copy into a `FileNotFoundException` - Use multiple threads in `S3HttpHandler` to avoid `CopyObject`/`PutObject` deadlock Closes #126576	2025-04-11 05:41:20 +10:00
Mary Gouseti	78ac5d58ef	[Failure store] Support failure store for system data streams (#126585 ) In this PR we add support for the failure store for system data streams. Specifically: - We pass the system descriptor so the failure index can be created based on that. - We extend the tests to ensure it works - We remove a guard we had but I wasn't able to test it because it only gets triggered if the data stream gets created right after a failure in the ingest pipeline, and I didn't see how to add one (yet). - We extend the system data stream migration to ensure this is also working.	2025-04-11 05:14:11 +10:00
Jack Conradson	3d54cc3e52	Add leniency to missing array values in mustache (#126550 ) In mustache, this change returns null values which convert to empty strings instead of throwing an exception when users have a template with something like a.8 where the index 8 is out of bounds. This matches the behavior for non-existent keys like a.d. Closes #55200	2025-04-09 14:51:26 -07:00
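The lenient behavior described in 3d54cc3e52 can be sketched as follows. This is not the mustache integration itself: `LenientPathSketch`, `resolve`, and `render` are hypothetical names, and the resolver handles only plain `Map` and `List` values, which is an assumption made to keep the example short.

```java
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the Elasticsearch mustache script engine.
class LenientPathSketch {

    // Walks a dotted path such as "a.8" through nested maps and lists. Missing keys
    // and out-of-range indices yield null instead of throwing.
    static Object resolve(Object root, String path) {
        Object current = root;
        for (String part : path.split("\\.")) {
            if (current instanceof Map) {
                current = ((Map<?, ?>) current).get(part);
            } else if (current instanceof List) {
                List<?> list = (List<?>) current;
                int index;
                try {
                    index = Integer.parseInt(part);
                } catch (NumberFormatException e) {
                    return null;
                }
                current = (index >= 0 && index < list.size()) ? list.get(index) : null;
            } else {
                return null;
            }
            if (current == null) {
                return null;
            }
        }
        return current;
    }

    // Null values render as the empty string, as the commit message describes.
    static String render(Object root, String path) {
        Object value = resolve(root, path);
        return value == null ? "" : value.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of("a", List.of("x", "y"));
        System.out.println("[" + render(doc, "a.1") + "]");
        System.out.println("[" + render(doc, "a.8") + "]");
    }
}
```

The bounds check before `list.get(index)` is what replaces the exception with a null, so an out-of-range index and a non-existent key end up on the same code path.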
Brendan Cully	c1a71ff45c	BlobContainer: add copyBlob method (#125737 ) * BlobContainer: add copyBlob method If a container implements copyBlob, then the copy is performed by the store, without client-side IO. If the store does not provide a copy operation then the default implementation throws UnsupportedOperationException. This change provides implementations for the FS and S3 blob containers. More will follow. Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> Co-authored-by: David Turner <david.turner@elastic.co>	2025-04-09 10:33:01 -07:00
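The shape of the change in c1a71ff45c (an optional copy operation whose default implementation throws `UnsupportedOperationException`) can be sketched with a Java default method. `SimpleBlobContainer` and both implementations below are hypothetical and heavily simplified; the real `BlobContainer` interface has a different and much larger signature set.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not the Elasticsearch BlobContainer API.
class CopyBlobSketch {

    interface SimpleBlobContainer {
        byte[] read(String name);

        void write(String name, byte[] data);

        // Default for stores without a native copy operation: reject the call.
        default void copyBlob(String source, String target) {
            throw new UnsupportedOperationException("copyBlob is not supported by this container");
        }
    }

    // A container that inherits the default and therefore cannot copy.
    static class NoCopyContainer implements SimpleBlobContainer {
        protected final Map<String, byte[]> blobs = new HashMap<>();

        @Override
        public byte[] read(String name) {
            return blobs.get(name);
        }

        @Override
        public void write(String name, byte[] data) {
            blobs.put(name, data);
        }
    }

    // A container whose backing store can copy without the caller reading the bytes.
    static class InMemoryCopyContainer extends NoCopyContainer {
        @Override
        public void copyBlob(String source, String target) {
            byte[] data = blobs.get(source);
            if (data == null) {
                throw new IllegalArgumentException("no such blob: " + source);
            }
            blobs.put(target, data.clone());
        }
    }

    public static void main(String[] args) {
        InMemoryCopyContainer container = new InMemoryCopyContainer();
        container.write("a", new byte[] { 1, 2 });
        container.copyBlob("a", "b");
        System.out.println(container.read("b").length);
    }
}
```

Putting the rejection in a default method means existing implementations keep compiling unchanged, and each store opts in by overriding `copyBlob` when it gains a native copy.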
Alexey Ivanov	ecf9adfc78	[main] System data streams are not being upgraded in the feature migration API (#126409 ) This commit adds support for system data streams reindexing. The system data stream migration extends the existing system indices migration task and uses the data stream reindex API. The system index migration task starts a reindex data stream task and tracks its status every second. Only one system index or system data stream is migrated at a time. If a data stream migration fails, the entire system index migration task will also fail. Port of #123926	2025-04-08 20:42:58 +02:00
David Turner	aab40b1247	Introduce `TestBlobContainerBuilder` (#126445 ) The mostly-optional parameters to `createBlobContainer` are getting rather numerous in this test harness which makes the tests hard to read. This commit introduces a builder to help name the provided parameters and skip the omitted ones.	2025-04-09 01:52:16 +10:00
Joe Gallo	450516d675	Fix a RemoveProcessor test that never ran (#126464 )	2025-04-08 11:21:04 -04:00
Dianna Hohensee	4b2867a0ef	Support maxConnections override in AbstractBlobContainerRetriesTestCase tests (#126435 )	2025-04-08 09:55:01 -04:00
Mary Gouseti	060a9b746a	[DLM] Use default lifecycle instance instead of default constructor (#126461 ) When creating an empty lifecycle we used to use the default constructor. Using a default instance instead is not just for efficiency; it will also allow us to separate the default data and failures lifecycle in the future.	2025-04-08 23:37:30 +10:00
Ryan Ernst	991e80d56e	Remove unnecessary generic params from action classes (#126364 ) Transport actions have associated request and response classes. However, the base type restrictions are not necessary to duplicate when creating a map of transport actions. Relatedly, the ActionHandler class doesn't actually need strongly typed action type and classes since they are lost when shoved into the node client map. This commit removes these type restrictions and generic parameters.	2025-04-07 16:22:56 -07:00
Joe Gallo	bead858ccd	Correctly handle nulls in nested paths in the remove processor (#126417 )	2025-04-07 16:54:07 -04:00
David Turner	fbbbdd7eec	Allow overriding blob container path in tests (#126391 ) Some `AbstractBlobContainerRetriesTestCase#createBlobContainer` implementations choose a path for the container randomly, but we have a need for a test which re-creates the same container against a different `S3Service` and `BlobStore` and must therefore specify the same path each time. This commit exposes a parameter that lets callers specify a container path.	2025-04-08 03:54:37 +10:00
Mary Gouseti	a525b3d924	Fix test to anticipate force merge failure (#126282 ) This test had a copy paste mistake. When the cluster has only one data node the replicas cannot be assigned so we end up with a force merge error. In the case of the failure store this was not asserted correctly. On the other hand, this test only checked for the existence of an error and it was not ensuring that the current error is not the rollover error that should have recovered. We make this test a bit more explicit. Fixes: https://github.com/elastic/elasticsearch/issues/126252	2025-04-05 05:26:58 +11:00
Alexey Ivanov	fd7efe587e	[main] Move system indices migration to migrate plugin (#125437 ) * [main] Move system indices migration to migrate plugin It seems the best way to fix #122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin. Port of #123551 * [CI] Auto commit changes from spotless * Fix compilation * Fix tests * Fix test --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-04-04 18:49:38 +01:00
David Turner	7239540c91	Replace `region` with `regionSupplier` in all AWS tests (#126285 ) Rather than hard-coding a region name we should always auto-generate it randomly during test execution. This commit replaces the remaining fixed `String` arguments with a `Supplier<String>` argument to enable this.	2025-04-05 02:27:28 +11:00
David Turner	3e35900b07	Add missing test security policies (#126309 ) Relates #126274 Closes #126301 Closes #126302 Closes #126303 Closes #126304 Closes #126305 Closes #126306	2025-04-05 02:27:17 +11:00
David Turner	7402dfdf65	Introduce `qa` subprojects of `:modules:repository-s3` (#126274 ) Today we have some special-case test classes in `:modules:repository-s3` within the same source root as the regular tests, with some trickery to define separate Gradle tasks to run them with their special-case configs. This commit simplifies the build by just moving each of these classes into its own Gradle project.	2025-04-04 21:29:05 +11:00
David Turner	896598570c	Reinstate `S3SearchableSnapshotsCredentialsReloadIT` in FIPS JVMs (#126109 ) These tests fail in a FIPS JVM only because they use a secret key that is unacceptably short. This commit replaces the relevant uses of `randomIdentifier` with `randomSecretKey` so they work whether in FIPS mode or not.	2025-04-04 18:42:09 +11:00
David Turner	7eee6502de	Misc cleanups in `S3BlobContainerRetriesTests` (#126101 ) - Simplify multi-object-delete request detection - Replace `AtomicBoolean` with volatile field - Make `ThrottlingDeleteHandler` static	2025-04-04 18:39:51 +11:00
Mikhail Berezovskiy	70654a3633	Add GCS telemetry with ThreadLocal (#125452 )	2025-04-03 23:46:06 -07:00

8978 commits