Currently, the raw path is only available from the RestRequest. This
makes it challenging to determine whether a handler supports streaming.
This commit moves the raw path into the pre-request object to make the
streaming-support logic easier to evaluate.
The failure store status is a flag that indicates how the failure store was used, or how it could have been used had it been enabled. The user can be informed about the usage of the failure store in the following way:
When relevant, we add the optional field `failure_store`. The field will be omitted when the use of the failure store is not relevant, for example, if a document was successfully indexed in a data stream, if a failure concerns an index rather than a data stream, or if the opType is not `index` or `create`. In more detail:
- when we have a “success” create/index response, the field `failure_store` will not be present if the document was indexed in a backing index. Otherwise, if it got stored in the failure store, it will have the value `used`.
- when we have a “rejected” create/index response, meaning the document was not persisted in Elasticsearch, we return the field `failure_store` with one of two values: `not_enabled`, if the document could have ended up in the failure store had it been enabled, or `failed`, if something went wrong and the document was not persisted in the failure store, for example, because the cluster is out of space and in read-only mode.
We chose to make it an optional field to reduce the impact of this field on a bulk response. The value will still exist in the Java object, but it will not be returned to the user. The only values that will be displayed are:
- `used`: meaning this document was indexed in the failure store.
- `not_enabled`: meaning this document was rejected, but it could have been stored in the failure store had it been enabled.
- `failed`: meaning this failed document could not be stored in the failure store either.
Example:
```
"errors": true,
"took": 202,
"items": [
  {
    "create": {
      "_index": ".fs-my-ds-2024.09.04-000002",
      "_id": "iRDDvJEB_J3Inuia2zgH",
      "_version": 1,
      "result": "created",
      "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
      },
      "_seq_no": 6,
      "_primary_term": 1,
      "status": 201,
      "failure_store": "used"
    }
  },
  {
    "create": {
      "_index": "ds-no-fs",
      "_id": "hxDDvJEB_J3Inuia2jj3",
      "status": 400,
      "error": {
        "type": "document_parsing_exception",
        "reason": "[1:153] failed to parse field [count] of type [long] in document with id 'hxDDvJEB_J3Inuia2jj3'. Preview of field's value: 'bla'",
        "caused_by": {
          "type": "illegal_argument_exception",
          "reason": "For input string: \"bla\""
        }
      }
    },
    "failure_store": "not_enabled"
  },
  {
    "create": {
      "_index": ".ds-my-ds-2024.09.04-000001",
      "_id": "iBDDvJEB_J3Inuia2jj3",
      "_version": 1,
      "result": "created",
      "_shards": {
        "total": 2,
        "successful": 1,
        "failed": 0
      },
      "_seq_no": 7,
      "_primary_term": 1,
      "status": 201
    }
  }
]
```
Several `TransportNodesAction` implementations do some kind of top-level
computation in addition to fanning out requests to individual nodes.
Today they all have to do this once the node-level fanout is complete,
but in most cases the top-level computation can happen in parallel with
the fanout. This commit adds support for an additional `ActionContext`
object, created when starting to process the request and exposed to
`newResponseAsync()` at the end, to allow this parallelization.
All implementations use `(Void) null` for this param, except for
`TransportClusterStatsAction` which now parallelizes the computation of
the cluster-state-based stats with the node-level fanout.
Introduces per-field param `synthetic_source_keep` that overrides the
behavior for keeping the field source in synthetic source mode:
- `none`: no source is stored
- `arrays`: the incoming source is recorded as-is for arrays of a given field
- `all`: the incoming source is recorded as-is for both singleton and array values of a given field
Related to #112012
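For illustration, a minimal sketch of a mapping that uses the new param; the index name and field are made up, and the `_source.mode: synthetic` mapping shown here to enable synthetic source is an assumption about the setup rather than part of this change:
```
PUT my-index
{
  "mappings": {
    "_source": {
      "mode": "synthetic"
    },
    "properties": {
      "tags": {
        "type": "keyword",
        "synthetic_source_keep": "arrays"
      }
    }
  }
}
```
With `arrays`, a document indexed with `"tags": ["a", "b"]` would have its array source recorded as-is, while a singleton `"tags": "a"` would still be synthesized normally.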
In synthetic source, storing array elements in `_ignored_source` may
hide other, regular elements from showing up during source synthesis.
This is because contents from `_ignored_source` take precedence over
matching fields from regular source loading.
To avoid this, arrays are pre-emptively tracked and marked for source
storing if any of their elements needs to store its source. A second
doc-parsing phase is introduced that checks for fields with missing
values and records their source, while skipping objects and arrays that
don't contain any such fields.
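As a hypothetical illustration of the problem (the field names and the malformed value are made up): if the second element below is malformed for a `long` field with `ignore_malformed` and gets recorded in `_ignored_source`, the first, regular element could previously be hidden when the source was synthesized:
```
{
  "path": {
    "to": [
      { "field": 123 },
      { "field": "malformed" }
    ]
  }
}
```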
Fixes #112374
Currently, the entire close pipeline is not hooked up in case of a
channel close while a request is being buffered or executed. This commit
resolves the issue by hooking the channel close into the stream closure.
Currently, unless a rest handler specifies that it handles "unsafe"
buffers, we must copy the http buffers in releaseAndCopy. Unfortunately,
the original content was slipping through in the initial stream PR. This
leads to memory corruption on index and update requests, which depend on
buffers being copied.
The header validator is very aggressive about adjusting autoread, in
the belief that it is the only place where autoread is tweaked. However,
with stream backpressure, we should only change it when we are starting
or finishing header validation.
Currently the `rest.incremental_bulk` setting is read in two different
places. This means that it will be applied in two steps, introducing
unpredictable behavior. This commit ensures that it is only read in a
single place.
Allow a single bulk request to be passed to Elasticsearch in multiple
parts. Once a certain memory threshold or number of operations has been
reached, the request can be split and submitted for processing.
This commit adds a module emitting a deprecation warning when a
dot-prefixed index is manually or automatically created, or when a
composable index template with an index pattern that uses a dot-prefix
is created. The warning states that in the future these indices will
not be allowed. In a future breaking change (10.0.0 maybe?) the
deprecation can then be changed to an exception.
These deprecations are only displayed when a non-operator user is using
the API (one that does not set the `X-elastic-product-origin` header).
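For illustration, a hypothetical composable index template that would now draw the deprecation warning because of its dot-prefixed pattern (the template name and body are made up):
```
PUT _index_template/my-dot-template
{
  "index_patterns": [".my-internal-*"],
  "template": {
    "settings": {
      "number_of_shards": 1
    }
  }
}
```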
Mainly motivated by simplifying the reference chains for Netty buffers
and making heap dumps easier to analyze in some spots, but also a small
performance win in and of itself.
Some small speedups in here from pre-evaluating `isFiltered(properties)`
in lots of spots and from not creating an unused `SimpleKey` in
`toConcreteKey`, which performs costly string interning at some rate.
Other than that, obvious deduplication using existing utilities and
adding obvious missing overloads for them.
[TEST] Assert DSL merge policy respects end date
Backing indexes with an end date in the future may still get writes,
so DSL should not apply the merge policy (first configuring the
settings on the index, then doing the force merge) until that time has
passed. The implementation already does this, because
`DataStreamLifecycleService.run()` calls
`timeSeriesIndicesStillWithinTimeBounds` and adds the resulting
indices to `indicesToExcludeForRemainingRun` before calling
`maybeExecuteForceMerge`. This change simply adds a unit test to
ensure that this behaviour does not regress.
Closes #109030
The edgeNGram and NGram tokenizers and token filters were deprecated. They have not been supported in indices created since 8.0,
hence their support can be entirely removed from main.
The version-related logic around the min grams can also be removed, as it refers to 7.x, which we no longer need to support.
Relates to #50376, #50862, #43568
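For reference, the snake_case variants remain supported; a minimal sketch of an index using the `edge_ngram` tokenizer (the index and analyzer names are illustrative):
```
PUT my-index
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 2,
          "max_gram": 5
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "autocomplete_tokenizer"
        }
      }
    }
  }
}
```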
* Fix verbose get data stream API not requiring extra privileges
When a user uses the `GET /_data_stream?verbose` API to retrieve the verbose version of the response (which includes the `maximum_timestamp`, as added in #112303), the response should be built with the same privilege-checking as the rest of the get-data-stream API, meaning that no extra privileges should be required to return the field.
This commit makes the Transport action use an entitled client so that extra privileges are not required, and adds a test to ensure that it works.
* Update docs/changelog/112973.yaml
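For context, a sketch of the request and the relevant part of the verbose response; the data stream name and timestamp value are illustrative, and the other fields of the response are elided:
```
GET /_data_stream/my-ds?verbose

{
  "data_streams": [
    {
      "name": "my-ds",
      "maximum_timestamp": 1725446400000
    }
  ]
}
```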
Here we test reindexing logsdb indices, and creating and restoring
snapshots. Note that logsdb uses synthetic source, and restoring
source-only snapshots fails due to the missing `_source`.
Dropping support for pre-8.12 requests from remote nodes, and also
cleaning up some unnecessary abstraction in the request builder
hierarchy.
Relates #101815
Relates #107984 (drops some unnecessary trappy timeouts)
All nodes in a cluster involving v9 nodes will understand that
`MINUS_ONE` means an infinite master-node timeout, so there's no need to
fall back to `MAX_VALUE` when talking to older nodes. This commit
removes the unnecessary bwc code.
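For context, an infinite master-node timeout is requested by passing `-1`, which is transported as `MINUS_ONE`; the health endpoint here is just an arbitrary example of an API that accepts the parameter:
```
GET _cluster/health?master_timeout=-1
```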
Replaces the somewhat-awkward API on `ClusterAdminClient` for
manipulating ingest pipelines with some test-specific utilities that are
easier to use.
Relates #107984 in that this change massively reduces the noise that
would otherwise result from removing the trappy timeouts in these APIs.