Here we introduce a new implementation of `IndexSettingProvider` whose goal is to "inject" the
`index.mode` setting with the value `logsdb` when the cluster setting `cluster.logsdb.enabled` is `true`.
We also make sure that:
* `index.mode` is not already set
* the data stream name matches the `logs-*-*` pattern
* the `logs@settings` component template is used
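For illustration, a minimal sketch of the intended behavior; the data stream name `logs-myapp-default` is hypothetical and assumes a matching `logs-*-*` index template that uses the `logs@settings` component template:
```
PUT _cluster/settings
{
  "persistent": {
    "cluster.logsdb.enabled": true
  }
}

# Backing indices of a matching data stream created after this point should
# pick up the injected setting, which can be checked with:
GET logs-myapp-default/_settings?filter_path=**.index.mode
```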
Since we are enriching the component templates with more entries, such as
the data stream lifecycle and, in the future, the data stream options, we
add a template builder to help with the code, especially tests.
To highlight the value and prepare for the PRs that will add the data
stream options to the template, we replace calls to the all-arguments
constructor with the builder:
- when there are arguments with null values, or
- when we copy another template and change only a few fields.

This prepares the ground, so when we add data stream options we will
not need to edit all these places.
Closes https://github.com/elastic/elasticsearch/issues/110387
Having this in now means we will not have to introduce version checks in
the ES exporter later: we can simply use the same serialization logic
for metric attributes as we do for other signals. This also enables us
to properly map `*.ip` fields to the `ip` field type, as `ip` fields
containing a list of IPs are not converted to a comma-separated list.
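As a hypothetical sketch of the kind of mapping this enables (the template name and match pattern are illustrative, not the exporter's actual template), a dynamic template along these lines maps such attributes to the `ip` field type:
```
PUT _index_template/metrics-otel
{
  "index_patterns": ["metrics-otel-*"],
  "template": {
    "mappings": {
      "dynamic_templates": [
        {
          "ip_attributes": {
            "path_match": "*.ip",
            "mapping": { "type": "ip" }
          }
        }
      ]
    }
  }
}
```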
The failure store status is a flag that indicates how the failure store was used, or how it could have been used had it been enabled. The user is informed about the usage of the failure store in the following way:
When relevant, we add the optional field `failure_store`. The field is omitted when the use of the failure store is not relevant: for example, if a document was successfully indexed into a backing index of a data stream, if a failure concerns a regular index (not a data stream), or if the opType is not index or create. In more detail:
- when we have a "success" create/index response, the field `failure_store` will not be present if the document was indexed in a backing index. Otherwise, if it got stored in the failure store, it will have the value `used`.
- when we have a "rejected" create/index response, meaning the document was not persisted in Elasticsearch, we return the field `failure_store` with the value `not_enabled`, if the document could have ended up in the failure store had it been enabled, or `failed`, if something went wrong and the document could not be persisted in the failure store, for example, when the cluster is out of space and in read-only mode.
We chose to make it an optional field to reduce its impact on a bulk response. The value always exists in the Java object, but it is not always returned to the user. The only values that will be displayed are:
- `used`: meaning this document was indexed in the failure store
- `not_enabled`: meaning this document was rejected but could have been stored in the failure store had it been enabled
- `failed`: meaning this document failed and could not be stored in the failure store either
Example:
```
"errors": true,
"took": 202,
"items": [
{
"create": {
"_index": ".fs-my-ds-2024.09.04-000002",
"_id": "iRDDvJEB_J3Inuia2zgH",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 6,
"_primary_term": 1,
"status": 201,
"failure_store": "used"
}
},
{
"create": {
"_index": "ds-no-fs",
"_id": "hxDDvJEB_J3Inuia2jj3",
"status": 400,
"error": {
"type": "document_parsing_exception",
"reason": "[1:153] failed to parse field [count] of type [long] in document with id 'hxDDvJEB_J3Inuia2jj3'. Preview of field's value: 'bla'",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "For input string: \"bla\""
}
}
},
"failure_store": "not_enabled"
},
{
"create": {
"_index": ".ds-my-ds-2024.09.04-000001",
"_id": "iBDDvJEB_J3Inuia2jj3",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 7,
"_primary_term": 1,
"status": 201
}
}
]
```
Introduces the per-field param `synthetic_source_keep` that overrides the
behavior for keeping the field's source in synthetic source mode:
- `none`: no source is stored
- `arrays`: the incoming source is recorded as-is for arrays of a given field
- `all`: the incoming source is recorded as-is for both singleton and array values of a given field
Related to #112012
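A minimal mapping sketch (the index and field names are hypothetical):
```
PUT my-index
{
  "mappings": {
    "_source": { "mode": "synthetic" },
    "properties": {
      "tags": {
        "type": "keyword",
        "synthetic_source_keep": "arrays"
      }
    }
  }
}
```
With this mapping, arrays of `tags` are returned exactly as they were ingested, instead of the deduplicated, sorted form that synthetic source normally produces for keyword arrays.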
In synthetic source, storing array elements to `_ignored_source` may
hide other, regular elements from showing up during source synthesizing.
This is due to contents from `_ignored_source` taking precedence over
matching fields from regular source loading.
To avoid this, arrays are preemptively tracked and marked for source
storing if any of their elements needs to store its source. A second
doc-parsing phase is introduced that checks for fields missing values
and records their source, while skipping objects and arrays that don't
contain any such fields.
Fixes #112374
This commit adds a module that emits a deprecation warning when a
dot-prefixed index is manually or automatically created, or when a
composable index template with an index pattern that uses a dot-prefix
is created. The warning states that in the future these indices will not
be allowed. In a future breaking change (10.0.0 maybe?) the deprecation
can then be changed to an exception.
These deprecations are only displayed when a non-operator user is using
the API (one that does not set the `X-elastic-product-origin` header).
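For example, both of the following requests would now emit the deprecation warning (the names are illustrative):
```
PUT .my-dot-index

PUT _index_template/dot-prefixed-template
{
  "index_patterns": [".my-dot-indices-*"]
}
```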
[TEST] Assert DSL merge policy respects end date
Backing indices with an end date in the future may still receive writes,
so DSL should not apply the merge policy (first configuring the
settings on the index, then doing the force merge) until that time has
passed. The implementation already does this, because
`DataStreamLifecycleService.run()` calls
`timeSeriesIndicesStillWithinTimeBounds` and adds the resulting
indices to `indicesToExcludeForRemainingRun` before calling
`maybeExecuteForceMerge`. This change simply adds a unit test to
ensure that this behaviour does not regress.
Closes #109030
Fix verbose get data stream API not requiring extra privileges
When a user uses the `GET /_data_stream?verbose` API to retrieve the verbose version of the response (which includes the `maximum_timestamp`, as added in #112303), the request should be subject to the same privilege checking as the regular get-data-stream API, meaning that no extra privileges should be required to return the field.
This commit makes the transport action use an entitled client so that extra privileges are not required, and adds a test to ensure that it works.
Here we test reindexing logsdb indices, and creating and restoring
snapshots. Note that logsdb uses synthetic source, and restoring
source-only snapshots fails due to the missing `_source`.
Dropping support for pre-8.12 requests from remote nodes, and also
cleaning up some unnecessary abstraction in the request builder
hierarchy.
Relates #101815
Relates #107984 (drops some unnecessary trappy timeouts)
Replaces the somewhat-awkward API on `ClusterAdminClient` for
manipulating ingest pipelines with some test-specific utilities that are
easier to use.
Relates #107984 in that this change massively reduces the noise that
would otherwise result from removing the trappy timeouts in these APIs.
When indexing to a data stream with a failure store it's possible to get
a version conflict. The reproduction path is the following:
```
PUT /_bulk
{"create":{"_index": "my-ds-with-fs", "_id": "1"}}
{"@timestamp": "2022-01-01", "baz": "quick", "a": "brown", "b": "fox"}
{"create":{"_index": "my-ds-with-fs", "_id": "1"}}
{"@timestamp": "2022-01-01", "baz": "lazy", "a": "dog"}
```
We would like the second document not to be sent to the failure store,
and instead to return an error to the user:
```
{
  "errors" : true,
  "took" : 409,
  "items" : [
    {
      "create" : {
        "_index" : ".ds-my-ds-with-fs-xxxxx-xxxx",
        "_id" : "1",
        "_version" : 1,
        "result" : "created",
        "_shards" : {
          "total" : 2,
          "successful" : 1,
          "failed" : 0
        },
        "_seq_no" : 0,
        "_primary_term" : 1,
        "status" : 201
      }
    },
    {
      "create" : {
        "_index" : ".ds-my-ds-with-fs-xxxxx-xxxx",
        "_id" : "1",
        "status" : 409,
        "error" : {
          "type" : "version_conflict_engine_exception",
          "reason" : "[1]: version conflict, document already exists (current version [1])",
          "index_uuid" : ".....",
          "shard" : "0",
          "index" : ".ds-my-ds-with-fs-xxxxx-xxxx"
        }
      }
    }
  ]
}
```
The version conflict doc is counted as a rejected doc in APM telemetry.
Introduce an index setting that forces storing the source of leaf field
and object arrays in synthetic source mode. Nested objects are excluded
as they already preserve ordering in synthetic source.
The next step is to introduce override params at the mapper level that will
allow disabling the source, or storing the source for arrays (if not
enabled at index level), or storing the source for both arrays and
singletons. This will happen in follow-up changes, so that we can
benchmark the impact of this change in parallel.
Related to #112012
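A sketch of how the new index setting would be applied, assuming it shares the `synthetic_source_keep` naming of the related mapper param (the index name is hypothetical):
```
PUT my-index
{
  "settings": {
    "index.mapping.synthetic_source_keep": "arrays"
  },
  "mappings": {
    "_source": { "mode": "synthetic" }
  }
}
```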
Restricts the "index settings" provider that's invoked when creating new
indices to only inspect the current project's metadata (rather than the
whole global metadata).
In this PR we expose the global retention via the `GET
_data_stream/{target}/_lifecycle` API.
Since the global retention is a main feature of the data stream
lifecycle, we chose to expose it by default.
```
GET /_data_stream/my-data-stream/_lifecycle
{
  "global_retention": {
    "default_retention": "7d",
    "max_retention": "365d"
  },
  "data_streams": [...]
}
```
This commit adds support for the `verbose` querystring parameter to the
get data stream API (`GET /_data_stream/{name}`).
The flag defaults to `false`.
When set to `true`, the `maximum_timestamp` for each data stream will be
retrieved and included in the response. This is the same information
available from the data stream stats API (and internally the same
action is used to retrieve it).
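For example (the data stream name is illustrative, and the response is abbreviated):
```
GET /_data_stream/my-data-stream?verbose=true
{
  "data_streams": [
    {
      "name": "my-data-stream",
      "maximum_timestamp": 1725455400000,
      ...
    }
  ]
}
```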
With #111972 we enable users to set up global retention for data streams that are managed by the data stream lifecycle. This will allow users of Elasticsearch to have more control over their data retention and, consequently, better resource management of their clusters.
However, there is a small number of data streams that are necessary for the good operation of Elasticsearch and should not follow user-defined retention, to avoid surprises.
For this reason, we put forth the following definition of internal data streams:
A data stream is internal if it is either a system data stream (the system flag is true) or if its name starts with a dot.
This PR adds the `isInternalDataStream` param to the effective retention calculation, making it explicit that this is also used to determine the effective retention.
Search coordinator uses event.ingested in cluster state to do rewrites
Min/max range for the event.ingested timestamp field (part of Elastic Common
Schema) was added to IndexMetadata in cluster state for searchable snapshots
in #106252.
This commit modifies the search coordinator to rewrite searches to MatchNone
if the query searches a range of event.ingested that, based on the min/max range
in cluster state, is known not to overlap. This is the same behavior we currently
have for the @timestamp field.
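For example, a search like the following can now be rewritten to MatchNone on indices whose event.ingested min/max range in cluster state does not overlap the queried range (the index pattern is illustrative):
```
GET logs-*/_search
{
  "query": {
    "range": {
      "event.ingested": {
        "gte": "2024-01-01",
        "lte": "2024-01-31"
      }
    }
  }
}
```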
Sometimes initial indexing results in exactly one segment.
However, multiple segments are needed to perform the force merge that purges stored fields for the `_id` field in a later stage of the test.
This change tweaks the test such that an extra update is performed after initial indexing. This should always create an extra segment, so that the test can actually purge stored fields for the `_id` field.
Closes #112124
Add the ability to schedule SLM policies with a time-unit interval schedule rather than a cron schedule. For example, an SLM policy can be created with the argument `"schedule": "30m"`. This creates a policy that runs 30 minutes after the policy's modification_date, and then again every time another 30 minutes have passed. Every time the policy is changed, the next snapshot will be re-scheduled to run one interval after the new modification date.
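For example (the policy id, snapshot name, and repository are illustrative):
```
PUT _slm/policy/every-30-minutes
{
  "schedule": "30m",
  "name": "<snap-{now/d}>",
  "repository": "my-repository",
  "config": {
    "indices": ["*"]
  }
}
```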