ReindexDataStreamIndexAction.cleanupCluster called EsIntegTestCase.cleanupCluster but did not override it. As a result, EsIntegTestCase.cleanupCluster ran twice: once via the explicit call in ReindexDataStreamIndexAction.cleanupCluster and once when JUnit invoked the @After-annotated method on EsIntegTestCase.
(cherry picked from commit 89ba03ecff)
# Conflicts:
# muted-tests.yml
To avoid having AggregateMapper find aggregators by name via reflection, I'm making some changes:
- Make the suppliers have methods returning the intermediate states
- To allow this, the suppliers' constructors won't receive the channels as params. Instead, their methods will ask for them (see the sketch below)
- Most changes in this PR are because of this
- After those changes, I'm leaving AggregateMapper in place, as it still converts AggregateFunctions to their NamedExpressions
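A minimal sketch of the resulting supplier shape, with placeholder types standing in for the real ESQL interfaces (the actual method names may differ):
```
import java.util.List;

// Placeholder types standing in for the real ESQL interfaces.
record IntermediateStateDesc(String name, String type) {}
interface AggregatorFunction {}
interface GroupingAggregatorFunction {}
interface DriverContext {}

interface AggregatorFunctionSupplier {
    // Intermediate states are exposed directly, instead of being discovered
    // through reflection on aggregator class names.
    List<IntermediateStateDesc> nonGroupingIntermediateStateDesc();
    List<IntermediateStateDesc> groupingIntermediateStateDesc();

    // Channels are no longer constructor arguments; each factory method
    // receives them when the aggregator is actually created.
    AggregatorFunction aggregator(DriverContext context, List<Integer> channels);
    GroupingAggregatorFunction groupingAggregator(DriverContext context, List<Integer> channels);
}
```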
(cherry picked from commit 7bea3a5610)
# Conflicts:
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/planner/AggregateMapper.java
* Fix privileges for system index migration WRITE block (#121327)
This PR removes a potential cause of data loss when migrating system indices. It does this by changing the way we set a "write-block" on the system index to migrate - now using a dedicated transport request rather than a settings update. Furthermore, we no longer delete the write-block prior to deleting the index, as this was another source of potential data loss. Additionally, we now remove the block if the migration fails.
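Conceptually, the change swaps a settings update for the dedicated add-block API; a rough sketch, with client/builder names that may vary across ES versions:
```
import org.elasticsearch.client.internal.Client;
import org.elasticsearch.cluster.metadata.IndexMetadata;

class WriteBlockSketch {
    // Before: a settings update ("index.blocks.write": true), which gave no
    // confirmation that the block actually took effect on every shard.
    // After: the dedicated add-block request, which waits for the block to
    // be applied before returning, closing the data-loss window.
    static void addWriteBlock(Client client, String index) {
        client.admin().indices()
            .prepareAddBlock(IndexMetadata.APIBlock.WRITE, index)
            .get();
    }
}
```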
* Update release notes
* Delete docs/changelog/122214.yaml
* Adding condition to verify if the field belongs to an index
* Update docs/changelog/121720.yaml
* Remove unnecessary comma from yaml file
* remove duplicate inference endpoint creation
* updating isMetadata to return true if mapper has the correct type
* remove unnecessary index creation in yaml tests
* Adding check if the document has returned in the yaml test
* Updating test to skip time series check if index mode is standard
* Refactor tests to verify every metafield with all index modes
* refactoring test to verify for all cases
* Adding assertFalse if not time_series and fields are from time_series
* Updating test descriptions to read better
This commit removes "TLSv1.1" from the list of default protocols in
Elasticsearch (starting with ES9.0).
TLSv1.1 has been deprecated by the IETF since March 2021.
This affects a variety of TLS contexts, including:
- The HTTP server (REST API)
- The transport protocol (including CCS and CCR)
- Outgoing connections for features that have configurable SSL
  settings, including:
- reindex
- watcher
- security realms (SAML, OIDC, LDAP, etc)
- monitoring exporters
- inference services
In practice, however, TLSv1.1 has been disabled in most Elasticsearch
deployments since around 7.12, because most JDK releases have disabled
TLSv1.1 by default since April 2021.
That is, if you run a default installation of Elasticsearch (for any
currently supported version of ES) that uses the bundled JVM then
TLSv1.1 is already disabled.
And, since ES9+ requires JDK21+, all supported JDKs ship with TLSv1.1
disabled by default.
In addition, incoming HTTP connections to Elastic Cloud deployments
have required TLSv1.2 or higher since April 2020.
This change simply makes it clear that Elasticsearch does not
attempt to enable TLSv1.1 and administrators who wish to use that
protocol will need to explicitly enable it in both the JVM and in
Elasticsearch.
Resolves: #108057
The downsample task sometimes needs a little bit longer to complete so
we bump the timeout from 60s to 120s.
Fixes #122056
(cherry picked from commit 0ec2fe05ef)
# Conflicts:
# muted-tests.yml
When a node is shutting down, scheduling tasks for the Driver can result
in a rejection exception. In this case, we drain and close all
operators. However, we don't clear the pending tasks in the scheduler,
which can lead to a pending task being triggered unexpectedly, causing a
ConcurrentModificationException.
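A simplified model of the failure mode and the fix (not the actual Driver code): the rejection handler must clear the queued tasks in addition to closing the operators:
```
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

class DriverSchedulerSketch {
    private final Queue<Runnable> pendingTasks = new ConcurrentLinkedQueue<>();

    void schedule(Executor executor, Runnable task) {
        pendingTasks.add(task);
        try {
            executor.execute(() -> {
                if (pendingTasks.remove(task)) {
                    task.run();
                }
            });
        } catch (RejectedExecutionException e) {
            // Node is shutting down: drain and close the operators, and also
            // clear the pending tasks so a stale task cannot fire later while
            // the operator list is being torn down.
            pendingTasks.clear();
            drainAndCloseOperators();
        }
    }

    private void drainAndCloseOperators() {
        // release operator resources
    }
}
```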
* [Deprecation API] Adjust details in the SourceFieldMapper deprecation warning (#122041)
In this PR we improve the deprecation warning about configuring source
in the mapping.
- We reduce the size of the warning message so it looks better in kibana.
- We keep the original message in the details.
- We use an alias help url, so we can associate it with the guide when it's created.
* Remove bwc code
The aggs timeout test waits for the agg to return and then double checks
that the agg is stopped using the tasks API. We're seeing some failures
where the tasks API reports that the agg is still running. I can't
reproduce them because computers. This adds two things:
1. Logs the hot_threads so we can see if the query is indeed still
running.
2. Retries the _tasks API for a minute. If it goes away soon after the
_search returns that's *fine*. If it sticks around for more than a
few seconds then the cancel isn't working. We wait for a minute
because CI can't be trusted to do anything quickly.
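Roughly what the retry looks like in ES integration-test style; `assertBusy` and `client()` come from the test framework, `empty()` from Hamcrest, and the task action filter here is illustrative:
```
// Inside an ESIntegTestCase subclass.
private void assertAggTaskFinishes() throws Exception {
    assertBusy(() -> {
        var tasks = client().admin().cluster().prepareListTasks()
            .setActions("indices:data/read/search*")
            .get()
            .getTasks();
        assertThat(tasks, empty());
    }, 1, TimeUnit.MINUTES);
}
```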
Closes #121993
It is possible to create an index in 7.x with a single type. This fixes the CreateIndexFromSourceAction to not copy that type over when creating a destination index from a source index with a type.
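A hedged sketch of the normalization, assuming the source mapping arrives as a parsed `Map<String, Object>` (the real fix lives in CreateIndexFromSourceAction):
```
import java.util.Map;

class MappingTypeSketch {
    // A 7.x mapping with a type looks like {"my_type": {"properties": ...}}
    // rather than {"properties": ...}; unwrap the type layer before reuse.
    @SuppressWarnings("unchecked")
    static Map<String, Object> stripTypeWrapper(Map<String, Object> mappings) {
        if (mappings.size() == 1) {
            String key = mappings.keySet().iterator().next();
            if (key.equals("properties") == false) {
                return (Map<String, Object>) mappings.get(key);
            }
        }
        return mappings;
    }
}
```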
Since introducing the fail_fast (see #117410) option to remote sinks,
the ExchangeSource can propagate failures that can lead to circular
references. The issue occurs as follows:
1. remote-sink-1 fails with exception e1, and the failure collector collects e1.
2. remote-sink-2 fails with exception e2, and the failure collector collects e2.
3. The listener of remote-sink-2 propagates e2 before the listener of
remote-sink-1 propagates e1.
4. The failure collector in ExchangeSource sees [e1, e2] and adds e2 to e1
as a suppressed exception. The upstream sees [e2, e1] and adds e1 to e2 as
suppressed, creating a circular reference between the two exceptions.
With this change, we stop collecting failures in ExchangeSource.
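A standalone demonstration of the cycle; Java's own `printStackTrace` detects it, but any naive recursive walk of the suppressed chain (such as serializing the exception tree) never terminates:
```
public class CircularSuppression {
    public static void main(String[] args) {
        Exception e1 = new RuntimeException("remote-sink-1 failed");
        Exception e2 = new RuntimeException("remote-sink-2 failed");
        e1.addSuppressed(e2); // one failure collector picks e1 as primary
        e2.addSuppressed(e1); // the other picks e2 as primary
        // printStackTrace copes, printing "[CIRCULAR REFERENCE: ...]", but a
        // naive recursive traversal of the suppressed chain never terminates.
        e1.printStackTrace();
    }
}
```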
Labelled this as a non-issue since the bug was never in a release.
Relates #117410
This updates the kibana signature json files in two ways:
* Renames `eval` to `scalar` - that's the name we use inside of ESQL and
we may as well make the names match.
* Calls the `CATEGORIZE` and `BUCKET` functions `grouping` because they
can only be used in the "grouping" positions of the `STATS` command.
Closes #113411
* Add 9.0 patch transport version constants #121985
Transport version changes must be unique per branch. Some transport
version changes meant for 9.0 are missing unique backport constants.
This is a backport of #121985, adding unique transport version patch
numbers for each change intended for 9.0.
* match constant naming in main
When we are already parsing events, we can receive errors as the next
event.
OpenAI formats these as:
```
event: error
data: <payload>
```
Elastic formats these as:
```
data: <payload>
```
Unified will consolidate them into the new error structure.
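A sketch of how a parser can distinguish the two shapes while consuming SSE lines; the class and callback names here are illustrative, not the actual inference code:
```
import java.util.List;
import java.util.function.Consumer;

final class SseErrorDetector {
    static void parse(List<String> lines, Consumer<String> onData, Consumer<String> onError) {
        String currentEvent = null;
        for (String line : lines) {
            if (line.startsWith("event: ")) {
                currentEvent = line.substring("event: ".length()).trim();
            } else if (line.startsWith("data: ")) {
                String payload = line.substring("data: ".length());
                if ("error".equals(currentEvent)) {
                    onError.accept(payload); // OpenAI style: flagged by "event: error"
                } else {
                    onData.accept(payload);  // Elastic style: error lives inside the payload
                }
                currentEvent = null;
            }
        }
    }
}
```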
This PR addresses issues around aggregations cancellation, mentioned in https://github.com/elastic/elasticsearch/issues/108701 and other places. In brief, during aggregations collection time, we respect cancellation via the mechanisms in the searcher to poison cancelled queries. But once the aggregation finishes collection, there is no further need to interact with the searcher, so we cannot rely on that for cancellation checking. In particular, deeply nested aggregations can spend a long time constructing the results tree.
Checking for cancellation is a trade off, as the check itself is somewhat expensive (it involves a volatile read), so we want to balance checking often enough that cancelled queries aren't taking up resources for a long time, but not so frequently that it slows down most aggregation queries. Our first attempt at this is to check once when we go to build sub-aggregations, as the worst cases we've seen involve needing to build deep sub-aggregation trees. Checking at sub-aggregation construction time also provides a conveniently centralized method call to add the check to.
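A sketch of the shape of that check; names are approximate (ES throws a TaskCancelledException) and the real cancellation flag lives in the search context:
```
import java.util.function.BooleanSupplier;

final class CancellationCheck {
    private final BooleanSupplier cancelled; // volatile read under the hood

    CancellationCheck(BooleanSupplier cancelled) {
        this.cancelled = cancelled;
    }

    // Called once per sub-aggregation build, where deeply nested result
    // trees spend most of their construction time.
    void check() {
        if (cancelled.getAsBoolean()) {
            throw new RuntimeException("task cancelled"); // ES: TaskCancelledException
        }
    }
}
```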
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Nik Everett <nik9000@gmail.com>
Today, the exchange buffer of an exchange source is finished in two
cases: (1) when the downstream pipeline has received enough data and (2)
when all remote sinks have completed. In the first case, outstanding
pages could be safely discarded. In the second case, no new pages should
be received after finishing. In both scenarios, discarding all
outstanding pages was safe if noMoreInputs was switched while adding
pages.
However, with the stop API, the buffer may now finish while keeping
outstanding pages, and new pages may still be received. This change
updates the exchange buffer to discard only the incoming page when
noMoreInputs is switched, rather than all pages in the buffer.
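A simplified model of the buffer behavior after this change (the real ExchangeBuffer also tracks sizes and completion listeners):
```
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class ExchangeBufferSketch<P> {
    private final Queue<P> pages = new ConcurrentLinkedQueue<>();
    private volatile boolean noMoreInputs;

    // Returns false when the page was rejected; the caller releases it.
    boolean addPage(P page) {
        if (noMoreInputs) {
            // Discard only the incoming page; pages already buffered stay
            // readable, which matters when the stop API finished the buffer.
            return false;
        }
        pages.add(page);
        return true;
    }

    void finish(boolean drainBuffered) {
        noMoreInputs = true;
        if (drainBuffered) { // only when the downstream truly has enough data
            pages.clear();
        }
    }
}
```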
Closes #120757
* [ML] Support revoking inference default endpoint authorization (#121326)
* Starting revoke
* Adding integration tests
* More integration tests
* Adding test for deleting default inference endpoint via rest call
* Removing task type any
* Addressing feedback and adding test
* Fixing tests
The node environment has many paths. The accessors for these currently
use a "file" suffix, but they are always directories. This commit
renames the accessors to make it clear these paths are directories.
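An illustrative before/after of the rename; the actual accessor names in NodeEnvironment may differ:
```
import java.nio.file.Path;

// Before: "file" suffix on accessors that always return directories.
interface EnvironmentBefore {
    Path configFile();
    Path dataFile();
}

// After: "dir" suffix makes the contract obvious at the call site.
interface EnvironmentAfter {
    Path configDir();
    Path dataDir();
}
```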
This adds a `task_description` field to `profile` output and task
`status`. This looks like:
```
...
"profile" : {
  "drivers" : [
    {
      "task_description" : "final",
      "start_millis" : 1738768795349,
      "stop_millis" : 1738768795405,
      ...
      "task_description" : "node_reduce",
      "start_millis" : 1738768795392,
      "stop_millis" : 1738768795406,
      ...
      "task_description" : "data",
      "start_millis" : 1738768795391,
      "stop_millis" : 1738768795404,
      ...
```
Previously you had to look at the signature of the operators in the
driver to figure out what the driver is *doing*. You had to know enough
about how ESQL works to guess. Now you can look at this description to
see what the server *thinks* it is doing. No more manual classification.
This will be useful when debugging failures and performance regressions
because it is much easier to use `jq` to group on it:
```
| jq '.profile[] | group_by(.task_description)[]'
```
Fix a bug in TOP which surfaces when merging results from ordinals. We
weren't always accounting for oversized arrays when checking if we'd
ever seen a field. This changes the oversize itself to always size on a bucket boundary.
The test for this required a random `bucketSize` - without that the
oversizing frequently wouldn't cause trouble.
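A sketch of what sizing on a bucket boundary means; ES's BigArrays does the real oversizing, this only shows the rounding that keeps a grown array from ending mid-bucket:
```
final class OversizeSketch {
    // Grow like ArrayUtil.oversize (~+12.5%), then round the result up to a
    // whole number of buckets so a "have we seen this bucket" check never
    // lands in a region that only partially exists.
    static long overSizeToBucketBoundary(long minSize, int bucketSize) {
        long oversized = minSize + (minSize >>> 3);
        return ((oversized + bucketSize - 1) / bucketSize) * bucketSize;
    }
}
```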
Unified Chat Completion error responses now forward code, type, and
param in the response payload. `reason` has been renamed to
`message`.
Notes:
- `XContentFormattedException` is a `ChunkedToXContent` so that the REST listener can call `toXContentChunked` to format the output structure. By default, the structure forwards to our existing ES exception structure.
- `UnifiedChatCompletionException` will override the structure to match the new unified format.
- The Rest, Transport, and Stream handlers all check the exception to verify it is a UnifiedChatCompletionException.
- OpenAI response handler now reads all the fields in the error message and forwards them to the user.
- In the event that a `Throwable` is an `Error`, we rethrow it on another thread so the JVM can catch and handle it (see the sketch below). We also stop surfacing the JVM details to the user in the error message (but it's still logged for debugging purposes).
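A minimal sketch of that `Error` handling; ES has a similar helper (`ExceptionsHelper.maybeDieOnAnotherThread`), but this standalone version shows the idea:
```
final class ErrorRethrow {
    static void maybeRethrowOffThread(Throwable t) {
        if (t instanceof Error error) {
            // Rethrow on a fresh thread so the JVM's uncaught-exception
            // handling sees it; the calling thread must not swallow a
            // JVM-level failure such as an OutOfMemoryError.
            new Thread(() -> { throw error; }, "rethrow-error").start();
        }
    }
}
```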
A future.actionGet call was missing from the delete pipeline action execution in the test cleanup, causing all tests to fail intermittently. Also replaced actionGet with safeGet.
(cherry picked from commit 0f6b80a98f)
# Conflicts:
# muted-tests.yml
Backports #114496 to 9.0
> Failure handling for snapshots was made stricter in #107191 (8.15), so this
field is always empty since then. Clients don't need to check it anymore for
failure handling, we can remove it from API responses in 9.0
Add the pipeline "reindex-data-stream-pipeline" to the reindex request within ReindexDataStreamIndexAction. This cleans up documents as needed before inserting into the destination index. Currently, the pipeline only sets a timestamp field with a value of 0, if the document is missing a timestamp field. This is needed because existing indices which are added to a data stream may not contain a timestamp, but reindex validates that a timestamp field exists when creating data stream destination indices.
This pipeline is managed by ES, but can be overridden by users if necessary. To do this, the version field of the pipeline should be set to a value higher than the MigrateRegistry version.
Some areas of the code call this field type
AggregateDoubleMetric and others AggregateMetricDouble, but the docs
use aggregate_metric_double, so for consistency this commit refactors
the former into the latter.
If the query hits the failing index first, we will cancel the request,
preventing exchange-sink requests and data-node requests from reaching
another data node. As a result, exchange sinks could linger for up to 30
seconds.
* Fix inference update API calls with task_type in body or deployment_id defined
* Update docs/changelog/121231.yaml
* Fixing test
* Reuse existing deployment ID retrieval logic
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
This refactoring was motivated by the following issues with the current
state of the code:
- The `TransformDeprecationChecker` is listed as a plugin checker, but later we remove it from the `plugin_settings` and add it to the `cluster_settings`. This made me consider that the checker might be dealing with transform deprecation warnings, but if they are listed under the `cluster_settings`, it fits better to be part of `ClusterDeprecationChecker`.
- The `DeprecationInfo` is a data class, but it has a method `from` which constructs a `DeprecationInfo.Response` instance. However, this is not a simple factory: it actually runs all the checks and also tries to assert that it is not executed on a transport thread. Considering this, I thought it might fit better in the `TransportDeprecationInfoAction`; this way all the logic is in one place and all the checkers are wired and used in the same class.
- Constructing the node settings deprecation issues requires merging the deprecation warnings of the individual nodes. We considered bringing together the execution of the remote request and the construction of the response in a new class called `NodeDeprecationChecker` that resembles the patterns of the other checker classes.
- Reinstated the `PLUGIN_CHECKERS` even though we have only one check, so other developers can more easily add their plugin checks.
- Finally, we noticed that the way we synthesise the remote requests is difficult to read and maintain because each call is nested under the previous one. We propose in this PR a different pattern that uses the `RefCountingListener` to combine the different remote calls and store their results in a container class named `PrecomputedData` (see the sketch after this list).
- **Bonus**: Removed the `LegacyIndexTemplateDeprecationChecker.java` which was not used.
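A hedged sketch of that `RefCountingListener` pattern; the fetch methods and the `PrecomputedData` setters are placeholders for the real remote calls:
```
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.support.RefCountingListener;

class DeprecationFanOutSketch {
    void precompute(ActionListener<Void> done, PrecomputedData data) {
        try (var refs = new RefCountingListener(done)) {
            fetchNodeIssues(refs.acquire(data::setNodeIssues));
            fetchTransformConfigs(refs.acquire(data::setTransformConfigs));
            // "done" completes once every acquired listener does, replacing
            // the old style where each remote call nested in the previous
            // callback.
        }
    }

    // Placeholders for the real remote calls and container class.
    void fetchNodeIssues(ActionListener<NodeIssues> listener) {}
    void fetchTransformConfigs(ActionListener<TransformConfigs> listener) {}
    record NodeIssues() {}
    record TransformConfigs() {}
    interface PrecomputedData {
        void setNodeIssues(NodeIssues issues);
        void setTransformConfigs(TransformConfigs configs);
    }
}
```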