The previous fix to ensure that each thread uses its own source provider wasn't good enough. A read of the `perThreadProvider` field could be stale and therefore return a previously created source provider. Instead, the source provider should be returned from the `provider` local variable.
This change also addresses another issue: sometimes the current docid goes backwards compared to the last seen docid, which causes problems when the synthetic source provider is used, as doc values can't advance backwards. This change addresses that by returning a new source provider if a backwards docid is detected.
Closes #118238
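For illustration, a minimal plain-Java sketch of the pattern described above (the class, interface, and field names here are hypothetical, not the actual Elasticsearch code): the field is read into a local exactly once, and a fresh provider is built whenever the thread changes or the docid moves backwards.
```
import java.util.function.Supplier;

/**
 * Illustrative sketch only: one provider per thread, rebuilt whenever the
 * requesting thread changes or the docid moves backwards.
 */
final class PerThreadSourceProvider {

    /** Hypothetical provider that remembers the last docid it served. */
    interface Provider {
        Object loadSource(int docId);
        int lastSeenDocId();
    }

    private final Supplier<Provider> factory;
    // Written by whichever thread used the provider last; a racing read may observe a stale value.
    private volatile Thread lastThread;
    private volatile Provider perThreadProvider;

    PerThreadSourceProvider(Supplier<Provider> factory) {
        this.factory = factory;
    }

    Object loadSource(int docId) {
        // Read the field once into a local, so the checks and the return use the same instance.
        Provider provider = perThreadProvider;
        if (provider == null
            || lastThread != Thread.currentThread()
            || docId < provider.lastSeenDocId()) {
            // New thread, or the docid moved backwards: doc values cannot advance backwards,
            // so build a fresh provider instead of reusing the cached one.
            provider = factory.get();
            perThreadProvider = provider;
            lastThread = Thread.currentThread();
        }
        return provider.loadSource(docId);
    }
}
```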
* ESQL: Opt into extra data stream resolution
This opts ESQL's data node request into extra data stream resolution.
* Update docs/changelog/118378.yaml
* Fix enrich cache size setting name (#117575)
The enrich cache size setting accidentally got renamed from
`enrich.cache_size` to `enrich.cache.size` in #111412. This commit
updates the enrich plugin to accept both names and deprecates the
wrong name.
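As a rough plain-Java sketch of the intended behaviour (the real change goes through the Elasticsearch settings infrastructure; the default value below is hypothetical): the original name is preferred, while the accidental name is still accepted but triggers a deprecation warning.
```
import java.util.Map;
import java.util.logging.Logger;

/** Illustrative sketch only; not the Elasticsearch Setting infrastructure. */
final class EnrichCacheSizeResolver {
    private static final Logger logger = Logger.getLogger(EnrichCacheSizeResolver.class.getName());

    static final String CACHE_SIZE = "enrich.cache_size";            // the intended name
    static final String DEPRECATED_CACHE_SIZE = "enrich.cache.size"; // accidental rename, still accepted
    static final long DEFAULT_SIZE = 1000;                            // hypothetical default

    static long resolve(Map<String, String> settings) {
        if (settings.containsKey(CACHE_SIZE)) {
            return Long.parseLong(settings.get(CACHE_SIZE));
        }
        if (settings.containsKey(DEPRECATED_CACHE_SIZE)) {
            // Accept the wrong name for backwards compatibility, but warn so users migrate.
            logger.warning("[" + DEPRECATED_CACHE_SIZE + "] is deprecated, use [" + CACHE_SIZE + "] instead");
            return Long.parseLong(settings.get(DEPRECATED_CACHE_SIZE));
        }
        return DEFAULT_SIZE;
    }
}
```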
* Remove `UpdateForV10` annotation
Today, when an ES|QL task encounters an exception, we trigger a
cancellation on the root task, causing child tasks to fail due to
cancellation. We chose not to include cancellation exceptions in the
output, as they are unhelpful and add noise during problem analysis.
However, these exceptions are still slipping through via
RefCountingListener. This change addresses the issue by introducing
ESQLRefCountingListener, ensuring that no cancellation exceptions are
returned.
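A simplified, self-contained illustration of the idea, not the actual ESQLRefCountingListener: the first non-cancellation failure is kept, cancellation exceptions are dropped, and the caller is notified once all child tasks have reported. `java.util.concurrent.CancellationException` stands in here for Elasticsearch's task-cancelled exception.
```
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Simplified illustration: a ref-counted completion listener that drops cancellation failures. */
final class FilteringRefCountingListener {
    private final AtomicInteger pending;
    private final AtomicReference<Exception> firstFailure = new AtomicReference<>();
    private final Consumer<Exception> onComplete; // receives null on success, or the first real failure

    FilteringRefCountingListener(int participants, Consumer<Exception> onComplete) {
        this.pending = new AtomicInteger(participants);
        this.onComplete = onComplete;
    }

    void onChildSuccess() {
        countDown();
    }

    void onChildFailure(Exception e) {
        // Cancellation is a consequence of the root failure, not a cause: keep it out of the result.
        if (e instanceof CancellationException == false) {
            firstFailure.compareAndSet(null, e);
        }
        countDown();
    }

    private void countDown() {
        if (pending.decrementAndGet() == 0) {
            onComplete.accept(firstFailure.get());
        }
    }
}
```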
Some mock verifies were missing, and `LicenseState#copyCurrentLicenseState(...)` wasn't always mocked.
Because of the incorrect mocking, the testGoldOrPlatinumLicenseCustomCutoffDate() test had an incorrect assertion.
* ESQL: Fix a bug in LuceneQueryExpressionEvaluator
This fixes a Lucene usage bug in `LuceneQueryExpressionEvaluator`, the
evaluator we plan to use to run things like `MATCH` when we *can't* push
it to a source operator. That'll be useful for things like:
```
FROM foo
| STATS COUNT(),
COUNT() WHERE MATCH(message, "error")
```
Explanation:
When using Lucene's `Scorer` and `BulkScorer` you must stay on the same
thread. It's a rule. Most of the time nothing bad happens if you shift
threads, but sometimes things explode and Lucene doesn't work. Driver
can shift from one thread to another - that's just how it's designed.
It's a "yield after running a while" kind of thing.
In tests we sometimes get a version of the `Scorer` and `BulkScorer`
that assert that you don't shift threads. That is what caused this test
failure.
Anyway! This builds protection into `LuceneQueryExpressionEvaluator` so
that if it *does* shift threads then it'll rebuild the `Scorer` and
`BulkScorer`. That makes the test happy and makes even the grumpiest
Lucene object happy.
Closes #116879
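A sketch of the protection described above, using Lucene's public `Weight#scorer(LeafReaderContext)` API; the holder class is made up for illustration and is not the actual `LuceneQueryExpressionEvaluator` code.
```
import java.io.IOException;

import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.search.Scorer;
import org.apache.lucene.search.Weight;

/** Illustrative sketch: rebuild the Scorer if the driver resumed on a different thread. */
final class ThreadSafeScorerHolder {
    private final Weight weight;
    private final LeafReaderContext context;

    private Scorer scorer;
    private Thread creationThread;

    ThreadSafeScorerHolder(Weight weight, LeafReaderContext context) {
        this.weight = weight;
        this.context = context;
    }

    Scorer scorer() throws IOException {
        // Lucene scorers must stay on the thread that created them. The driver may yield
        // and resume on another thread, so detect that and build a new scorer.
        // The same idea applies to BulkScorer via Weight#bulkScorer(LeafReaderContext).
        if (scorer == null || creationThread != Thread.currentThread()) {
            scorer = weight.scorer(context); // may be null if nothing matches in this segment
            creationThread = Thread.currentThread();
        }
        return scorer;
    }
}
```
Note that a rebuilt scorer starts over at the beginning of the segment, so a real implementation also has to re-position its iterator; this sketch only shows the thread check.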
The test setup for `ProfileIntegTests` is flawed: the full name of
a user can be a substring of other profile names (e.g., `SER` is a
substring of `User <random-string>-space1`). When that's passed into the
suggest call with the `*` space, we get a match on all profiles, instead
of only the one profile expected in the test, since we are matching on
e.g. `SER*`. This PR restricts the setup to avoid the wildcard profile
for that particular test.
Closes: https://github.com/elastic/elasticsearch/issues/117782
* Address mapping and compute engine runtime field issues (#117792)
This change addresses the following issues:
* Fields mapped as runtime fields not getting stored if source mode is synthetic.
* A java.io.EOFException when an ES|QL query uses multiple runtime fields that fall back to source when source mode is synthetic. (1)
* A concurrency issue when runtime fields get pushed down to Lucene. (2)
1: ValueSourceOperator can read values in a row-striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthesize source, the same SourceProvider can end up evaluating the same range of doc ids multiple times. This can then result in unexpected I/O errors at the codec level, because the same doc value instances are reused by the SourceProvider. Re-evaluating the same doc ids violates the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which document that unexpected behaviour can occur if the target docid is lower than the current docid position.
Note that this is only an issue for the synthetic source loader, not for the stored source loader, and not when executing in row-stride fashion, which sometimes happens in the compute engine and always happens in the _search API.
2: A concurrency issue arises with the source provider when the source operator executes in parallel with data partitioning set to DOC. The same SourceProvider instance then gets accessed by multiple threads concurrently, and SourceProvider implementations are not designed to handle concurrent access.
Closes #117644
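To make the `DocIdSetIterator#advance(...)` contract mentioned in (1) concrete, here is a small stand-alone Lucene example (not Elasticsearch code) with a guard that refuses to move to a target at or below the current position, which is exactly what re-evaluating the same docid range would do.
```
import java.io.IOException;

import org.apache.lucene.search.DocIdSetIterator;

/** Demonstrates the advance() contract that the columnar read path was violating. */
final class AdvanceContract {

    /**
     * advance(target) with target <= docID() is undefined behaviour, so callers that may
     * revisit a doc range must not reuse the same iterator; they need a fresh one instead.
     */
    static int advanceOrFail(DocIdSetIterator iterator, int target) throws IOException {
        if (target <= iterator.docID()) {
            throw new IllegalStateException(
                "cannot advance backwards: current=" + iterator.docID() + " target=" + target);
        }
        return iterator.advance(target);
    }

    public static void main(String[] args) throws IOException {
        DocIdSetIterator iterator = DocIdSetIterator.all(100); // matches docids 0..99
        System.out.println(advanceOrFail(iterator, 10));       // prints 10
        try {
            advanceOrFail(iterator, 5);                        // would move backwards
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```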
* fixed compile error after backporting
We identified a BWC bug in the cluster compute request. Specifically,
the indices options were not properly selected for requests from an
older querying cluster. This caused the search_shards API on the remote
cluster to use restricted indices options, leading to failures when
resolving wildcard index patterns.
Our tests didn't catch this issue because the current BWC tests for
cross-cluster queries only cover one direction: the querying cluster on
the current version and the remote cluster on a compatible version.
This PR fixes the issue and expands BWC tests to support both
directions: the querying cluster on the current version with the remote
cluster on a compatible version, and vice versa.
* Don't skip shards in coord rewrite if timestamp is an alias (#117271)
The coordinator rewrite has logic to skip indices if the provided date range
filter is not within the min and max range of all of its shards. This mechanism
is enabled for event.ingested and @timestamp fields, against searchable snapshots.
We have basic checks that such fields need to be of date field type, yet if they
are defined as an alias of a date field, their range will be empty, which indicates
that the shards are empty, and the coord rewrite logic resolves the alias and
ends up skipping shards that may have matching docs.
This commit adds an explicit check that declares the range UNKNOWN instead of EMPTY
in these circumstances. The same check is also performed in the coord rewrite logic,
so that shards are no longer skipped by mistake.
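Purely as a hypothetical model of the decision (the enum, record, and helper below are invented for illustration; the real change operates on Elasticsearch's field range metadata): only a concrete date field may ever report an EMPTY range, while an alias yields UNKNOWN so its shards stay in the search.
```
/** Hypothetical model of the check; not the actual Elasticsearch types. */
final class TimestampRangeCheck {

    enum FieldRange {
        EMPTY,    // field provably has no values: shards may be skipped
        UNKNOWN   // nothing can be concluded: shards must be searched
    }

    record FieldInfo(String name, String type, boolean isAlias) {}

    static FieldRange rangeFor(FieldInfo field, boolean hasValues) {
        // An alias of a date field carries no range information of its own. Reporting EMPTY
        // would tell the coordinator rewrite that no docs can match, so it would skip shards
        // that may in fact contain matches; UNKNOWN keeps those shards in the search.
        if (field.isAlias() || "date".equals(field.type()) == false) {
            return FieldRange.UNKNOWN;
        }
        // A populated concrete date field would return its real min/max range here.
        return hasValues ? FieldRange.UNKNOWN : FieldRange.EMPTY;
    }
}
```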
* fix compile
> **java.lang.AssertionError: Leftover exchanges ExchangeService{sinks=[veZSyrPATq2Sg83dtgK3Jg:700/3]} on node node_s4**
I looked into the test failure described in
https://github.com/elastic/elasticsearch/issues/117253. The reason we
don't clean up the exchange sink quickly is that, once a failure occurs,
we cancel the request along with all its child requests. These exchange
sinks will be cleaned up only after they become inactive, which by
default takes 5 minutes.
We could override the `esql.exchange.sink_inactive_interval` setting in
the test to remove these exchange sinks faster. However, I think we
should allow exchange requests that close exchange sinks to bypass
cancellation, enabling quicker resource cleanup than the default
inactive interval.
Closes #117253
* ESQL: fix COUNT filter pushdown (#117503)
If the `COUNT` agg has a filter applied, the filter must also be pushed down to source. This currently does not happen, but the issue is masked by two factors:
* a logical optimisation, `ExtractAggregateCommonFilter`, that extracts the filter out of the STATS entirely (and then pushes it to source from a `WHERE`);
* the physical plan optimisation implementing the push down, `PushStatsToSource`, currently only applies if there's just one agg function to push down.
However, this fix needs to be applied since:
* the defect is still present in versions prior to the introduction of `ExtractAggregateCommonFilter`;
* the defect might resurface when the restriction in `PushStatsToSource` is lifted.
Fixes #115522.
(cherry picked from commit 560e0c5d04)
* 8.17 adaptation
Currently, we have three clients fetching pages by default, each with
its own lifecycle. This can result in scenarios where more than one
request is sent to complete the remote sink. While this does not cause
correctness issues, it is inefficient, especially for cross-cluster
requests. This change tracks the status of the remote sink and tries to
send only one finish request per remote sink.
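A minimal sketch of the tracking idea, assuming a hypothetical per-sink tracker (not the actual exchange code): whichever client wins the compare-and-set is the only one that sends the finish request.
```
import java.util.concurrent.atomic.AtomicBoolean;

/** Illustrative sketch: ensure at most one "finish" request is sent per remote sink. */
final class RemoteSinkTracker {
    private final AtomicBoolean finishRequested = new AtomicBoolean();

    /**
     * Several fetch clients may observe that the remote sink is done; only the first
     * caller to flip the flag should actually send the finish request.
     */
    boolean tryMarkFinishRequested() {
        return finishRequested.compareAndSet(false, true);
    }
}
```
Each fetching client would check `tryMarkFinishRequested()` before sending its finish request, so the other clients simply skip it.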
* Add test and fix
* Update docs/changelog/117595.yaml
* Remove test which wasn't working
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Indicates whether the es.mapping.synthetic_source_fallback_to_stored_source.cutoff_date_restricted_override system property has been configured.
A follow-up to #116647
* Adjust SyntheticSourceLicenseService (#116647)
Allow gold and platinum licenses to use synthetic source for a limited time. If the start time of a license is before the cutoff date, then gold and platinum licenses will not fall back to stored source when synthetic source is used.
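A rough sketch of the cutoff-date rule, assuming a placeholder date and license-type strings (the real values and types live in SyntheticSourceLicenseService):
```
import java.time.Instant;

/** Illustrative sketch of the cutoff-date rule; the date below is a placeholder, not the real one. */
final class SyntheticSourceFallbackCheck {

    // Hypothetical placeholder cutoff; the actual value lives in SyntheticSourceLicenseService.
    private static final Instant CUTOFF_DATE = Instant.parse("2025-01-01T00:00:00Z");

    /**
     * Gold and platinum licenses that started before the cutoff date keep using synthetic
     * source; after the cutoff, those license levels fall back to stored source.
     */
    static boolean fallbackToStoredSource(String licenseType, Instant licenseStartTime) {
        boolean goldOrPlatinum = "gold".equals(licenseType) || "platinum".equals(licenseType);
        if (goldOrPlatinum && licenseStartTime.isBefore(CUTOFF_DATE)) {
            return false; // within the grace period: keep synthetic source
        }
        return goldOrPlatinum;
    }
}
```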
Co-authored-by: Nikolaj Volgushev <n1v0lg@users.noreply.github.com>
* spotless
---------
Co-authored-by: Nikolaj Volgushev <n1v0lg@users.noreply.github.com>
In the ManyShardsIT#testRejection test, we intercept exchange requests
and fail them with EsRejectedExecutionException, verifying that we
return a 400 response instead of a 500.
The issue with the current test is that, if a data-node request never
arrives because the whole request was canceled after the exchange
request failed, the leftover exchange sink remains until it times out,
which defaults to 5 minutes. This change adjusts the test to use a
single data node and ensures exchange requests are only failed after the
data-node request has arrived.
Closes #112406 Closes #112418 Closes #112424
Each data-node request involves two exchange sinks: an external one for
fetching pages from the coordinator and an internal one for node-level
reduction. Currently, the test selects one of these sinks randomly,
leading to assertion failures. This update ensures the test consistently
selects the external exchange sink.
Closes #117397
[esql] > Unexpected error from Elasticsearch: illegal_state_exception - sink exchanger for id [ruxoDDxXTGW55oIPHoCT-g:964613010] already exists.
This issue occurs when two or more clusterAliases point to the same
physical remote cluster. The exchange service assumes the destination is
unique, which is not true in this topology. This PR addresses the
problem by appending a suffix based on a monotonically increasing number,
ensuring that different exchanges are created in such cases.
Another issue arising from this behavior is that data on a remote
cluster is processed multiple times, leading to incorrect results. I can
work on the fix for this once we agree that this is an issue.
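A minimal sketch of the suffix idea, with a made-up id format (the real exchange ids look different): a shared counter guarantees uniqueness even when two aliases resolve to the same physical cluster.
```
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative sketch: make exchange ids unique when cluster aliases share a physical cluster. */
final class ExchangeIdGenerator {
    private static final AtomicLong SUFFIX = new AtomicLong();

    /**
     * Two aliases pointing at the same remote cluster would otherwise produce the same
     * session id on that cluster; the monotonically increasing suffix keeps them distinct.
     */
    static String uniqueExchangeId(String sessionId) {
        return sessionId + ":" + SUFFIX.incrementAndGet();
    }
}
```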
ES|QL doesn't work well with 500 clusters or clusters with 500 nodes.
The reason is that, during the initialization of the exchange for each
target (cluster or node), we enqueue three tasks to the thread pool
queue, which has a limit of 1000: with 500 targets that is 1,500 tasks,
exceeding the limit. This simple PR reduces it to one task per target.
I'm considering using AsyncProcessor for these requests, but that will
be a follow-up issue for later.
* Fix DeBERTa tokenizer bug caused by a bug in the normalizer that made offsets negative
* Update docs/changelog/117189.yaml
(cherry picked from commit 5500a5ec68)