18950 commits

Author SHA1 Message Date
Martijn van Groningen
2b6a7fed44
Fix issues with ReinitializingSourceProvider (#118370) (#118430)
The previous fix to ensure that each thread uses its own SearchProvider wasn't good enough. The read from the `perThreadProvider` field could be stale and therefore return a previous source provider. Instead, the source provider should be returned from the `provider` local variable.

This change also addresses another issue: sometimes the current docid goes backwards compared to the last seen docid, which causes problems when the synthetic source provider is used, because doc values can't advance backwards. This change addresses that by returning a new source provider if a backwards docid is detected.

Closes #118238
2024-12-11 22:28:43 +11:00
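The guard described in the commit above (2b6a7fed44) can be sketched roughly as follows. This is an illustrative Python sketch, not the Elasticsearch Java code: `ReinitializingProvider`, `factory`, and `provider_for` are hypothetical names, and the real `ReinitializingSourceProvider` may differ in detail.

```python
import threading
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class _PerThreadState:
    thread_id: int
    provider: Any
    last_doc_id: int


class ReinitializingProvider:
    """Hands out a forward-only provider, recreating it when reuse is unsafe."""

    def __init__(self, factory: Callable[[], Any]):
        self._factory = factory  # hypothetical: builds a fresh provider
        self._state: Optional[_PerThreadState] = None  # shared field

    def provider_for(self, doc_id: int) -> Any:
        state = self._state  # read the shared field exactly once
        current = threading.get_ident()
        if (
            state is None
            or state.thread_id != current  # created by a different thread
            or doc_id < state.last_doc_id  # doc values cannot go backwards
        ):
            state = _PerThreadState(current, self._factory(), doc_id)
            self._state = state
        else:
            state.last_doc_id = doc_id
        # Return from the local variable rather than re-reading self._state,
        # which another thread may have replaced in the meantime.
        return state.provider
```

Requesting docids 5, 7, and then 3 on one thread would reuse the provider for the first two calls and create a fresh one for the third.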
Nik Everett
32076cdff5
ESQL: Opt into extra data stream resolution (#118378) (#118390)
* ESQL: Opt into extra data stream resolution

This opts ESQL's data node request into extra data stream resolution.

* Update docs/changelog/118378.yaml
2024-12-11 09:38:54 +11:00
Joe Gallo
eac54624fb
Fix log message format bugs (#118354) (#118386)
2024-12-11 09:09:35 +11:00
Adam Demjen
02e296f286
[8.17] Update sparse text embeddings API route for Inference Service (#118368)
* Update sparse text embeddings API route for Inference Service

* Update docs/changelog/118368.yaml
2024-12-10 13:39:27 -05:00
Martijn van Groningen
d34f525e0e
Update fallback setting (#118237) (#118322)
Update synthetic_source_fallback_to_stored_source setting to be an operator only setting.
2024-12-10 23:25:43 +11:00
Niels Bauman
9cb1f96d3a
[8.17] Fix enrich cache size setting name (#117575) (#118287)
* Fix enrich cache size setting name (#117575)

The enrich cache size setting accidentally got renamed from
`enrich.cache_size` to `enrich.cache.size` in #111412. This commit
updates the enrich plugin to accept both names and deprecates the
wrong name.

* Remove `UpdateForV10` annotation
2024-12-10 05:19:31 +11:00
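The "accept both names, deprecate the wrong one" approach from the commit above (9cb1f96d3a) might look like the sketch below. It is an illustrative Python sketch, not the enrich plugin's code: `resolve_cache_size` is a hypothetical helper, and both the default of 1000 and the error on setting both names are assumptions made for the example.

```python
import warnings

CORRECT_NAME = "enrich.cache_size"
DEPRECATED_NAME = "enrich.cache.size"
DEFAULT_SIZE = 1000  # assumption for this sketch


def resolve_cache_size(settings: dict) -> int:
    """Read the cache size from either setting name, preferring the correct one."""
    has_correct = CORRECT_NAME in settings
    has_deprecated = DEPRECATED_NAME in settings
    if has_correct and has_deprecated:
        # Assumption: configuring both names is treated as an error.
        raise ValueError(
            f"both [{CORRECT_NAME}] and [{DEPRECATED_NAME}] are set; "
            f"use only [{CORRECT_NAME}]"
        )
    if has_deprecated:
        warnings.warn(
            f"[{DEPRECATED_NAME}] is deprecated, use [{CORRECT_NAME}] instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return int(settings[DEPRECATED_NAME])
    return int(settings.get(CORRECT_NAME, DEFAULT_SIZE))
```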
Nhat Nguyen
c57f1c4e56
Ignore cancellation exceptions (#117657) (#118169) (#118181)
Today, when an ES|QL task encounters an exception, we trigger a
cancellation on the root task, causing child tasks to fail due to
cancellation. We chose not to include cancellation exceptions in the
output, as they are unhelpful and add noise during problem analysis.
However, these exceptions are still slipping through via
RefCountingListener. This change addresses the issue by introducing
ESQLRefCountingListener, ensuring that no cancellation exceptions are
returned.
2024-12-07 06:04:22 +11:00
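The idea behind the commit above (c57f1c4e56), a listener that collects child failures but drops cancellation noise, can be sketched as follows. This is an illustrative Python sketch only: `TaskCancelled` and `FailureCollector` are hypothetical stand-ins, not the `ESQLRefCountingListener` implementation.

```python
import threading


class TaskCancelled(Exception):
    """Hypothetical stand-in for a task cancellation exception."""


class FailureCollector:
    """Collects failures from child tasks, skipping cancellation exceptions."""

    def __init__(self):
        self._lock = threading.Lock()
        self._failures = []

    def on_failure(self, exc: Exception) -> None:
        if isinstance(exc, TaskCancelled):
            return  # a side effect of the root failure, not useful output
        with self._lock:
            self._failures.append(exc)

    def failures(self) -> list:
        with self._lock:
            return list(self._failures)
```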
Martijn van Groningen
94581dbc1f
Fix mocking in SyntheticSourceLicenseServiceTests (#118155) (#118179)
Some mock verifies were missing and `LicenseState#copyCurrentLicenseState(...)` wasn't always mocked.

And because of incorrect mocking the testGoldOrPlatinumLicenseCustomCutoffDate() test had an incorrect assertion.
2024-12-07 04:54:11 +11:00
David Kyle
69a3f84c22
[ML] Wait for the worker service to shutdown before closing task processor (#117920) (#118165) (#118171)
2024-12-07 03:52:35 +11:00
Matteo Piergiovanni
8f4f54e888
fixes and unmutes testSearchableSnapshotShardsAreSkipped... (#118133) (#118139)
(cherry picked from commit 0a2c9fbc29)

# Conflicts:
#	muted-tests.yml
2024-12-06 12:14:15 +01:00
Martijn van Groningen
667a05dacf
Update synthetic source cutoff date (#118069) (#118086)
Updating from 01-02-2025T00:00:00UTC to 04-02-2025T00:00:00UTC
2024-12-06 03:31:46 +11:00
Nik Everett
9cff1ac28f
ESQL: Fix a bug in LuceneQueryExpressionEvaluator (#117252) (#117279) (#118078)
* ESQL: Fix a bug in LuceneQueryExpressionEvaluator

This fixes a Lucene usage bug in `LuceneQueryExpressionEvaluator`, the
evaluator we plan to use to run things like `MATCH` when we *can't* push
it to a source operator. That'll be useful for things like:
```
FROM foo
| STATS COUNT(),
        COUNT() WHERE MATCH(message, "error")
```

Explanation:
When using Lucene's `Scorer` and `BulkScorer` you must stay on the same
thread. It's a rule. Most of the time nothing bad happens if you shift
threads, but sometimes things explode and Lucene doesn't work. Driver
can shift from one thread to another - that's just how it's designed.
It's a "yield after running a while" kind of thing.

In tests we sometimes get a version of the `Scorer` and `BulkScorer`
that assert that you don't shift threads. That is what caused this test
failure.

Anyway! This builds protection into `LuceneQueryExpressionEvaluator` so
that if it *does* shift threads then it'll rebuild the `Scorer` and
`BulkScorer`. That makes the test happy and makes even the grumpiest
Lucene object happy.

Closes #116879
2024-12-06 03:04:45 +11:00
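The "rebuild on thread shift" protection described in the commit above (9cff1ac28f) can be illustrated with a small sketch. It is written in Python for brevity and is not the `LuceneQueryExpressionEvaluator` code: `ThreadAffineScorer` and `build_scorer` are hypothetical names.

```python
import threading
from typing import Any, Callable, Optional


class ThreadAffineScorer:
    """Caches a scorer, rebuilding it whenever the calling thread changes."""

    def __init__(self, build_scorer: Callable[[], Any]):
        self._build_scorer = build_scorer  # hypothetical scorer factory
        self._scorer: Optional[Any] = None
        self._owner: Optional[int] = None

    def scorer(self) -> Any:
        current = threading.get_ident()
        if self._scorer is None or self._owner != current:
            # The driver yielded and resumed on another thread:
            # discard the old scorer and build one owned by this thread.
            self._scorer = self._build_scorer()
            self._owner = current
        return self._scorer
```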
Nikolaj Volgushev
fbb42f19e8
Fix ProfileIntegTests (#117888) (#118060)
The test setup for `ProfileIntegTests` is flawed, where the full name of
a user can be a substring of other profile names (e.g., `SER` is a
substring of `User <random-string>-space1`) -- when that's passed into
suggest call with the `*` space, we get a match on all profiles, instead
of only the one profile expected in the test, since we are matching on
e.g. `SER*`. This PR restricts the setup to avoid the wildcard profile
for that particular test.

Closes: https://github.com/elastic/elasticsearch/issues/117782
2024-12-05 22:26:26 +11:00
Martijn van Groningen
7cb1cbe0f6
[8.17] Address mapping and compute engine runtime field issues (#117792) (#118048)
* Address mapping and compute engine runtime field issues (#117792)

This change addresses the following issues:

* Fields mapped as runtime fields not getting stored if source mode is synthetic.
* Address java.io.EOFException when an ES|QL query uses multiple runtime fields that fall back to source when source mode is synthetic. (1)
* Address concurrency issue when runtime fields get pushed down to Lucene. (2)

1: ValueSourceOperator can read values in row-stride or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthesize source, the same SourceProvider can evaluate the same range of doc ids multiple times. This can result in unexpected I/O errors at the codec level, because the same doc value instances are used by SourceProvider. Re-evaluating the same docids violates the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which document that unexpected behaviour can occur if the target docid is lower than the current docid position.

Note that this is only an issue for the synthetic source loader and not for the stored source loader. Nor is it an issue when executing in row-stride fashion, which sometimes happens in the compute engine and always happens in the _search API.

2: The concurrency issue arises with the source provider if the source operator executes in parallel with data partitioning set to DOC. The same SourceProvider instance then gets accessed by multiple threads concurrently. SourceProvider implementations are not designed to handle concurrent access.

Closes #117644

* fixed compile error after backporting
2024-12-05 20:52:13 +11:00
Panagiotis Bailis
d4da8ea1a9
[8.17] Fix for propagating filters from compound to inner retrievers (#117914) (#118045)
* Fix for propagating filters from compound to inner retrievers (#117914)

* Update RRFRetrieverBuilderIT.java
2024-12-05 19:43:50 +11:00
Mark Vieira
76673b41d1
Provide a mechanism to modify config files in a running test cluster (#117859) (#117931)
2024-12-04 09:15:49 +11:00
Nhat Nguyen
fdade16988
Fix BWC for ES|QL cluster request (#117865) (#117900)
We identified a BWC bug in the cluster compute request. Specifically,
the indices options were not properly selected for requests from an
older querying cluster. This caused the search_shards API on the remote
cluster to use restricted indices options, leading to failures when
resolving wildcard index patterns.

Our tests didn't catch this issue because the current BWC tests for 
cross-cluster queries only cover one direction: the querying cluster on
the current version and the remote cluster on a compatible version.

This PR fixes the issue and expands BWC tests to support both 
directions: the querying cluster on the current version with the remote
cluster on a compatible version, and vice versa.
2024-12-04 03:33:14 +11:00
Luca Cavanna
f246c80d41
[8.17] Don't skip shards in coord rewrite if timestamp is an alias (#117271) (#117855)
* Don't skip shards in coord rewrite if timestamp is an alias (#117271)

The coordinator rewrite has logic to skip indices if the provided date range
filter is not within the min and max range of all of its shards. This mechanism
is enabled for event.ingested and @timestamp fields, against searchable snapshots.

We have basic checks that such fields need to be of date field type, yet if they
are defined as alias of a date field, their range will be empty, which indicates
that the shards are empty, and the coord rewrite logic resolves the alias and
ends up skipping shards that may have matching docs.

This commit adds an explicit check that declares the range UNKNOWN instead of EMPTY
in these circumstances. The same check is also performed in the coord rewrite logic,
so that shards are no longer skipped by mistake.

* fix compile
2024-12-03 22:59:56 +11:00
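The distinction between an EMPTY and an UNKNOWN range in the commit above (f246c80d41) can be sketched as follows. This is an illustrative Python model, not the coordinator rewrite code: `RangeState`, `FieldRange`, `timestamp_range`, and `can_skip_shards` are hypothetical names, and timestamps are simplified to integers.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Tuple


class RangeState(Enum):
    EMPTY = auto()    # the shards hold no values for the field
    UNKNOWN = auto()  # the range cannot be determined
    KNOWN = auto()    # min/max bounds are available


@dataclass(frozen=True)
class FieldRange:
    state: RangeState
    bounds: Optional[Tuple[int, int]] = None


def timestamp_range(field_type: str, min_max: Optional[Tuple[int, int]]) -> FieldRange:
    """Compute the range used to decide whether shards can be skipped."""
    if field_type == "alias":
        # An alias of a date field has no range of its own: report UNKNOWN
        # rather than EMPTY so that shards are not skipped by mistake.
        return FieldRange(RangeState.UNKNOWN)
    if min_max is None:
        return FieldRange(RangeState.EMPTY)
    return FieldRange(RangeState.KNOWN, min_max)


def can_skip_shards(rng: FieldRange, query_from: int, query_to: int) -> bool:
    """Return True only when no document can match the date range filter."""
    if rng.state is RangeState.UNKNOWN:
        return False
    if rng.state is RangeState.EMPTY:
        return True
    low, high = rng.bounds
    return query_to < low or query_from > high
```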
Nhat Nguyen
63dcbb0351
Bypass cancellation when closing sinks (#117797) (#117871)
> **java.lang.AssertionError: Leftover exchanges ExchangeService{sinks=[veZSyrPATq2Sg83dtgK3Jg:700/3]} on node node_s4**

I looked into the test failure described in
https://github.com/elastic/elasticsearch/issues/117253. The reason we
don't clean up the exchange sink quickly is that, once a failure occurs,
we cancel the request along with all its child requests. These exchange
sinks will be cleaned up only after they become inactive, which by
default takes 5 minutes.

We could override the `esql.exchange.sink_inactive_interval` setting in
the test to remove these exchange sinks faster. However, I think we
should allow exchange requests that close exchange sinks to bypass
cancellation, enabling quicker resource cleanup than the default
inactive interval.

Closes #117253
2024-12-03 16:58:38 +11:00
Nik Everett
0a14f27f27
ESQL: Limit size of Literal#toString (#117842) (#117849)
This `toString` is rendered in task output and progress. Let's make sure it's not massive.
2024-12-03 07:27:26 +11:00
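Capping a string representation, as the commit above (0a14f27f27) does for `Literal#toString`, can be sketched in a few lines. This Python sketch is illustrative only: `literal_to_string`, the limit of 500 characters, and the `...` suffix are assumptions, not values taken from the commit.

```python
MAX_LENGTH = 500  # hypothetical limit chosen for this sketch


def literal_to_string(value: object, max_length: int = MAX_LENGTH) -> str:
    """Render a value for task output, truncating very long representations."""
    text = str(value)
    if len(text) <= max_length:
        return text
    return text[:max_length] + "..."
```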
Nhat Nguyen
9d7213b83b
Fix CCS cancellation test (#117790) (#117800)
We should have checked that all drivers were canceled, not cancellable
(which is always true), before unblocking the compute tasks.

Closes #117568
2024-12-01 18:02:33 +11:00
Nikolaj Volgushev
fcd78f85cc
[8.17] Use feature flags in OperatorPrivilegesIT (#117491) (#117630)
* Resolve

* Add missing actions
2024-11-28 23:57:18 +11:00
Bogdan Pintea
80a8102a82
ESQL: fix COUNT filter pushdown (#117503) (#117654)
* ESQL: fix COUNT filter pushdown (#117503)

If a `COUNT` agg has a filter applied, the filter must also be pushed down to source. This currently does not happen, but the issue is masked by two factors:
* a logical optimisation, `ExtractAggregateCommonFilter`, that extracts the filter out of the STATS entirely (and then pushes it to source from a `WHERE`);
* the physical plan optimisation implementing the push down, `PushStatsToSource`, currently only applies if there's just one agg function to push down.

However, this fix needs to be applied since:
* it's still present in versions prior to `ExtractAggregateCommonFilter` introduction;
* the defect might resurface when the restriction in `PushStatsToSource` is lifted.

Fixes #115522.

(cherry picked from commit 560e0c5d04)

* 8.17 adaptation
2024-11-28 23:42:43 +11:00
Martijn van Groningen
7ed32c29a6
[8.17] Add source mode stats to MappingStats (#117697)
* Add source mode stats to MappingStats (#117463)

* update bwc logic for 8.17
2024-11-28 22:37:46 +11:00
Martijn van Groningen
d6fdea4ea8
Update synthetic source legacy license cutoff date. (#117658) (#117691)
Update default cutoff date from 12-12-2024T00:00 UTC to 01-02-2025T00:00 UTC.
2024-11-28 21:12:12 +11:00
Nhat Nguyen
b5a352e06b
Try to finish remote sink once (#117592) (#117670)
Currently, we have three clients fetching pages by default, each with
its own lifecycle. This can result in scenarios where more than one
request is sent to complete the remote sink. While this does not cause
correctness issues, it is inefficient, especially for cross-cluster
requests. This change tracks the status of the remote sink and tries to
send only one finish request per remote sink.
2024-11-28 09:05:38 +11:00
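Tracking the sink status so that only one finish request goes out, as described in the commit above (b5a352e06b), can be sketched like this. The Python below is illustrative and not the exchange service code: `RemoteSink` and `send_finish` are hypothetical names.

```python
import threading
from typing import Callable


class RemoteSink:
    """Lets several fetching clients share one sink but finish it only once."""

    def __init__(self, send_finish: Callable[[], None]):
        self._send_finish = send_finish  # hypothetical: sends the finish request
        self._lock = threading.Lock()
        self._finish_sent = False

    def finish(self) -> bool:
        """Send the finish request unless another client already did."""
        with self._lock:
            if self._finish_sent:
                return False
            self._finish_sent = True
        # Send outside the lock so a slow remote call does not block others.
        self._send_finish()
        return True
```

With three clients calling `finish()` concurrently, only the first caller to take the lock sends the request; the others return `False` without any network traffic.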
Max Hniebergall
24b6cd1a65
[ML] Fix for Deberta tokenizer when input sequence exceeds 512 tokens (#117595) (#117600)
* Add test and fix

* Update docs/changelog/117595.yaml

* Remove test which wasn't working

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-11-28 01:25:48 +11:00
Martijn van Groningen
52318d4beb
[8.17] Add has_custom_cutoff_date to logsdb usage. (#117550) (#117614)
Indicates whether es.mapping.synthetic_source_fallback_to_stored_source.cutoff_date_restricted_override system property has been configured.

A follow up from #116647
2024-11-28 00:45:13 +11:00
Luigi Dell'Aquila
f8811a1e9e
ES|QL: fix stats by constant expression with alias (#117551) (#117613)
2024-11-27 20:04:34 +11:00
Martijn van Groningen
75a5ad85fc
Adjust SyntheticSourceLicenseService (#116647) (#117537)
* Adjust SyntheticSourceLicenseService (#116647)

Allow gold and platinum licenses to use synthetic source for a limited time. If the start time of a license is before the cutoff date, then gold and platinum licenses will not fall back to stored source if synthetic source is used.

Co-authored-by: Nikolaj Volgushev <n1v0lg@users.noreply.github.com>

* spotless

---------

Co-authored-by: Nikolaj Volgushev <n1v0lg@users.noreply.github.com>
2024-11-26 21:06:14 +11:00
Nhat Nguyen
952df62bc2
Deprecate source mode in mappings (#117177) (#117527)
Backport of #116689 to 8.17

This change deprecates _source.mode in mappings, replacing it with
the index.mapping.source.mode index setting.
2024-11-25 20:26:52 -08:00
Martijn van Groningen
6b6592ebee
Stop using _source.mode attribute in traces-otel builtin template (#117487) (#117492)
The traces-otel@mappings component template is configured to use logsdb. No need to configure source mode separately.
2024-11-26 06:54:23 +11:00
Rene Groeschke
20a78a18e9
[8.17] [Gradle] Remove static use of BuildParams (#115122) (#117433)
* [Gradle] Remove static use of BuildParams (#115122)

Static fields don't do well in Gradle with configuration cache enabled.

- Use buildParams extension in build scripts
- Keep BuildParams.ci for now for easy serverless migration
- Tweak testing doc

(cherry picked from commit 13c8aaeffa)

# Conflicts:
#	TESTING.asciidoc
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/RestTestBasePlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/compat/compat/AbstractYamlRestCompatTestPlugin.java
#	build.gradle
#	modules/ingest-geoip/qa/full-cluster-restart/build.gradle
#	qa/mixed-cluster/build.gradle
#	x-pack/plugin/ent-search/qa/full-cluster-restart/build.gradle
#	x-pack/plugin/eql/qa/rest/build.gradle
#	x-pack/plugin/fleet/qa/rest/build.gradle
#	x-pack/plugin/kql/build.gradle
#	x-pack/plugin/mapper-unsigned-long/build.gradle
#	x-pack/plugin/ml/qa/multi-cluster-tests-with-security/build.gradle
#	x-pack/plugin/security/qa/multi-cluster/build.gradle
#	x-pack/plugin/sql/qa/jdbc/build.gradle
#	x-pack/plugin/transform/qa/multi-cluster-tests-with-security/build.gradle

* Some cleanup

* Update build.gradle

fix buildparams access
2024-11-25 18:29:26 +01:00
David Kyle
c64620065a
[ML] Explicitly set chunking settings in preconfigured endpoints (#117327) (#117501)
2024-11-26 04:26:26 +11:00
Oleksandr Kolomiiets
d3923d765c
Migrate mapper-related modules to internal-*-rest-test (#117298) (#117407)
(cherry picked from commit 2b8e4e727c)

# Conflicts:
#	modules/mapper-extras/build.gradle
#	plugins/mapper-annotated-text/build.gradle
#	plugins/mapper-murmur3/build.gradle
#	x-pack/plugin/mapper-unsigned-long/build.gradle
#	x-pack/plugin/mapper-version/build.gradle
#	x-pack/plugin/wildcard/build.gradle
2024-11-25 08:01:16 -08:00
Martijn van Groningen
c7ed1563eb
Stop using _source.mode attribute in builtin templates (#117448) (#117483)
Use the index.mapping.source.mode index setting in builtin templates instead of the deprecated _source.mode mapping attribute.
2024-11-26 02:04:37 +11:00
Adam Demjen
f0166b04f6
[8.17] Add version prefix to Inference Service API path (#117366)
* Add version prefix to Inference Service API path

* Update docs/changelog/117366.yaml
2024-11-25 11:13:08 +01:00
Nikolaj Volgushev
060d42747f
Distinguish LicensedFeature by family field (#116809) (#117347)
This PR fixes unintentional licensed feature overlaps for features with
the same name but different family fields.
2024-11-25 21:06:23 +11:00
Nhat Nguyen
11283c08b9
Fix leftover exchange in ManyShardsIT (#117309) (#117443)
In the ManyShardsIT#testRejection test, we intercept exchange requests
and fail them with EsRejectedExecutionException, verifying that we
return a 400 response instead of a 500.

The issue with the current test is that if a data-node request never
arrives because the whole request was canceled after the exchange
request failed, the leftover exchange sink remains until it times out,
which defaults to 5 minutes. This change adjusts the test to use a
single data node and ensures exchange requests are only failed after the
data-node request has arrived.

Closes #112406
Closes #112418
Closes #112424
2024-11-25 15:23:14 +11:00
Nhat Nguyen
2734b8c46d
Fix testCancelRequestWhenFailingFetchingPages (#117437) (#117439)
Each data-node request involves two exchange sinks: an external one for
fetching pages from the coordinator and an internal one for node-level
reduction. Currently, the test selects one of these sinks randomly,
leading to assertion failures. This update ensures the test consistently
selects the external exchange sink.

Closes #117397
2024-11-25 13:50:14 +11:00
Nhat Nguyen
f4c53b8509
Fix CCS exchange when multi cluster aliases point to same cluster (#117297) (#117388)
[esql] > Unexpected error from Elasticsearch: illegal_state_exception - sink exchanger for id [ruxoDDxXTGW55oIPHoCT-g:964613010] already exists.

This issue occurs when two or more clusterAliases point to the same 
physical remote cluster. The exchange service assumes the destination is
unique, which is not true in this topology. This PR addresses the
problem by appending a suffix using a monotonically increasing number,
ensuring that different exchanges are created in such cases.

Another issue arising from this behavior is that data on a remote 
cluster is processed multiple times, leading to incorrect results. I can
work on the fix for this once we agree that this is an issue.
2024-11-23 08:23:56 +11:00
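Appending a monotonically increasing suffix, as in the commit above (f4c53b8509), keeps exchange ids distinct even when two cluster aliases resolve to the same remote cluster. The Python sketch below is illustrative: `ExchangeIdGenerator` and the `session[n]` id format are hypothetical, not the format used by the exchange service.

```python
import itertools
import threading


class ExchangeIdGenerator:
    """Derives a unique exchange id per destination from a shared session id."""

    def __init__(self):
        self._counter = itertools.count()
        self._lock = threading.Lock()

    def next_id(self, session_id: str) -> str:
        with self._lock:
            suffix = next(self._counter)
        # Hypothetical format: the suffix alone guarantees uniqueness,
        # regardless of which alias the request is addressed to.
        return f"{session_id}[{suffix}]"
```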
Oleksandr Kolomiiets
711eb919ed
Fix constant_keyword test run and properly test recent behavior change (#117284) (#117370)
2024-11-23 06:27:14 +11:00
Mike Pellegrini
c589135491
Always Emit Inference ID in Semantic Text Mapping (#117294) (#117344)
2024-11-23 02:06:05 +11:00
Bogdan Pintea
193417f420
Add docs for aggs filtering (#116681) (#117334)
Add documentation for aggs filtering (the WHERE in STATS command).

Fixes: #115083
2024-11-23 00:30:32 +11:00
Slobodan Adamović
210d6ad5fd
Upgrade Bouncy Castle FIPS dependencies (#112989) (#117320)
This PR updates `bc-fips` and `bctls-fips` dependencies to the latest
minor versions.
2024-11-23 00:03:07 +11:00
Luigi Dell'Aquila
32eeb6b279
ES|QL: fix validation of SORT by aggregate functions (#117316) (#117325)
2024-11-22 23:21:00 +11:00
Nhat Nguyen
0a70433209
Limit thread queue during init in ExchangeSource (#117273) (#117285)
ES|QL doesn't work well with 500 clusters or clusters with 500 nodes. 
The reason is that we enqueue three tasks to the thread pool queue,
which has a limit of 1000, during the initialization of the exchange for
each target (cluster or node). This simple PR reduces it to one task.
I'm considering using AsyncProcessor for these requests, but that will
be a follow-up issue for later.
2024-11-22 10:11:18 +11:00
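The arithmetic behind the commit above (0a70433209) is easy to check. The Python sketch below uses only the numbers stated in the message (500 targets, a queue limit of 1000, three tasks per target before and one after); `enqueued_tasks` is a hypothetical helper, and the sketch ignores any other tasks that may already occupy the queue.

```python
QUEUE_LIMIT = 1000  # thread pool queue limit quoted in the commit message


def enqueued_tasks(targets: int, tasks_per_target: int) -> int:
    """Tasks placed on the queue while initializing the exchange."""
    return targets * tasks_per_target


before = enqueued_tasks(500, 3)  # 1500 tasks: exceeds the queue limit
after = enqueued_tasks(500, 1)   # 500 tasks: fits within the queue limit
print(before > QUEUE_LIMIT, after <= QUEUE_LIMIT)
```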
Oleksandr Kolomiiets
13e5d4ede9
Change synthetic source logic for constant_keyword (#117182) (#117288)
* Change synthetic source logic for constant_keyword

* Update docs/changelog/117182.yaml

(cherry picked from commit 3a1bc05ad0)
2024-11-21 13:57:31 -08:00
Max Hniebergall
5341a131e9
[ML] Fix deberta tokenizer bug caused by bug in normalizer (#117189) (#117260)
* Fix deberta tokenizer bug caused by bug in normalizer which caused offsets to be negative

* Update docs/changelog/117189.yaml

(cherry picked from commit 5500a5ec68)
2024-11-22 03:06:19 +11:00
Carlos Delgado
0304b92ccf
ESQL - match operator included in non-snapshot builds (#116819) (#117227)
(cherry picked from commit ea4b41fca8)
2024-11-21 19:34:50 +11:00