16434 commits

Author SHA1 Message Date
Tim Grein
f054dca0b3 Add working dense text embeddings integration with default endpoint. Some tests WIP 2025-06-23 13:19:47 +02:00
Luca Cavanna
a6ffeeeb71
Remove outdated TODO from TopDocsAndMaxScore (#126386)
There are no plans to remove max_score, as highlighted in #32981 .
This commit removes a related TODO given we don't plan on addressing it.
2025-04-08 10:41:38 +02:00
Ignacio Vera
47e352fda0
Throw nicer exception in SpanBooleanQueryRewriteWithMaxClause (#126387)
Throw an ElasticsearchStatusException with a RestStatus.BAD_REQUEST code instead of a generic RuntimeException.
2025-04-08 06:42:00 +02:00
Yang Wang
997a7b8fab
FileWatchingService should not throw for missing file (#126264)
A missing file is a valid state for FileWatchingService, so the
exception should be suppressed.
2025-04-08 09:56:35 +10:00
Ryan Ernst
991e80d56e
Remove unnecessary generic params from action classes (#126364)
Transport actions have associated request and response classes. However,
the base type restrictions are not necessary to duplicate when creating
a map of transport actions. Relatedly, the ActionHandler class doesn't
actually need strongly typed action type and classes since they are lost
when shoved into the node client map. This commit removes these type
restrictions and generic parameters.
2025-04-07 16:22:56 -07:00
David Turner
cedcb5ccfe
Replace TransportResponse.Empty with ActionResponse.Empty (#126400)
No need to distinguish these things any more, we can just use
`ActionResponse.Empty` everywhere.
2025-04-08 06:58:06 +10:00
Joe Gallo
bead858ccd
Correctly handle nulls in nested paths in the remove processor (#126417) 2025-04-07 16:54:07 -04:00
Jeremy Dahlgren
79297438ed
Avoid extra allocations in RestGetAliasesAction (#126177)
When no explicit aliases are provided in the call there is no need
to collect the index names or aliases into HashSets if they won't be
used. Also fixed where the index name was being added for each
loop of the alias list.
2025-04-07 15:02:05 -04:00
David Turner
5dc7ab77b3
Remove usages of TransportMessage (#126375)
This base class is kinda pointless: everywhere it's used we can either
be more specific (e.g. choosing between `TransportRequest` or
`TransportResponse`) or more general (e.g. choosing `Writeable`). This
commit removes all the usages apart from the `extends` clauses of its
direct descendants.
2025-04-08 03:50:28 +10:00
Oleksandr Kolomiiets
21ff72bef4
Use FallbackSyntheticSourceBlockLoader for text fields (#126237) 2025-04-07 09:32:35 -07:00
Moritz Mack
0360db2cd0
Improved reproduction of scaling EsExecutors bug #124667 to work with max pool size > 1. (#125045)
Relates to #124867, ES-10640
2025-04-07 15:40:25 +02:00
Pete Gillin
549fddb348
ES-10037 Tweak wording of autosharding logs (#126339)
ES-10037 #comment Tweaked wording of logging in https://github.com/elastic/elasticsearch/pull/126339
2025-04-07 13:15:52 +01:00
David Turner
527d2a203b
Improve handling of empty response (#125562)
Today `ActionResponse$Empty` implements `ToXContentObject`, but yields
no bytes of content when serialized which creates an invalid JSON
response. This commit removes the bogus interface and adjusts the
affected REST APIs to send a `text/plain` response instead.
2025-04-07 12:10:07 +01:00
Slobodan Adamović
0b09506b54
Improve error handling during index expressions resolving (#126018)
The `InvalidIndexNameException` exception was wrapped in an `ElasticsearchSecurityException`, which returns HTTP `403` status.

This exception (along with newly introduced `InvalidSelectorException` and `UnsupportedSelectorException`) can be raised during index expression resolving due to an invalid user input and should result in HTTP `400` response instead.

This PR changes exception handling to avoid wrapping them in the `ElasticsearchSecurityException`.
2025-04-06 07:09:24 +02:00
Nhat Nguyen
fbfc707d95
Support serialization of aggregate metric double literal (#126352)
To backport the newly introduced AggregateMetricDoubleLiteral to 8.x, we need
to override the supportsVersion method instead of getMinimalSupportedVersion.
2025-04-04 20:50:10 -07:00
Tim Brooks
68bc2b8600
Adjust method to transition split state (#126179)
Make the method to transition to handoff be generic and support multiple
states.
2025-04-04 15:20:07 -06:00
Jordan Powers
4c174a891f
Use Lucene101 postings format by default (#126080)
Update the PerFieldFormatSupplier so that new standard indices use the
Lucene101PostingsFormat instead of the current default ES812PostingsFormat.

Currently, use of the new codec is gated behind a feature flag.
2025-04-04 12:41:27 -07:00
Pete Gillin
78aff25b05
ES-10037 Periodic logging in autosharding service (#126171)
This enhances DataStreamAutoShardingService so that it periodically
logs at INFO level the most 'interesting' results it has produced in
the last period.

In this PR, the most 'interesting' results are considered to be the
ones with the highest load, keeping track separately of the top 10
which resulted in an increase decision and the top 10 which did
not. In practice, increase recommendations are sufficiently rare that
the top 10 will often be 'all of them', and they are all potentially
interesting (but we cap it to protect against taking up an unbounded
amount of memory). We keep the high load non-increase recommendations
as well, since these are likely to be the interesting ones to look at
when investigating why some data stream did not get an increase shards
recommendation when we might have expected it.
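
The bounded top-10 tracking described above can be illustrated with a minimal sketch. This is not the code from the PR: it is written in Python rather than Java, and the names `TopNByLoad`, `PeriodicDecisionLog`, `record`, and `flush` are hypothetical. It only assumes what the message states, namely that the highest-load results are kept separately for increase and non-increase decisions, with a fixed cap so memory stays bounded.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class TopNByLoad:
    """Keeps only the n highest-load results seen so far (bounded memory)."""

    n: int = 10
    _heap: list = field(default_factory=list)  # min-heap of (load, seq, name)
    _seq: int = 0  # unique tie-breaker, so equal loads never compare names

    def offer(self, name: str, load: float) -> None:
        self._seq += 1
        entry = (load, self._seq, name)
        if len(self._heap) < self.n:
            heapq.heappush(self._heap, entry)
        elif load > self._heap[0][0]:
            # Evict the current lowest-load entry in favour of the new one.
            heapq.heapreplace(self._heap, entry)

    def drain(self) -> list:
        """Return (name, load) pairs, highest load first, and reset."""
        out = [(name, load) for load, _, name in sorted(self._heap, reverse=True)]
        self._heap.clear()
        return out


class PeriodicDecisionLog:
    """Hypothetical holder: one bounded tracker per decision category."""

    def __init__(self, n: int = 10):
        self.increase = TopNByLoad(n)
        self.non_increase = TopNByLoad(n)

    def record(self, data_stream: str, load: float, is_increase: bool) -> None:
        tracker = self.increase if is_increase else self.non_increase
        tracker.offer(data_stream, load)

    def flush(self) -> dict:
        # Called once per logging period; returns what would be logged.
        return {
            "increase": self.increase.drain(),
            "non_increase": self.non_increase.drain(),
        }
```

Adding another category, as the message suggests for decrease decisions, would amount to one more `TopNByLoad` instance in this sketch.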

The mechanism would be easily extended to add in other categories. For
example, we may find that there are some categories of decrease
decisions we consider 'interesting'. (N.B. The rollover service will
not roll over just because the auto-sharding service recommended
down-sharding — down-sharding only happens if the rollover was going
to happen for some other reason (age, size, etc.) So it's normal for
the auto-sharding service to return decrease recommendations for the
same data streams every 5 minutes until those other conditions are
met. Which makes things a bit more complicated.) This PR just covers
the cases that seem likely to be useful in the support cases we have
seen.

The existing DEBUG and TRACE log lines in the service are replaced
with a single DEBUG log which pulls together all the data. This is an
improvement, since at the moment it is hard to figure out from the
logs which lines refer to the same data stream (they are interleaved,
and don't all include the data stream name).

The write load field in the AutoShardingResult was unused, and is
removed.

ES-10037 #comment Improved logging in https://github.com/elastic/elasticsearch/pull/126171
2025-04-04 19:08:39 +01:00
Alexey Ivanov
fd7efe587e
[main] Move system indices migration to migrate plugin (#125437)
* [main] Move system indices migration to migrate plugin

It seems the best way to fix #122949 is to use existing data stream reindex API. However, this API is located in the migrate x-pack plugin. This commit moves the system indices migration logic (REST handlers, transport actions, and task) to the migrate plugin.

Port of #123551

* [CI] Auto commit changes from spotless

* Fix compilation

* Fix tests

* Fix test

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-04-04 18:49:38 +01:00
Armin Braun
c06cbb6496
Optimize slicing ReleasableBytesReference to avoid needless retention of unused pages (#126284)
We can make slicing things like search hits where we cut many small
buffers out of a large composite reference a lot more efficient in some
cases by making the slice of a releasable reference itself releasable.
This fixes the current situation where we could have composite buffer
of many MB that is made up of e.g. 16k chunks but that we would retain
in full if we were to slice even a single byte out of it in any position.
2025-04-04 18:16:04 +02:00
Christoph Büscher
e55b97270b
Fix failing SearchServiceSingleNodeTests testSlicingBehaviourForParallelCollection (#126300)
We need to register the settings update consumer for
QUERY_PHASE_PARALLEL_COLLECTION_ENABLED regardless of the
BATCHED_QUERY_PHASE_FEATURE_FLAG feature flag.

Closes #125899
2025-04-04 16:50:49 +02:00
Jeremy Dahlgren
4c979aa365
Accumulate compute() calls and iterations between convergences in DesiredBalanceComputer (#126008)
Add tracking of the number of compute() calls and total iterations
between convergences in the DesiredBalanceComputer, along with the
time since the last convergence.  These are included in the log
message when the computer doesn't converge.

Closes #100850.
2025-04-04 08:33:17 -04:00
Armin Braun
8199c831b2
Revert "Flip batched execution flag to false for test (#126228)" (#126281)
This reverts commit d9bc3b97eb.
2025-04-04 12:54:30 +01:00
Mikhail Berezovskiy
70654a3633
Add GCS telemetry with ThreadLocal (#125452) 2025-04-03 23:46:06 -07:00
Kathleen DeRusso
e7d4a28a87
Support configurable chunking in semantic_text fields (#121041)
* test

* Revert "test"

This reverts commit 9f4e2adba0.

* Refactor InferenceService to allow passing in chunking settings

* Add chunking config to inference field metadata and store in semantic_text field

* Fix test compilation errors

* Hacking around trying to get ingest to work

* Debugging

* [CI] Auto commit changes from spotless

* POC works and update TODO to fix this

* [CI] Auto commit changes from spotless

* Refactor chunking settings from model settings to field inference request

* A bit of cleanup

* Revert a bunch of changes to try to narrow down what broke CI

* test

* Revert "test"

This reverts commit 9f4e2adba0.

* Fix InferenceFieldMetadataTest

* [CI] Auto commit changes from spotless

* Add chunking settings back in

* Update builder to use new map

* Fix compilation errors after merge

* Debugging tests

* debugging

* Cleanup

* Add yaml test

* Update tests

* Add chunking to test inference service

* Trying to get tests to work

* Shard bulk inference test never specifies chunking settings

* Fix test

* Always process batches in order

* Fix chunking in test inference service and yaml tests

* [CI] Auto commit changes from spotless

* Refactor - remove convenience method with default chunking settings

* Fix ShardBulkInferenceActionFilterTests

* Fix ElasticsearchInternalServiceTests

* Fix SemanticTextFieldMapperTests

* [CI] Auto commit changes from spotless

* Fix test data to fit within bounds

* Add additional yaml test cases

* Playing with xcontent parsing

* A little cleanup

* Update docs/changelog/121041.yaml

* Fix failures introduced by merge

* [CI] Auto commit changes from spotless

* Address PR feedback

* [CI] Auto commit changes from spotless

* Fix predicate in updated test

* Better handling of null/empty ChunkingSettings

* Update parsing settings

* Fix errors post merge

* PR feedback

* [CI] Auto commit changes from spotless

* PR feedback and fix Xcontent parsing for SemanticTextField

* Remove chunking settings check to use what's passed in from sender service

* Fix some tests

* Cleanup

* Test failure whack-a-mole

* Cleanup

* Refactor to handle memory optimized bulk shard inference actions - this is ugly but at least it compiles

* [CI] Auto commit changes from spotless

* Minor cleanup

* A bit more cleanup

* Spotless

* Revert change

* Update chunking setting update logic

* Go back to serializing maps

* Revert change to model settings - source still errors on missing model_id

* Fix updating chunking settings

* Look up model if null

* Fix test

* Work around https://github.com/elastic/elasticsearch/issues/125723 in semantic text field serialization

* Add BWC tests

* Add chunking_settings to docs

* Refactor/rename

* Address minor PR feedback

* Add test case for null update

* PR feedback - adjust refactor of chunked inputs

* Refactored AbstractTestInferenceService to return offsets instead of just Strings

* [CI] Auto commit changes from spotless

* Fix tests where chunk output was of size 3

* Update mappings per PR feedback

* PR Feedback

* Fix problems related to merge

* PR optimization

* Fix test

* Delete extra file

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-04-03 17:45:26 -04:00
Sam Xiao
b6c6db9861
Add multi-project support for health indicator data_stream_lifecycle (#126056) 2025-04-03 16:26:22 -04:00
Nhat Nguyen
b0d7c7d102
Add logical/physical plans time-series aggregate (#126178)
We need to store extra information for Aggregate and AggregateExec for 
time-series aggregations. Previously, I added a type (standard or
time_series), but this was not enough. This PR removes it and replaces
it with extensions of Aggregate and AggregateExec. I considered adding
an extra map of metadata to Aggregate and AggregateExec, but this
approach seems simpler.
2025-04-03 11:36:43 -07:00
Ben Chaplin
9f6eb1d4e3
Log stack traces on data nodes before they are cleared for transport (#125732)
We recently cleared stack traces on data nodes before transport back to the coordinating node when error_trace=false to reduce unnecessary data transfer and memory on the coordinating node (#118266). However, all logging of exceptions happens on the coordinating node, so stack traces disappeared from any logs. This change logs stack traces directly on the data node when error_trace=false.
2025-04-03 13:45:09 -04:00
Armin Braun
d9bc3b97eb
Flip batched execution flag to false for test (#126228)
Disabling this to illustrate a point in nightly ccs runs.
2025-04-03 17:25:16 +01:00
kanoshiou
30b2a1f729
ESQL: Enhanced DATE_TRUNC with arbitrary intervals (#120302)
Originally, `DATE_TRUNC` only supported 1-month and 3-month intervals for months, and 1-year interval for years, while arbitrary intervals were supported for weeks and days. This PR adds support for `DATE_TRUNC` with arbitrary month and year intervals. 
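
To make "arbitrary month intervals" concrete, here is a minimal Python sketch of truncating a date to an N-month bucket. It is an illustration only, not the ES|QL implementation: the function name `trunc_months` is hypothetical, and the choice to count buckets from January 1970 is an assumption that the commit message does not confirm.

```python
from datetime import date


def trunc_months(d: date, interval: int) -> date:
    """Truncate d to the start of its interval-month bucket.

    Assumption (not from the commit): buckets are aligned to 1970-01-01.
    """
    if interval < 1:
        raise ValueError("interval must be a positive number of months")
    # Months elapsed since January 1970 (negative for earlier dates).
    months = (d.year - 1970) * 12 + (d.month - 1)
    # Floor to the nearest multiple of the interval.
    start = (months // interval) * interval
    return date(1970 + start // 12, start % 12 + 1, 1)
```

Under that assumption, a year interval is simply a month interval that is a multiple of 12, for example `trunc_months(d, 24)` for two-year buckets.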

Closes #120094
2025-04-03 16:55:56 +02:00
Mary Gouseti
95257bbf07
Make data stream options multi-project aware (#126141) 2025-04-03 14:33:40 +03:00
Albert Zaharovits
0faa960aa2
Fix IndexStatsIT (#126113)
Ensures proper cleanup in the testThrottleStats test.

Fixes #125910 #125907 #125912
2025-04-03 14:09:38 +03:00
Martijn van Groningen
717d00d96d
Fix TsdbDocValueBwcTests test failure. (#126182)
Don't perform version check assertion in TsdbDocValueBwcTests if security manager is active.

By default, with JVM version 24, entitlements are used instead of the security manager, and assertOldDocValuesFormatVersion() / assertNewDocValuesFormatVersion() work as expected.

Making these methods work with the security manager would require granting the server's entire test codebase suppressAccessChecks and suppressAccessChecks privileges. This is undesirable from a security manager perspective. Instead, only assert doc values format checks if the security manager isn't active, which is always the case when JVM version 24 or higher is used.

Closes #126174
2025-04-03 12:19:59 +02:00
Albert Zaharovits
ecaa0b1f65
Fix ThreadPoolMergeSchedulerTests testSchedulerCloseWaitsForRunningMerge (#126110)
Fixes #125236
2025-04-03 11:11:55 +03:00
Pawan Kartik
c54c3afb42
Add transport version entry for backport (#126168) 2025-04-02 20:40:34 +01:00
Oleksandr Kolomiiets
daed78308c
Fix KeywordFieldBlockLoaderTests (#126146) 2025-04-02 11:59:22 -07:00
Albert Zaharovits
01a1f454e1
Unmute ThreadContextTests testDropWarningsExceedingMaxSettings (#123629) 2025-04-02 21:10:29 +03:00
Stanislav Malyshev
d84b65d38b
Ensure the set of remote clusters is consistent over the life of ES|QL query (#126000)
* Ensure the set of remote clusters is consistent over the life of ES|QL query
2025-04-02 11:46:04 -06:00
Niels Bauman
483f97915c
Run TransportGetIndexAction on local node (#125652)
This action solely needs the cluster state, so it can run on any node.
Since this is the last class/action that extends the `ClusterInfo`
abstract classes, we remove those classes too as they're not required
anymore.

Relates #101805
2025-04-02 18:41:35 +01:00
Pawan Kartik
e4fb22c4f3
ES|QL: Wrap remote errors with cluster name to provide more context (#123156)
Wrap remote errors with cluster name to provide more context

Previously, if a remote encountered an error, the user would see a top-level error that provided no context about which remote ran into it. Now, such errors are wrapped in a separate remote exception whose message clearly specifies the name of the remote cluster; the error that occurred is the cause of this remote exception.
2025-04-02 18:08:20 +01:00
Niels Bauman
509a12058f
Run TransportGetLifecycleAction on local node (#126002)
This action solely needs the cluster state, so it can run on any node.

Relates #101805
2025-04-02 16:35:25 +01:00
Albert Zaharovits
e93460040d
Fix testMergeSourceWithFollowUpMergesRunSequentially (#126050)
Fixes #125639
Relates #120869
2025-04-02 17:13:46 +03:00
Niels Bauman
eb4d64f94a
Run TransportGetSettingsAction on local node (#126051)
This action solely needs the cluster state, so it can run on any node.
Additionally, it needs to be cancellable to avoid doing unnecessary work
after a client failure or timeout.

Relates #101805
2025-04-02 15:05:31 +01:00
Albert Zaharovits
edc5379e6a
Fix ThreadPoolMergeSchedulerStressTestIT testMergingFallsBehindAndThenCatchesUp (#125956)
We don't know how many semaphore merge permits we need to release, or how many are already released.

Fixes #125744
2025-04-02 16:07:18 +03:00
Martijn van Groningen
52d68392d0
Prepare tsdb doc values format for merging optimizations. (#125933)
This commit contains the following changes:

- The numDocsWithField field moved from SortedNumericEntry to NumericEntry, making this statistic always available.
- Store jump table after values in ES87TSDBDocValuesConsumer#writeField(...). Currently it is stored before storing values. This will allow us later to iterate over the SortedNumericDocValues once. When merging, this is expensive as a merge sort on the fly is being executed.

This change will allow all the optimizations that are listed in #125403
2025-04-02 13:39:41 +02:00
Albert Zaharovits
1f0551a995
Slack merge throttling params for fewer merge tasks (#126016)
The intent here is to aim for fewer to-do merges enqueued for execution,
and to unthrottle disk IO at a faster rate when the queue grows longer.
Overall this results in less merge disk throttling.

Relates https://github.com/elastic/elasticsearch-benchmarks/issues/2437 https://github.com/elastic/elasticsearch/pull/120869
2025-04-02 12:36:49 +03:00
Johannes Fredén
95cf1450f4
Add getSecret method to ProjectMetadata (#125830)
* Add getSecret method to ProjectState
2025-04-02 08:56:36 +02:00
Dimitris Rempapis
69f388c391
Unmute test and fix for SearchWithRandomDisconnectsIT::testSearchWithRandomDisconnects (#125838)
Provide a fix for a test execution and unmute the test.
2025-04-02 09:37:25 +03:00
Keith Massey
bb762107b6
Preventing ConcurrentModificationException when updating settings for more than one index (#126077) 2025-04-01 17:10:08 -05:00
Oleksandr Kolomiiets
f3ccde6959
Use FallbackSyntheticSourceBlockLoader for point and geo_point (#125816) 2025-04-01 12:55:18 -07:00