This commit adds support for system data streams reindexing. The system data stream migration extends the existing system indices migration task and uses the data stream reindex API.
The system index migration task starts a reindex data stream task and tracks its status every second. Only one system index or system data stream is migrated at a time. If a data stream migration fails, the entire system index migration task will also fail.
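The tracking loop described above can be sketched as follows. This is a hypothetical illustration (class and method names are invented, not the actual migration task code): poll the reindex task status on an interval, stop on completion, and fail the whole migration if the data stream migration fails.

```java
import java.util.function.IntFunction;

// Illustrative sketch of the migration tracking loop; not the real task code.
class MigrationLoopSketch {
    enum Status { RUNNING, COMPLETE, FAILED }

    // Polls statusAt(attempt) until a terminal state; returns the number of
    // polls made. The real task sleeps ~1s between polls; omitted for brevity.
    static int pollsUntilTerminal(IntFunction<Status> statusAt, int maxPolls) {
        for (int i = 1; i <= maxPolls; i++) {
            Status s = statusAt.apply(i);
            if (s != Status.RUNNING) return i;  // COMPLETE or FAILED ends polling
        }
        return maxPolls;
    }
}
```

Because only one system resource migrates at a time, a FAILED status here would fail the parent migration task rather than move on to the next data stream.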
* Fix concurrency issue in ScriptSortBuilder (#123757)
Inter-segment concurrency is disabled whenever sorting by field, including script sorting, is used in a search request.
Sort by field does not use concurrency because of performance implications: the hit queue in Lucene is built per slice, and the different search threads don't share information about the documents they have already visited.
Script sort has concurrency disabled because the script sorting implementation is not thread-safe. This commit addresses this concurrency issue and re-enables search concurrency for search requests that use script sorting. In addition, missing tests are added to cover sort scripts that rely on _score being available and top_hits aggregations with a scripted sort clause.
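The thread-safety hazard can be illustrated with a minimal sketch (invented names, not the actual Lucene/Elasticsearch classes): an evaluator with mutable scratch state is unsafe when shared across slices, and the fix is to hand each slice its own instance.

```java
import java.util.function.Supplier;

// Illustrative sketch of the concurrency hazard and its fix; not the real code.
class ScriptSortSketch {
    // Stateful evaluator: holds scratch state, so it must not be shared
    // between search threads working on different slices.
    static class Evaluator {
        double scratch;
        double evaluate(double input) { scratch = input * 2; return scratch; }
    }

    // Unsafe pattern: every slice reuses the same mutable evaluator instance.
    static final Evaluator SHARED = new Evaluator();

    // Safe pattern: each slice asks the factory for its own evaluator.
    static Supplier<Evaluator> perSliceFactory() { return Evaluator::new; }
}
```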
* iter
* ESQL: Lazy collection copying during node transform
A set of optimizations for tree traversal:
1. perform lazy copying during children transform
2. use long hashing to avoid object creation
3. perform type check first before collection checking
Relates #124395
Read dimension values once per tsid/bucket docid range instead of for each document being processed.
The dimension value within a bucket-interval docid range is always the same, so this avoids unnecessary reads.
Latency of downsampling the tsdb track index into a 1-hour-interval downsample index dropped by ~16% (running on my local machine).
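The read-once-per-range idea can be sketched as a small cache (hypothetical names, not the downsampling code itself): re-read the dimension only when the tsid/bucket range changes, and reuse the cached value for every other document in the range.

```java
import java.util.function.IntFunction;

// Illustrative sketch: within one tsid/bucket docid range the dimension value
// is constant, so read it once and reuse it. Not the actual downsampling code.
class DimensionCache {
    private int cachedRangeId = -1;
    private String cachedValue;
    private int reads = 0;

    String valueFor(int rangeId, IntFunction<String> readDimension) {
        if (rangeId != cachedRangeId) {      // new tsid/bucket range: one real read
            cachedValue = readDimension.apply(rangeId);
            cachedRangeId = rangeId;
            reads++;
        }
        return cachedValue;                  // all other docs reuse the cached value
    }

    int reads() { return reads; }
}
```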
* Retry ILM async action after reindexing data stream (#124149)
When reindexing a data stream, the ILM metadata is copied from the index metadata of the source index to the destination index. But the ILM state of the new index can get stuck if the source index was in an AsyncAction at the time of reindexing. To un-stick the new index, we call TransportRetryAction to retry the AsyncAction. In the past, this action would only run if the index was in the error phase. This change includes an update to TransportRetryAction that allows it to run when the index is not in an error phase, if the parameter requireError is set to false.
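The guard change can be boiled down to a one-line condition. This is an illustrative sketch only (the class and method names are invented, not the actual TransportRetryAction code):

```java
// Sketch of the relaxed retry guard; names are illustrative.
class RetryGuard {
    // Before: retry was only allowed when the index was in the error phase.
    // After: callers may pass requireError=false to retry from any phase.
    static boolean mayRetry(boolean indexInErrorPhase, boolean requireError) {
        return indexInErrorPhase || !requireError;
    }
}
```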
(cherry picked from commit 10a8dcf0fb)
# Conflicts:
# server/src/main/java/org/elasticsearch/TransportVersions.java
# x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/security/user/InternalUsers.java
# x-pack/plugin/ilm/src/main/java/org/elasticsearch/xpack/ilm/action/TransportRetryAction.java
# x-pack/qa/rolling-upgrade/src/test/java/org/elasticsearch/upgrades/DataStreamsUpgradeIT.java
* index mode cannot be set on v7 indices
On index creation, it's possible to configure a hunspell analyzer that
references a locale file that doesn't exist or isn't accessible.
This error, like our other user dictionary errors, should be an IAE, not
an ISE.
closes: https://github.com/elastic/elasticsearch/issues/123729
(cherry picked from commit a92b1d6892)
* Fix Gradle deprecation warning: declaring an is- property with a Boolean type has been deprecated.
* Make use of the new layout.settingsFolder API to address some cross-project references
* Fix buildParams snapshot check for multi-project builds
(cherry picked from commit e19b2264af)
# Conflicts:
# build-tools-internal/gradle/wrapper/gradle-wrapper.properties
# build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/BaseInternalPluginBuildPlugin.java
# build-tools-internal/src/main/resources/minimumGradleVersion
# docs/build.gradle
# gradle/wrapper/gradle-wrapper.properties
# plugins/examples/gradle/wrapper/gradle-wrapper.properties
# qa/lucene-index-compatibility/build.gradle
# x-pack/qa/multi-project/core-rest-tests-with-multiple-projects/build.gradle
# x-pack/qa/multi-project/xpack-rest-tests-with-multiple-projects/build.gradle
* [ML] Retry on streaming errors (#123076)
We now always retry based on the provider's configured retry logic
rather than the HTTP status code. Some providers (e.g. Cohere,
Anthropic) will return 200 status codes with error bodies, others (e.g.
OpenAI, Azure) will return non-200 status codes with non-streaming
bodies.
Notes:
- Refactored from HttpResult to StreamingHttpResult, the byte body is
now the streaming element while the http response lives outside the
stream.
- Refactored StreamingHttpResultPublisher so that it only pushes byte
body into a queue.
- Tests all now have to wait for the response to be fully consumed
before closing the service; otherwise the close method will shut down
the mock web server and Apache will throw an error.
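The shape of the refactor described in the notes can be sketched roughly like this (invented names, not the actual StreamingHttpResult API): the HTTP status lives outside the stream and is available immediately, while only raw byte chunks of the body flow through a queue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Rough sketch of separating response metadata from the streamed body.
class StreamingResultSketch {
    final int statusCode;                    // available up front, outside the stream
    final BlockingQueue<byte[]> bodyChunks;  // consumed incrementally by the parser

    StreamingResultSketch(int statusCode, int capacity) {
        this.statusCode = statusCode;
        this.bodyChunks = new ArrayBlockingQueue<>(capacity);
    }

    // The publisher side only pushes raw bytes; retry decisions are made by
    // provider-specific logic inspecting the decoded chunks, not the status.
    boolean offer(byte[] chunk) { return bodyChunks.offer(chunk); }
}
```

This matches the behavior described above: a provider returning 200 with an error body is retried based on the decoded chunks, not the status code.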
* [CI] Auto commit changes from spotless
* Use old isSuccess API
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
This PR adjusts the list of supported ciphers to reflect ciphers
available in JDK 24.
JDK 24 [drops](https://bugs.openjdk.org/browse/JDK-8245545) support for
`TLS_RSA` suites. These ciphers will no longer be supported in
Elasticsearch when running a bundled JDK with version >= 24. JDKs of
lower versions will continue to support the dropped ciphers.
I will follow up this PR with a separate docs PR.
* Avoid over collecting in Limit or Lucene Operator (#123296)
Currently, we rely on signal propagation for early termination. For
example, FROM index | LIMIT 10 can be executed by multiple Drivers:
several Drivers to read document IDs and extract fields, and the final
Driver to select at most 10 rows. In this scenario, each Lucene Driver
can independently collect up to 10 rows until the final Driver has
enough rows and signals them to stop collecting. In most cases, this
model works fine, but when extracting fields from indices in the
warm/cold tier, it can impact performance. This change introduces a
Limiter used between LimitOperator and LuceneSourceOperator to avoid
over-collecting. We will also need a follow-up to ensure that we do not
over-collect between multiple stages of query execution.
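A shared limiter of the kind described above can be sketched as a global row budget that each driver draws from before collecting (illustrative only; the real ESQL Limiter may differ):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of a limiter shared between source and limit operators:
// each driver reserves rows from one global budget before collecting, so the
// total collected across all drivers never exceeds the limit.
class SharedLimiter {
    private final AtomicInteger remaining;

    SharedLimiter(int limit) { this.remaining = new AtomicInteger(limit); }

    // Returns how many of the requested rows this driver may collect (0..requested).
    int reserve(int requested) {
        while (true) {
            int r = remaining.get();
            int granted = Math.min(r, requested);
            if (granted == 0) return 0;                       // budget exhausted
            if (remaining.compareAndSet(r, r - granted)) {    // claim atomically
                return granted;
            }
        }
    }
}
```

With `FROM index | LIMIT 10`, a driver that asks for a batch after the budget is spent gets zero and can stop scanning instead of collecting rows that will be discarded.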
* Fix compilation after #123784
* fix compile
* fix compile
* This is a copy from the original ticket https://github.com/elastic/elasticsearch/pull/120171
* Update docs/reference/search/search-your-data/paginate-search-results.asciidoc
---------
Co-authored-by: Kofi B <kofi.bartlett@elastic.co>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
When IndicesService is closed, the pending deletion may still be in
progress due to indices removed before IndicesService gets closed. If
the deletion gets stuck for some reason, it can stall the node shutdown.
This PR aborts the pending deletion more promptly by not retrying after
IndicesService is stopped.
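The fix amounts to re-checking a stopped flag inside the retry loop. A minimal sketch, with invented names rather than the IndicesService code itself:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntPredicate;

// Illustrative sketch: the deletion retry loop re-checks a "stopped" flag so
// pending deletions abort promptly once the service is stopped.
class PendingDeletionLoop {
    private final AtomicBoolean stopped = new AtomicBoolean(false);

    void stop() { stopped.set(true); }

    // Returns the number of attempts made before success or abort.
    int deleteWithRetry(int maxRetries, IntPredicate tryDelete) {
        int attempts = 0;
        while (attempts < maxRetries && !stopped.get()) {  // abort promptly on stop
            attempts++;
            if (tryDelete.test(attempts)) break;           // deletion succeeded
        }
        return attempts;
    }
}
```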
Resolves: #121717
Resolves: #121716
Resolves: #122119
(cherry picked from commit c7e7dbe904)
# Conflicts:
# muted-tests.yml
We already disable inter-segment concurrency in SearchSourceBuilder whenever
the top-level sort provided is not _score. We should apply the same rules
in top_hits. We recently stumbled upon non-deterministic behaviour caused by
script sorting defined within top hits. That is to be expected given that
script sorting does not support search concurrency.
The sort script can be replaced with a runtime field, either defined in the
mapping or in the search request, which does support concurrency and guarantees
predictable behaviour.
Rather than checking the license (updating the usage map) on every
single shard, just do it once at the start of a computation that needs
to forecast write loads.
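Hoisting the check out of the per-shard loop can be sketched like this (hypothetical names, not the actual write-load forecasting code):

```java
import java.util.List;
import java.util.function.BooleanSupplier;

// Illustrative sketch: check the license once per forecast computation
// instead of once per shard.
class ForecastSketch {
    static int licenseChecks = 0;  // instrumentation for the sketch only

    static double forecastWriteLoads(List<Double> shardLoads, BooleanSupplier licensed) {
        licenseChecks++;                          // one check for the whole computation
        if (!licensed.getAsBoolean()) return 0.0;
        double total = 0;
        for (double load : shardLoads) {
            total += load;                        // no per-shard license check here
        }
        return total;
    }
}
```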
Backport of #123346 to 8.x
Closes #123247
The LuceneSourceOperator is supposed to terminate when it reaches the
limit; unfortunately, we don't have a test to cover this. Due to this
bug, we continue scanning all segments, even though we discard the
results once the limit is reached. This can cause performance issues for
simple queries like FROM .. | LIMIT 10, when Lucene indices are on the
warm or cold tier. I will submit a follow-up PR to ensure we only
collect up to the limit across multiple drivers.
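The missing early-termination check can be sketched in a few lines (illustrative only, not the LuceneSourceOperator code): stop visiting segments once the limit is reached, instead of scanning everything and discarding rows.

```java
// Illustrative sketch of terminating the segment scan at the limit.
class SegmentScanSketch {
    // Returns how many segments were actually visited.
    static int scan(int[] segmentDocCounts, int limit) {
        int collected = 0, visited = 0;
        for (int docs : segmentDocCounts) {
            if (collected >= limit) break;   // the early-termination check
            visited++;
            collected += Math.min(docs, limit - collected);
        }
        return visited;
    }
}
```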
These things can be quite expensive and there's no need to recompute
them in parallel across all management threads as done today. This
commit adds a deduplicator to avoid redundant work.
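A deduplicator in this spirit can be sketched as follows. This simplified version memoizes results per key, whereas the real one deduplicates in-flight requests; the class name is invented, not the actual Elasticsearch deduplicator:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Illustrative sketch: concurrent callers asking for the same key share one
// computation instead of repeating the expensive work on every thread.
class Deduplicator<K, V> {
    private final Map<K, V> results = new ConcurrentHashMap<>();
    final AtomicInteger computations = new AtomicInteger();

    V computeOnce(K key, Function<K, V> expensive) {
        return results.computeIfAbsent(key, k -> {
            computations.incrementAndGet();  // real work happens at most once per key
            return expensive.apply(k);
        });
    }
}
```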
Backport of #123246 to `8.x`