With #123610 we disabled parallel collection for field and script sorted top hits,
aligning its behaviour with that of top level search. This was mainly to work around
a bug in script sorting that did not support inter-segment concurrency.
The bug with script sort has been fixed with #123757 and concurrency re-enabled for it.
While sort by field is not optimized for search concurrency, top hits benefits from it
and disabling concurrency for sort by field in top hits has caused performance
regressions in our nightly benchmarks.
This commit re-enables concurrency for top hits when sort by field is used. This
reintroduces a discrepancy between top level search and top hits, in that concurrency
is applied for top hits even though sort by field normally disables it. The key difference
is the context where sorting is applied, and the fact that concurrency is disabled
only for performance reasons on top level searches and not for functional reasons.
Optimize calculating the usage of ILM policies in the `GET _ilm/policy` and `GET _ilm/policy/<policy_id>` endpoints by extracting a separate class that pre-computes some parts on initialization (i.e. only once per request) and then uses those pre-computed parts when calculating the usage for an individual policy. By precomputing all the usages, the class makes a tradeoff: it uses a little more memory to significantly improve the overall processing time.
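As an illustration of the precompute-once idea, here is a minimal sketch. The class name `PolicyUsageSketch` and its input (a map from index name to policy name) are assumptions made for this example and are not the actual classes or data structures touched by this change.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the real Elasticsearch class.
final class PolicyUsageSketch {
    // Built once per request: policy name -> names of the indices that use it.
    private final Map<String, List<String>> indicesByPolicy = new HashMap<>();

    // Assumed input: index name -> name of the ILM policy it is configured with.
    PolicyUsageSketch(Map<String, String> policyByIndex) {
        for (Map.Entry<String, String> entry : policyByIndex.entrySet()) {
            indicesByPolicy.computeIfAbsent(entry.getValue(), k -> new ArrayList<>()).add(entry.getKey());
        }
    }

    // A single map lookup per policy instead of a scan over all indices.
    List<String> indicesUsing(String policyName) {
        return indicesByPolicy.getOrDefault(policyName, List.of());
    }
}
```

The extra map is the memory cost mentioned above; in exchange, each per-policy lookup no longer has to walk every index.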
Now that all actions that DLM depends on are project-aware, we can make DLM itself project-aware.
There still exists only one instance of `DataStreamLifecycleService`; it just loops over all the projects, which matches the approach we've taken for similar scenarios thus far.
We have some tolerance around how many bytes we report for these completion fields. But the
values depend on the distribution of the random values that determine how many docs get
an option field. This commit makes the test more precise by computing the real ratio
between docs that have the optional field and the total number of docs, so that we
can base the assertion on more realistic expectations.
Closes #123269
Fixes an issue where indexing throttling kicks in while disk IO is throttling.
Instead disk IO should first unthrottle, and only then, if we still can't keep up with the merging load, start throttling indexing.
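The intended ordering can be sketched as a small decision function. The names `MergeThrottleSketch`, `Action`, `mergesBehind` and `diskIoThrottled` are hypothetical and only illustrate the ordering described above; they are not the actual implementation.

```java
// Hypothetical sketch of the ordering only; not the real throttling code.
final class MergeThrottleSketch {
    enum Action { NONE, UNTHROTTLE_DISK_IO, THROTTLE_INDEXING }

    // mergesBehind: merging cannot keep up with the load (assumed signal).
    // diskIoThrottled: disk IO for merges is currently rate limited (assumed signal).
    static Action next(boolean mergesBehind, boolean diskIoThrottled) {
        if (mergesBehind == false) {
            return Action.NONE;
        }
        // Give merges full disk IO first; only throttle indexing if that is not enough.
        return diskIoThrottled ? Action.UNTHROTTLE_DISK_IO : Action.THROTTLE_INDEXING;
    }
}
```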
Fixes elastic/elasticsearch-benchmarks#2437
Relates #120869
This change moves the query phase to a single roundtrip per node, just like can_match or field_caps already work.
As a result of executing multiple shard queries from a single request we can also partially reduce each node's query results on the data node side before responding to the coordinating node.
As a result this change significantly reduces the impact of network latencies on the end-to-end query performance, reduces the amount of work done (memory and cpu) on the coordinating node and the network traffic by factors of up to the number of shards per data node!
Benchmarking shows up to orders of magnitude improvements in heap and network traffic dimensions in querying across a larger number of shards.
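To illustrate why the number of roundtrips drops, here is a minimal sketch of grouping shard targets by node so that one request per node can carry all of that node's shards. `PerNodeBatchSketch` and its shard-to-node input map are assumptions for this example, not the transport code in this change.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; the real change works on search shard iterators and transport requests.
final class PerNodeBatchSketch {
    // Assumed input: shard id -> id of the node that will execute the query for it.
    static Map<String, List<Integer>> groupByNode(Map<Integer, String> nodeByShard) {
        Map<String, List<Integer>> shardsByNode = new TreeMap<>();
        for (Map.Entry<Integer, String> entry : nodeByShard.entrySet()) {
            shardsByNode.computeIfAbsent(entry.getValue(), k -> new ArrayList<>()).add(entry.getKey());
        }
        // One entry per node: one request out and one partially reduced response back.
        return shardsByNode;
    }
}
```

With three shards on two nodes this yields two requests instead of three; the saving grows with the number of shards per data node.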
This patch builds on the work in #113757, #122999, #124594, #125529, and
#125709 to natively store array offsets for scaled float fields instead of
falling back to ignored source when synthetic_source_keep: arrays.
Currently, the Lucene90DocValuesProducer uses optimized IntObjectHashMaps
to track various entries for each field, while the
ES87TSDBDocValuesProducer uses regular HashMap<String, Object>. This patch
updates the ES87TSDBDocValuesProducer class to also use the optimized
hash maps.
This adds cluster settings to allow for a choice of write load metrics
in the data stream auto-sharding calculations. There are separate
settings for the increasing and decreasing calculations. Both default
to the existing 'all-time' metric for now.
This also refactors `DataStreamAutoShardingServiceTests`. The main two things done are:
- Split large test methods which do several independent tests in
blank code blocks into more, smaller methods.
- Fix an unnecessarily complicated pattern where the code would
create a `Function` in a local variable and then immediately
`apply` it exactly once... rather than just executing the code
normally.
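The second pattern looks roughly like the following before/after sketch. The class `ApplyOnceSketch` and the doubling logic are made up for illustration and are not taken from `DataStreamAutoShardingServiceTests`.

```java
import java.util.function.Function;

// Hypothetical illustration of the refactored pattern.
final class ApplyOnceSketch {
    // Before: a Function is stored in a local variable and applied exactly once.
    static int before(int shards) {
        Function<Integer, Integer> compute = n -> n * 2;
        return compute.apply(shards);
    }

    // After: the same code, executed directly.
    static int after(int shards) {
        return shards * 2;
    }
}
```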
Appends the FailedShardEntry request to the 'shard-failed'
task source string in ShardFailedTransportHandler.messageReceived().
This information will now be available in the 'source' string for
shard failed task entries in the Cluster Pending Tasks API response.
This source string change matches what is done in the
ShardStartedTransportHandler.
Closes #102606.
CCS is currently not supported for failure store backing indices.
This PR adjusts the selector parsing (introduced in #118614) to prevent
using `::failures` and `::data` selectors with cross-cluster expressions.
For example, `GET my_remote_cluster:logs-*::failures/_search` request
will fail early, during expression parsing.
To test manually, run `./gradlew run-ccs` and execute the example request.
We have seen occasional failures with the existing tolerance in
`testEwmr_threadSafe`, which contains some randomness. We were
asserting the result of a lot of floating-point operations with a tolerance of
1.0e-13. The highest error I saw in any of the reported failures was less than
1.2e-13. This PR increases the tolerance to 2.0e-13 which should allow
all those to pass.
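For context, here is a minimal sketch of why results of many floating-point operations are compared with a tolerance rather than for exact equality. `ToleranceSketch` and its helpers are illustrative only and are not the test code changed here.

```java
// Hypothetical sketch; the real test asserts on the EWMR calculation, not on this sum.
final class ToleranceSketch {
    // True if actual is within the absolute tolerance of expected.
    static boolean closeTo(double actual, double expected, double tolerance) {
        return Math.abs(actual - expected) <= tolerance;
    }

    // Adds 0.1 repeatedly; each addition can introduce a small rounding error.
    static double sumOfTenths(int count) {
        double sum = 0.0;
        for (int i = 0; i < count; i++) {
            sum += 0.1;
        }
        return sum;
    }
}
```

An error of about 1.2e-13 fails a check with tolerance 1.0e-13 but passes with 2.0e-13.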
Fixes #124692
This tracks the highest value seen for the recent write load metric
any time the stats for a shard are computed, exposes this value
alongside the recent value, and persists it in index metadata
alongside it too.
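Conceptually, the peak is a running maximum that is updated each time the recent value is observed. The following is a minimal sketch under that assumption; `PeakWriteLoadSketch` is a hypothetical name and the real tracking lives in the shard stats code.

```java
// Hypothetical sketch of running-maximum tracking; not the actual implementation.
final class PeakWriteLoadSketch {
    private double peak = 0.0;

    // Called whenever stats are computed, with the current recent write load.
    void onStatsComputed(double recentWriteLoad) {
        peak = Math.max(peak, recentWriteLoad);
    }

    // The highest recent write load seen so far.
    double peak() {
        return peak;
    }
}
```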
The new test in `IndexShardTests` is designed to more thoroughly test
the recent write load metric previously added, as well as to test the
peak metric being added here.
ES-10037 #comment Added peak load metric in https://github.com/elastic/elasticsearch/pull/125521
This PR adds project-id to both SnapshotsInProgress and Snapshot so that
they are aware of projects and ready to handle snapshots from multiple
projects.
Relates: ES-10224
This patch builds on the work in #113757, #122999, #124594, and #125529 to
natively store array offsets for unsigned long fields instead of falling
back to ignored source when synthetic_source_keep: arrays.
This allows a `rescore_vector: {oversample: 0}` to indicate bypassing
oversampling and rescoring.
This is useful for:
- Updating a quantized mapping to turn off automatic rescoring
- Bypassing oversampling at query time in an ad-hoc manner if it's on by default in the mapping
closes: https://github.com/elastic/elasticsearch/issues/125157
Load field caps from store if they haven't been initialised through a refresh yet.
Keep the plain reads to not mess with performance characteristics too much on the good path, but protect against confusing races when loading field infos now (that probably should have been ordered stores in the first place, but this was safe due to other locks/volatiles on the refresh path).
Closes #125483
* Specify index component when retrieving lifecycle
* Add getters for the failure lifecycle
* Conceptually introduce the failure store lifecycle (even though for now it's the same)
We often call `addTemporaryStateListener` with the `ClusterService` of a
random node, or the currently elected master. This commit adds utilities
for this common pattern.
Adds the `original_types` to the description of ESQL's `unsupported`
fields. This looks like:
```
{
"name" : "a",
"type" : "unsupported",
"original_types" : [
"long",
"text"
]
}
```
for union types. And like:
```
{
"name" : "a",
"type" : "unsupported",
"original_types" : [
"date_range"
]
}
```
for truly unsupported types.
This information is useful for the UI. For union types it can suggest
that users append a cast.
This patch builds on the work in #113757, #122999, and #124594 to natively
store array offsets for boolean fields instead of falling back to ignored
source when `synthetic_source_keep: arrays`.
Today a shard's engine mutations are guarded by an engineMutex object monitor. But we would like to be able to execute one or more operations on an engine instance without this instance being reset during the execution of the operation.
In order to do that, this change replaces the engineMutex with a reentrant read/write lock and introduces two new methods, IndexShard#withEngine() and IndexShard#withEngineOrNull(), that can be used to execute an operation while preventing the current engine instance from being reset. It does not prevent the engine from being closed during execution though.
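The locking idea can be sketched as follows. This is a simplified illustration, not the `IndexShard` code: the class `EngineHolderSketch`, the generic engine type, the `resetEngine` method and the behaviour when no engine is present are all assumptions made for this example.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

// Hypothetical sketch: operations hold the read lock, a reset takes the write lock.
final class EngineHolderSketch<E> {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private E engine;

    EngineHolderSketch(E engine) {
        this.engine = engine;
    }

    // Runs the operation while the engine reference cannot be swapped.
    // Assumption: fails if there is currently no engine.
    <R> R withEngine(Function<E, R> operation) {
        lock.readLock().lock();
        try {
            if (engine == null) {
                throw new IllegalStateException("no engine");
            }
            return operation.apply(engine);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Same as withEngine, but passes null to the operation if there is no engine.
    <R> R withEngineOrNull(Function<E, R> operation) {
        lock.readLock().lock();
        try {
            return operation.apply(engine);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Swapping the engine waits until no operation holds the read lock.
    void resetEngine(E newEngine) {
        lock.writeLock().lock();
        try {
            engine = newEngine;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Several operations can hold the read lock concurrently, so the lock only serializes them against a reset, not against each other.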
Relates ES-10826
Co-authored-by: Francisco Fernández Castaño <francisco.fernandez.castano@gmail.com>
This PR updates the ES|QL grammar to include the selector portion of an index pattern. Patterns
are recombined before being sent to field caps. Field caps already supports this functionality, so
this is primarily wiring it up where needed.