Some optimizations broke the automatic addition of collapse fields to
`docvalue_fields` during the fetch phase. Consequently, users could hit
confusing errors such as `unsupported_operation_exception`. This commit
restores the intended
behavior of automatically including the collapse field in the
docvalue_fields context during fetch if it isn't already included.
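A minimal sketch of the restored behavior, using plain Java collections rather than the actual fetch-phase classes (`withCollapseField` and its parameters are hypothetical names):
```java
import java.util.ArrayList;
import java.util.List;

class CollapseDocValueFields {

    /** Returns the docvalue_fields to fetch, adding the collapse field only if it is missing. */
    static List<String> withCollapseField(List<String> requestedDocValueFields, String collapseField) {
        List<String> fields = new ArrayList<>(requestedDocValueFields);
        if (fields.contains(collapseField) == false) {
            fields.add(collapseField);
        }
        return fields;
    }
}
```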
closes: https://github.com/elastic/elasticsearch/issues/96510
Let’s say we have a `my-metrics` data stream that is receiving a lot of
indexing requests. The following scenario can result in multiple
unnecessary rollovers:
1. We update the mapping and mark the data stream to be lazily rolled over.
2. We receive 5 bulk index requests that all contain a write request for this data stream.
3. Each of these requests is picked up “at the same time”; each sees that the data stream needs to be rolled over and issues a lazy rollover request.
4. As a result, data stream my-metrics now has 5 tasks executing an unconditional rollover.
5. The data stream gets rolled over 5 times instead of once.
This scenario is captured in `LazyRolloverDuringDisruptionIT`.
We have also witnessed this in the wild, where a data stream was rolled
over an extra 281 times, resulting in 281 empty indices.
This PR proposes:
- Creating a new task queue with a more efficient executor that further batches/deduplicates the requests.
- Adding two safeguards: the first ensures we do not enqueue the rollover task if we see that a rollover has already occurred; the second applies during task execution, where we skip the rollover if the data stream no longer has the `rolloverOnWrite` flag set to `true` (see the sketch after the response example below).
- When we skip the rollover, we return the following response:
```
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "old_index": ".ds-my-data-stream-2099.05.07-000002",
  "new_index": ".ds-my-data-stream-2099.05.07-000002",
  "rolled_over": false,
  "dry_run": false,
  "lazy": false
}
```
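A rough sketch of the execution-time safeguard mentioned above, using hypothetical names; the real implementation builds on a dedicated master-service task queue and cluster state, which are omitted here:
```java
import java.util.List;

class LazyRolloverBatchExecutor {

    record DataStream(String name, boolean rolloverOnWrite) {}

    interface LazyRolloverTask {
        void onRolledOver();   // answer with rolled_over=true and the new write index
        void onSkipped();      // answer with rolled_over=false, reusing the current write index
    }

    /** Executes a whole batch of deduplicated lazy rollover tasks with at most one rollover. */
    void executeBatch(DataStream dataStream, List<LazyRolloverTask> tasks) {
        if (dataStream.rolloverOnWrite() == false) {
            // Safeguard: an earlier batch already rolled the data stream over and cleared the flag.
            tasks.forEach(LazyRolloverTask::onSkipped);
            return;
        }
        performUnconditionalRollover(dataStream);   // happens once for the whole batch
        tasks.forEach(LazyRolloverTask::onRolledOver);
    }

    private void performUnconditionalRollover(DataStream dataStream) {
        // Placeholder for the actual rollover of the data stream's write index.
    }
}
```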
* Fix `TasksIT#testTasksCancellation` (#109929)
The tasks are removed from the task manager _after_ sending the
response, so we cannot reliably assert they're done. With this commit we
wait for them to complete properly first.
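A sketch of the kind of wait the test now performs before asserting, assuming a hypothetical `runningTaskCount` supplier backed by the task manager:
```java
import java.time.Duration;
import java.util.function.Supplier;

class TaskTestUtils {

    /** Polls until no tasks are reported anymore, failing the test if the timeout is exceeded. */
    static void awaitNoRunningTasks(Supplier<Integer> runningTaskCount, Duration timeout) throws InterruptedException {
        long deadlineNanos = System.nanoTime() + timeout.toNanos();
        while (runningTaskCount.get() > 0) {
            if (System.nanoTime() > deadlineNanos) {
                throw new AssertionError("tasks did not complete within " + timeout);
            }
            Thread.sleep(10);
        }
    }
}
```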
Closes #109686
* Introduce safeGet
When accessing array elements from a script, if the backing array has enough items, meaning that
there has previously been a doc with enough values, we let the request go through, and we end up
returning items from that previous doc for positions where the current doc does not have enough
elements.
We should instead validate the number of values for the current doc and throw an error
if the index goes beyond the available number of values.
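A sketch of the intended check; `count` is a hypothetical stand-in for the number of values the current document actually has, which can be smaller than the shared backing array:
```java
class DocValuesAccess {

    /** Returns the value at {@code index} for the current doc, or throws if the doc has fewer values. */
    static long safeGet(long[] backingArray, int count, int index) {
        if (index >= count) {
            throw new IllegalArgumentException(
                "attempted to access index [" + index + "] but the current document only has [" + count + "] values"
            );
        }
        return backingArray[index];
    }
}
```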
Closes #104998
Handle the "output memory allocator bytes" field if and only if it is present in the model size stats, as reported by the C++ backend.
This PR must be merged prior to the corresponding ml-cpp one, to keep CI tests happy.
Backports #109653
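A rough sketch of the "only if present" handling described above, with hypothetical names; the snake_case field name and the map-based parsing are stand-ins for the actual model size stats parser:
```java
import java.util.Map;

class ModelSizeStatsSketch {

    private final Long outputMemoryAllocatorBytes;   // null when the C++ backend did not report it

    ModelSizeStatsSketch(Map<String, Object> parsedStats) {
        Object value = parsedStats.get("output_memory_allocator_bytes");
        this.outputMemoryAllocatorBytes = value == null ? null : ((Number) value).longValue();
    }

    /** Writes the field only when it was present in the parsed stats. */
    void writeTo(Map<String, Object> out) {
        if (outputMemoryAllocatorBytes != null) {
            out.put("output_memory_allocator_bytes", outputMemoryAllocatorBytes);
        }
    }
}
```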
Currently, we do not register task cancellations for exchange requests,
which leads to a long delay in failing the main request when a data-node
request is rejected.
* Guard file settings readiness on file settings support (#109500)
Consistency of file settings is an important invariant. However, when
upgrading from Elasticsearch versions before file settings existed,
cluster state will not yet have the file settings metadata. If the first
node upgraded is not the master node, new nodes will never become ready
while they wait for file settings metadata to exist.
This commit adds a node feature for file settings to guard waiting on
file settings for readiness. Although file settings has existed since
8.4, the feature is not a historical feature because historical features
are not applied to the cluster state that the readiness check inspects. In this
case a historical feature is not needed anyway, since clusters upgraded from
8.4 onwards will already contain file settings metadata.
* fix test
* iter
* Revert "fix test"
This reverts commit 570e16a788.
* cleanup
* remove test from 8.15
* spotless
* add hexstring support byte painless scorers (#109492)
Hexadecimal strings are supported for index input and for kNN queries. We should also support them for byte vectors in Painless.
This commit addresses this for our common scoring functions.
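A sketch of the idea using Java 17's `HexFormat`; the method names are stand-ins, not the actual Painless scoring functions:
```java
import java.util.HexFormat;

class HexByteVectors {

    /** Accepts either a byte[] or a hexadecimal string and returns the byte vector. */
    static byte[] toByteVector(Object value) {
        if (value instanceof String hex) {
            return HexFormat.of().parseHex(hex);   // e.g. "7f0a" -> new byte[] {127, 10}
        }
        return (byte[]) value;
    }

    /** Stand-in scoring function; assumes both vectors have the same length. */
    static int dotProduct(Object queryVector, byte[] docVector) {
        byte[] query = toByteVector(queryVector);
        int result = 0;
        for (int i = 0; i < query.length; i++) {
            result += query[i] * docVector[i];
        }
        return result;
    }
}
```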
closes: #109412
* adjust bwc test version
Currently, when upgrading a 7.x cluster to 8.x with the
`index.mapper.dynamic` index setting defined, the following happens:
- In the case of a full cluster restart upgrade, the index setting gets archived and after the upgrade the cluster has green health.
- In the case of a rolling cluster restart upgrade, shards of indices with the index setting fail to allocate as nodes start on the 8.x version. The result is that the cluster has red health and the index setting isn't archived. Closing and opening the index should archive the index setting and allocate the shards.
The change is about ensuring the same behavior happens when upgrading a
cluster from 7.x to 8.x with indices that have the
`index.mapper.dynamic` index setting defined. By re-defining the
`index.mapper.dynamic` index setting with the
`IndexSettingDeprecatedInV7AndRemovedInV8` property, the setting is
allowed to exist on 7.x indices, but can't be defined on new indices
after the upgrade. This way we don't have to rely on setting archiving, and
upgrading via full cluster restart or rolling restart will yield the
same outcome.
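A rough sketch of the registration, assuming Elasticsearch's `Setting` API; the exact declaration site, default value, and full property set in the actual change may differ:
```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

public final class DynamicMappingSettingSketch {

    // Readable on indices created in 7.x, but rejected when creating new indices on 8.x.
    public static final Setting<Boolean> INDEX_MAPPER_DYNAMIC = Setting.boolSetting(
        "index.mapper.dynamic",
        true,
        Property.IndexScope,
        Property.IndexSettingDeprecatedInV7AndRemovedInV8
    );

    private DynamicMappingSettingSketch() {}
}
```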
Based on the test in #109301. Relates to #109160 and #96075
This fixes task cancellation actions (i.e. internal:admin/tasks/cancel_child and internal:admin/tasks/ban) not being authorized by the fulfilling cluster. This can result in orphaned tasks on the fulfilling cluster.
Backport of #109357
The new subcommand `elasticsearch-node remove-index-settings` can be used
to remove index settings from the cluster state in cases where it
contains incompatible index settings that prevent the cluster from
forming. This tool can cause data loss and its use should be your last
resort.
Relates #96075
ExceptionHelper#useAndSuppress can throw an exception if both input
exceptions have the same root cause. If this happens, the field-caps
request dispatcher might fail to notify the caller of completion. I
found this while running ES|QL with disruptions.
Relates #107347
`ExpandSearchPhase` was leaking `SearchHits` when a pooled `SearchHits`
that was read from the wire was added to an unpooled `SearchHit`.
This commit makes the relevant `SearchHit` instances pooled, so that
they release their nested hits. This requires a couple of smaller
adjustments in the codebase, mainly around error handling.
* Handle must_not clauses when disabling the weight matches highlighting mode (#108453)
This change makes sure we check all queries, even the must_not ones, to decide if we should disable weight matches highlighting or not.
Closes #101667, closes #106693
* adapt test skip version
Currently, loading ordinals multiple times (after advanceExact) for
documents with values spread across multiple blocks in the TSDB codec
will fail due to the absence of re-seeking for the ordinals block.
The doc values of a document can spread across multiple blocks in two cases:
when the document has more than 128 values or when its values exceed the
remaining space in the current block.
We disable inter-segment concurrency in the query phase whenever profiling is on, because there
are known concurrency issues that need fixing. The way we disable concurrency is by creating a single
slice that the search will execute against. We still offload the execution to the search workers thread pool.
However, inter-segment concurrency in Lucene is not always based on slices. The knn query (as well as terms enum loading
and other places) parallelizes across all segments, independently of the slices that group multiple segments together.
That behavior is not easy to disable unless you don't set an executor on the searcher at all, which would
entirely disable using the separate executor for potentially heavy CPU/IO-bound loads and is not desirable.
That means that a knn query will execute in parallel (in DFS as well as in the query phase)
even when inter-segment concurrency has been disabled because profiling is on. When using pre-filtering,
there are queries, like multi-term queries, that will call createWeight for each segment, in parallel, when
pulling the scorer. That causes non-deterministic behavior, as the profiler does not support concurrent access
to some of its data structures.
This commit protects the profiler from concurrent access to its data structures by synchronizing access to its tree.
Performance is not a concern here, as the profiler is already known to slow down query execution.
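A minimal sketch of the synchronization approach, with a simplified stand-in for the profiler tree rather than the actual profiler classes:
```java
import java.util.ArrayList;
import java.util.List;

class SynchronizedProfileTree {

    private final List<String> profiledQueries = new ArrayList<>();

    /** Called when a segment thread starts profiling a query (e.g. from createWeight). */
    synchronized void startProfiling(String queryDescription) {
        profiledQueries.add(queryDescription);
    }

    /** Returns a snapshot of the collected profile entries. */
    synchronized List<String> results() {
        return List.copyOf(profiledQueries);
    }
}
```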
Closes #104235, closes #104131
* ESQL: Fix MV_DEDUPE when using data from an index (#107577)
Correctly label numerical/boolean blocks loaded from indices, so that MV_DEDUPE works correctly.
(cherry picked from commit 70cfe6f016)
# Conflicts:
# server/src/main/java/org/elasticsearch/TransportVersions.java
# x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/plugin/EsqlFeatures.java
* Do not require MvOrdering.SORTED_ASCENDING
Adapt this fix so that we do not have to introduce SORTED_ASCENDING, and
thus do not have to bump the transport version for this to work.
Boolean/numerical doc values are just labelled as UNORDERED, instead,
which is still correct.
We want to validate stats formatting before we serialize to XContent, as chunked x-content serialization
assumes that we don't throw exceptions at that point. It is not necessary to do this in the StreamInput constructor,
as it reads an object that was serialized from an already validated one.
This commit adds stats formatting validation to the standard InternalStats constructor.
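A sketch of the fail-fast idea, with `DecimalFormat` as a hypothetical stand-in for the stats' doc value format:
```java
import java.text.DecimalFormat;

class FormattedStatsSketch {

    private final double sum;
    private final DecimalFormat format;

    FormattedStatsSketch(double sum, DecimalFormat format) {
        this.sum = sum;
        this.format = format;
        // Validate eagerly: fail here rather than while the chunked response is being written out.
        format.format(sum);
    }

    String formattedSum() {
        return format.format(sum);
    }
}
```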
Similar to other cases, the addition of a new merge policy that reverses the order of the documents in Lucene causes
this test to fail in edge cases. To avoid that randomisation we hardcode the merge policy to LogDocMergePolicy.
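A sketch of hardcoding the merge policy with plain Lucene APIs (not the exact test setup):
```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LogDocMergePolicy;

class FixedMergePolicyConfig {

    static IndexWriterConfig newConfig() {
        IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        // LogDocMergePolicy merges adjacent segments, so it never reorders documents.
        config.setMergePolicy(new LogDocMergePolicy());
        return config;
    }
}
```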
Today, we have disabled ccs_minimized_round_trips for lookup requests,
under the assumption that cross-cluster lookups only occur when
ccs_minimized_round_trips is disabled in the main search request.
However, this assumption does not hold true for cases where the search
is local but the lookup happens remotely.
This PR fixes a bug in the bulk operation when retrying blocked cluster states before
executing a failure store write by correctly wrapping the retry runnable to keep it from
prematurely returning a null response.
This page was split up in #104614, but the `ReferenceDocs` symbol still
links to the top-level page rather than the correct subpage. This fixes
the link.
During the fetch phase, there are a number of stored fields that are requested explicitly or loaded by default. That information is included in the `StoredFieldsSpec` that each fetch sub phase exposes.
We attempt to provide stored fields that are already loaded to the fields lookup that scripts as well as value fetchers use to load field values (via `SearchLookup`). This is done in `PreloadedFieldLookupProvider`. The current logic makes values available for fields that have been found, so that scripts or value fetchers that request them don't load them again ad-hoc. Stored fields that don't have a value for a specific doc, however, are treated like any other field that was not requested, and are loaded again even though they will not be found, which causes overhead.
This change makes available to `PreloadedFieldLookupProvider` the list of required stored fields, so that it can better distinguish between fields that we already attempted to load (although we may not have found a value for them) and those that need to be loaded ad-hoc (for instance because a script is requesting them for the first time).
This is an existing issue that has become evident as we moved fetching of metadata fields to `FetchFieldsPhase`, which relies on value fetchers, and hence on `SearchLookup`. We end up attempting to load default metadata fields (`_ignored` and `_routing`) twice when they are not present in a document, which makes us call `LeafReader#storedFields` additional times for the same document, providing a `SingleFieldVisitor` that will never find a value.
Another existing issue that this PR fixes is for the `FetchFieldsPhase` to extend the `StoredFieldsSpec` that it exposes to include the metadata fields that the phase is now responsible for loading. That results in `_ignored` being included in the output of the debug stored fields section when profiling is enabled. The fact that it was previously missing is an existing bug (it was missing in `StoredFieldLoader#fieldsToLoad`).
Yet another existing issue that this PR fixes is that `_id` has until now always been loaded on demand when requested via fetch fields or a script. That is because it is not part of the preloaded stored fields that the fetch phase passes over to the `PreloadedFieldLookupProvider`. That causes overhead, as the field has already been loaded and should not be loaded once again when explicitly requested.
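A conceptual sketch, with hypothetical names, of how knowing the full set of requested stored fields avoids the redundant loads described above:
```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

class PreloadedFieldLookupSketch {

    private final Map<String, List<Object>> preloadedValues;   // fields that were found for this doc
    private final Set<String> requestedStoredFields;           // every stored field we already tried to load

    PreloadedFieldLookupSketch(Map<String, List<Object>> preloadedValues, Set<String> requestedStoredFields) {
        this.preloadedValues = preloadedValues;
        this.requestedStoredFields = requestedStoredFields;
    }

    List<Object> lookup(String field, Function<String, List<Object>> adHocLoader) {
        List<Object> values = preloadedValues.get(field);
        if (values != null) {
            return values;                          // already loaded and found
        }
        if (requestedStoredFields.contains(field)) {
            return List.of();                       // already attempted: the doc simply has no value
        }
        return adHocLoader.apply(field);            // never requested before: load it ad hoc
    }
}
```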
I have a couple of heap dumps that show the lock wrapper alone wastes O(10M)
of heap for these things. Also, I suspect the indirection costs
non-trivial performance here in some cases. => let's spend a couple more
lines of code to save that overhead