We already disable inter-segment concurrency in SearchSourceBuilder whenever
the top-level sort provided is not _score. We shoudl apply the same rules
in top_hits. We recenly stumbled upon non deterministic behaviour caused by
script sorting defined within top hits. That is to be expected given that
script sorting does not support search concurrency.
The sort script can be replaced with a runtime field, either defined in the
mapping or in the search request, which does support concurrency and guarantees
predictable behaviour.
When IndicesService is closed, the pending deletion may still be in
progress due to indices removed before IndicesService gets closed. If
the deletion stucks for some reason, it can stall the node shutdown.
This PR aborts the pending deletion more promptly by not retry after
IndicesService is stopped.
Resolves: #121717Resolves: #121716Resolves: #122119
When #119968 was merged into multi-project we introduced a regression by
inserting a call to `.getProject()` within the `RoutingNodes` class that
was supposed to be multi-project-aware.
This commit replaces those calls with `.indexMetadata` lookups
We're catching MP exceptions when applying cluster state listeners to
avoid noise in the logs, but we shouldn't forget to remove this `catch`
block at some point in the future.
These methods will be removed at some point in the future. By marking
them as deprecated, it'll be easier to spot that some code is using
these methods. Additionally, this will hopefully prevent people from
using these temporary methods.
We can do with a whole lot less in ref-counting, avoiding lots of contention and speeding
up the logic in general by only incrementing ref-counts where ownership is unclear while
avoiding count changes on obvious "moves".
The node weight changes between two balancer rounds are summarized by
saving the old DesiredBalance's weights per node along with a weights
diff to reach the new DesiredBalance's weights per node. This supports
combining multiple summaries by using the oldest summary's base node
weights and summing the diffs across all summaries to reach a combined
node weight diffs.
If all we want to know is whether the cluster has any indices, then
this new method is more efficient than using the existing
"getTotalNumberOfIndices" method
This PR is the follow-up work for MP-1945 and MP-1938 which laid the
foundation of two different scoped persistent tasks. It updates the
persistent task framework to be aware of the two task types so that it
can handle both cluster scope tasks and per-project tasks. Once these
changes are in place, we will make health-node to be the first
cluster-scope persistent task.
Relates: ES-10168
The NamedComponentReader reads a file created upon plugin installation
for stable plugins from the plugin installation dir. This commit passes
the plugins directory through to entitlements and grants server access.
Rather than checking the license (updating the usage map) on every
single shard, just do it once at the start of a computation that needs
to forecast write loads.
Closes#123247
* Consider entitlement lib as system module
Entitlements sometimes needs to perform sensitive operations,
particularly within the FileAccessTree. This commit expands the
trivially allowed check to include entitlements as one of the system
modules alongside the jdk. One consequence is that the self test must be
moved outside entitlements.
* [CI] Auto commit changes from spotless
* remove old method call
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Versions 9.1.0 onwards will not be wire-compatible with versions before
8.19.0. This commit sets the minimum transport version to reject
handshakes from earlier versions.
An old node writes all tasks in the metadata custom. A new old must be
able to read it and separate the cluster-scoped and project-scoped tasks
and store them in the right place. This PR does that.
Relates: MP-1945, MP-1938
* Updating error message when index field type is unknown
* Fix style issue
* Add yaml test for invalid field type error message
* Update docs/changelog/122860.yaml
* Updating error message for runtime and multi field type parser
* add and fix yaml tests
* Fix code styles by running spotlessApply
* Update changelog
* Updatig the test in yml
* Updating error message for runtime
* Fix failing yaml tests
* Update error message to Fix unit tests
* fix serverless qa test
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
When fetching documents, sometimes we need to load the entire source of
search hits. Document sources can be large, and with support for up to
10k hits per search request, this creates a significant untracked
memory load on Elasticsearch that can potentially cause out-of-memory
errors.
This PR adds memory checking for hits source in the fetch phase. We
check with the parent (the real memory) circuit breaker every 1MiB of
loaded source and when fetching the last document of every segment. This
gives the real memory breaker a chance to interrupt running operations
when we're running low on memory, and prevent potential OOMs.
The amount of local accounting to buffer is controlled by the
`search.memory_accounting_buffer_size` dynamic setting and defaults to
`1MiB`.
Fixes#89656
Fixes https://github.com/elastic/elasticsearch/issues/122588
- Replaced `Source.EMPTY.writeTo(out)` to `source().writeTo(out)` in functions emitting warnings
- Did the same on all aggs, as Top emits an error on type resolution. This is not a bug, as type resolution errors should only happen in the coordinator. Another option would be changing Top to not generate that error there, and make it implement instead `PostAnalysisVerificationAware`
- In some cases, we don't even serialize an empty source. So I had to add a new `TransportVersion` to do so
- As an special case, `ToLower` and `ToUpper` weren't serializing a source, but they don't emit warnings. As they were the only remaining functions not serializing the source, I added it there too
We only need the extensibility for testing and it's a lot easier to
reason about the code if we have explicit methods instead of overly
complicated composition with lots of redundant references being retained
all over the place.
-> lets simplify to inheritance and get shorter code that performs more
predictably (especially when it comes to memory) as a first step.
This also opens up the possibility of further simplifications and
removing more retained state/memory as we go through the search phases.
These things can be quite expensive and there's no need to recompute
them in parallel across all management threads as done today. This
commit adds a deduplicator to avoid redundant work.
No need to have a nested concat here. There's obviously lots and lots of
room for optimization on this one, but just flattening out one obvious
step here outright halves the number of method calls required when
serializing a search response. Given that method calls can consume up to
half the serialization cost this change might massively speed up some
usecases.
Checking whether we need to refresh does not require a searcher
so we can simplify this to just work based on the reader
and avoid lots of contention etc. for setting up the searcher.
relates #122374
Cleans up a couple things that are obviously broken:
* duplicate array and object constructs where the helper utility
generates the exact same iterator
* unused helper methods
* single iterator concatenations