Today `ThreadPool#scheduleWithFixedDelay` does not interact as expected
with `AbstractRunnable`: if the task fails or is rejected then the
failure or rejection isn't passed back to the relevant callback, and the
task cannot specify that it should be force-executed. This commit fixes
that.
If the search threadpool fills up then we may reject execution of
`SearchService.Reaper` which means it stops retrying. We must instead
force its execution so that it keeps on going.
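Below is a minimal sketch (the class name is hypothetical) of how a task in the `AbstractRunnable` style is expected to interact with `scheduleWithFixedDelay` after this fix: failures and rejections reach the callbacks, and `isForceExecution()` is honoured.
```
import org.elasticsearch.common.util.concurrent.AbstractRunnable;

// Hypothetical periodic task illustrating the behaviour this commit fixes
// for ThreadPool#scheduleWithFixedDelay.
class ReaperLikeTask extends AbstractRunnable {
    @Override
    protected void doRun() {
        // the periodic work goes here
    }

    @Override
    public void onFailure(Exception e) {
        // previously a failure thrown by doRun() never reached this callback
    }

    @Override
    public void onRejection(Exception e) {
        // previously a rejection by a full threadpool never reached this callback
    }

    @Override
    public boolean isForceExecution() {
        // force execution so a full search threadpool cannot stop the task retrying
        return true;
    }
}
```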
With #106542, closes #106543
Some index requests target specific shard IDs, which may not match the indices that the request targets as given by `IndicesRequest#indices()`. This requires a different interception strategy to make sure those requests are handled correctly in all cases and that any malformed messages are caught early to aid in troubleshooting.
This PR adds an interface allowing requests to report the shard IDs they target as well as the index names, and adjusts the interception of those requests as appropriate to handle those shard IDs in the cases where they are relevant.
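A hypothetical sketch of the interface's shape (the name `ShardTargetingRequest` is illustrative, not necessarily the one the PR adds):
```
import java.util.Collection;

import org.elasticsearch.index.shard.ShardId;

// Requests that target shards directly can report those shard IDs alongside
// the index names already available via IndicesRequest#indices().
interface ShardTargetingRequest {
    Collection<ShardId> shards();
}
```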
The tests for loading `Block`s from scripted fields could fail randomly
when the `RandomIndexWriter` shuffles the documents. This disables
merging and adds the documents as a block so their order is consistent.
Closes #106044
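A minimal sketch of the stabilisation, assuming Lucene's test framework with a `directory` and `documents` already in scope:
```
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.NoMergePolicy;
import org.apache.lucene.tests.index.RandomIndexWriter;

// Disable merging and add the documents in a single block so that document
// order (and hence the doc IDs the test relies on) is deterministic.
IndexWriterConfig config = new IndexWriterConfig();
config.setMergePolicy(NoMergePolicy.INSTANCE);
try (RandomIndexWriter writer = new RandomIndexWriter(random(), directory, config)) {
    writer.addDocuments(documents); // one block, order preserved
}
```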
* Reset job if existing reset fails (#106020)
* Try again to reset a job if waiting for completion of an existing reset task fails.
* Update docs/changelog/106020.yaml
* Update 106020.yaml
* Update docs/changelog/106020.yaml
* Improve code
* Trigger rebuild
When using a pre-filter with nested kNN vectors, it's treated like a
top-level filter, meaning it is applied over parent document fields.
However, there are times when a query filter is applied that may or may
not match internal nested or non-nested docs. We failed to handle this
case correctly and users would receive an error.
closes: https://github.com/elastic/elasticsearch/issues/105901
First check whether the full cluster supports a specific indicator (feature) before we mark an indicator as "unknown" when (meta) data is missing from the cluster state.
Submitting a task during shutdown is highly unreliable and in almost all cases the task
will be rejected (removed) anyway. Not forcing execution if the executor is already
shutting down leads to more deterministic behavior and fixes
EsExecutorsTests.testFixedBoundedRejectOnShutdown.
(cherry picked from commit 954c428cde)
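An illustrative sketch of the policy (not the actual `EsAbortPolicy` code): force execution should only bypass rejection while the executor is still running.
```
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

class ForceOnlyWhileRunningPolicy implements RejectedExecutionHandler {
    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        if (isForceExecution(task) && executor.isShutdown() == false) {
            executor.getQueue().add(task); // force the task onto the queue
        } else {
            throw new RejectedExecutionException("rejected execution of " + task);
        }
    }

    private static boolean isForceExecution(Runnable task) {
        // in Elasticsearch this would check AbstractRunnable#isForceExecution()
        return false;
    }
}
```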
We could still be manipulating a network message when the event loop
shuts down, causing us to close the message while it's still in use.
This is at best going to be a little surprising to the caller, and at
worst could be an outright use-after-free bug.
This commit moves the double-check for a leaked promise to happen
strictly after the event loop has fully terminated, so that we can be
sure we've finished using it by this point.
Relates #105306, #97301
This change enables the following logging for the test:
* refreshed cluster info to ensure allocator is seeing correct data
* allocator trace logging to check the balance computation is correct
* reconciler debug logging to check if there is anything unexpected during reconciliation
We have various automaton-based queries that build particular automatons
based on their usage. However, the input text isn't part of the
`toString` output, nor is the usage of the current query (wildcard,
prefix, etc.).
This commit adds a couple of simple queries to wrap some of our logic to
make profiling and other output more readable.
Here is an example without this change:
```
#(-(winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@2d13c057} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@28daf002} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@43c3d7f8} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@2f52905} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@31d75074})
```
We have 5 case-insensitive automatons, but we don't know which is which
in the profiling output. All we know is the originating field.
I don't think we can update `AutomatonQuery` directly as sometimes the
created automaton mutates the term (prefix, for example) and we lose the
fact that we are searching for a prefix.
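An illustrative sketch of the wrapping idea (class and method names are hypothetical): delegate the actual matching to the automaton query but keep the original field, term, and usage for `toString`, so profiling output stays readable.
```
import java.io.IOException;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.QueryVisitor;
import org.apache.lucene.search.ScoreMode;
import org.apache.lucene.search.Weight;

class ReadableAutomatonQuery extends Query {
    private final Query delegate;
    private final String field;
    private final String description; // e.g. "case_insensitive(AutomatonUserName)"

    ReadableAutomatonQuery(Query delegate, String field, String description) {
        this.delegate = delegate;
        this.field = field;
        this.description = description;
    }

    @Override
    public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost) throws IOException {
        return delegate.createWeight(searcher, scoreMode, boost);
    }

    @Override
    public void visit(QueryVisitor visitor) {
        delegate.visit(visitor);
    }

    @Override
    public String toString(String f) {
        return field + ":" + description; // readable in profiling output
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof ReadableAutomatonQuery other && delegate.equals(other.delegate);
    }

    @Override
    public int hashCode() {
        return delegate.hashCode();
    }
}
```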
When an array is passed to Objects.hash() it needs to be wrapped with Arrays.hashCode() so that the hash is calculated from the array's contents rather than from the array instance's "identity hash code".
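For example (a minimal illustration using only the JDK):
```
import java.util.Arrays;
import java.util.Objects;

class HashExample {
    static int broken(int[] values, String name) {
        return Objects.hash(values, name); // uses the array's identity hash code
    }

    static int fixed(int[] values, String name) {
        return Objects.hash(Arrays.hashCode(values), name); // hashes the contents
    }
}
```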
Similar to #99392, #97879 etc, no need to have the
`NodePersistentTasksExecutor` look up the executor to use each time, nor
does it necessarily need to use a named executor from the `ThreadPool`.
This commit pulls the lookup earlier in initialization so we can just
use a bare `Executor` instead.
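An illustrative sketch of the pattern (names are hypothetical): resolve the `Executor` once during initialization instead of looking it up by name on every execution.
```
import java.util.concurrent.Executor;

class TaskRunner {
    private final Executor executor;

    TaskRunner(Executor executor) {
        // resolved once by the caller, e.g. from the ThreadPool at construction time
        this.executor = executor;
    }

    void run(Runnable task) {
        executor.execute(task); // no per-invocation lookup by thread pool name
    }
}
```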
We see errors that we believe happen because `es` is already stopping
while the periodic health logger keeps querying the health API. Since
`es` is stopping, we believe it makes sense to also stop the periodic
health logger.
Furthermore, we make the close method more respectful of the execution
of the periodic health logger: it will wait for the last run to finish
if it's still in progress.
This PR makes the `HealthPeriodicLogger` lifecycle aware and uses a
semaphore to block the `close()` method.
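An illustrative sketch of the closing strategy (not the actual `HealthPeriodicLogger` code): each run holds a permit, and `close()` acquires the same permit so it blocks until an in-flight run finishes.
```
import java.util.concurrent.Semaphore;

class PeriodicLogger implements AutoCloseable {
    private final Semaphore running = new Semaphore(1);
    private volatile boolean closed;

    void runOnce() {
        if (closed || running.tryAcquire() == false) {
            return; // stopped, or the previous run is still in progress
        }
        try {
            // query the health API and log the result
        } finally {
            running.release();
        }
    }

    @Override
    public void close() throws InterruptedException {
        closed = true;
        running.acquire(); // wait for the last run to finish
        running.release();
    }
}
```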
When configuring an OpenAI text embedding service the `model_id` should
have always been part of the service settings rather than task settings.
Task settings can be overridden; service settings cannot be changed. If
different models are used, the configured entities are considered
distinct.
`task_settings` is now optional as it contains a single optional field
(`user`).
```
PUT _inference/text_embedding/openai_embeddings
{
    "service": "openai",
    "service_settings": {
        "api_key": "XXX",
        "model_id": "text-embedding-ada-002"
    }
}
```
Backwards compatibility with previously configured models is maintained
by moving the `model_id` (or `model`) from task settings to service
settings at the first stage of parsing. New configurations are persisted
with `model_id` in service settings, old configurations with `model_id`
in task settings are not modified and will be tolerated by a lenient
parser.
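An illustrative sketch of that first parsing stage (names are hypothetical):
```
import java.util.Map;

class ModelIdCompat {
    // Relocate a model_id (or legacy model) found in task settings into service
    // settings so previously persisted configurations keep working.
    static void moveModelIdToServiceSettings(Map<String, Object> taskSettings, Map<String, Object> serviceSettings) {
        Object modelId = taskSettings.remove("model_id");
        if (modelId == null) {
            modelId = taskSettings.remove("model"); // legacy field name
        }
        if (modelId != null) {
            serviceSettings.putIfAbsent("model_id", modelId);
        }
    }
}
```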
This change adds additional plumbing to pipe the available cluster features through to
SearchSourceBuilder. A number of different APIs use SearchSourceBuilder, so they had to make this
available through their parsers as well, often through a ParserContext. This change is largely
mechanical, passing a Predicate into existing REST actions to check for feature availability.
Note that this change was pulled mostly from this PR (#105040).
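An illustrative sketch of the plumbing (names are hypothetical): parsers receive a predicate over features and can gate new syntax on cluster-wide support.
```
import java.util.function.Predicate;

class FeatureAwareParser {
    private final Predicate<String> clusterSupportsFeature;

    FeatureAwareParser(Predicate<String> clusterSupportsFeature) {
        this.clusterSupportsFeature = clusterSupportsFeature;
    }

    void parseFeatureGatedField(String feature) {
        if (clusterSupportsFeature.test(feature) == false) {
            throw new IllegalArgumentException("[" + feature + "] is not supported by all nodes");
        }
        // ... parse the feature-gated part of the search source ...
    }
}
```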
Today this test suite relies on being able to cancel an in-flight
publication after it's reached a committed state. This is questionable,
and also a little flaky in the presence of the desired balance allocator
which may introduce a short delay before enqueuing the cluster state
update that performs the reconciliation step.
This commit removes the questionable meddling with the internals of
`Coordinator` and instead just blocks the cluster state updates at the
transport layer to achieve the same effect.
Closes #102947
* Use single-char variant of String.indexOf() where possible
indexOf(char) is more efficient than searching for the same one-character String.
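For example:
```
class IndexOfExample {
    static int firstSlash(String path) {
        int viaString = path.indexOf("/"); // runs the String search machinery
        int viaChar = path.indexOf('/');   // cheaper single-char scan, same result
        assert viaString == viaChar;
        return viaChar;
    }
}
```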
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
GenerateUniqueIndexNameStep contained exact copies of the generateValidIndexName() and generateValidIndexSuffix() methods from the IndexNameGenerator utility class.
I removed the duplicates and changed the code to use the utility methods instead.
Also added javadoc and switched to a pre-compiled Pattern.
The test was also broken, as it checked that the suffix consists of only illegal characters.
Replacing matches() with find() makes it check for the presence of at least one illegal character.
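A minimal illustration of the difference (the pattern is a hypothetical stand-in for the index-name rules):
```
import java.util.regex.Pattern;

class IllegalCharsExample {
    private static final Pattern ILLEGAL = Pattern.compile("[*?\"<>| ,#]");

    static boolean containsIllegalChar(String suffix) {
        // find() looks for the pattern anywhere in the input; matches() would
        // require the entire input to match, which was the bug in the test.
        return ILLEGAL.matcher(suffix).find();
    }
}
```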
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
The index needs to be in tsdb mode. All fields will use the tsdb codec, except fields that start with a `_` (not excluding `_tsid`).
Before this change we relied on MapperService to check whether a field needed to use the tsdb doc values codec, but we missed many field types (ip field type, scaled float field type, unsigned long field type, etc.). Instead we wanted to depend on the doc values type in FieldInfo, but that information is not available in PerFieldMapperCodec.
Borrowed the binary doc values implementation from Lucene90DocValuesFormat. This allows it to be used for any doc values field.
Followup on #99747
* Use String.replace() instead of replaceAll() for non-regexp replacements
When arguments do not make use of regexp features, replace() is a more efficient option, especially the char variant.
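For example:
```
class ReplaceExample {
    static String dotsToSlashes(String className) {
        // replaceAll("\\.", "/") would compile and run a regex on every call;
        // the char variant of replace() does a plain scan with the same result.
        return className.replace('.', '/');
    }
}
```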