The LimitOperator can be finished early because the Limiter may be
shared across operators. This PR relaxes the assertion to allow for
this.
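
As a rough illustration of why a shared limiter can finish an operator early, here is a minimal sketch. `SharedLimiter` and `LimitOperatorSketch` are hypothetical names, not the actual ES|QL classes, and the accounting is assumed to be a simple shared row budget.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a row budget shared by several limit operators. */
final class SharedLimiter {
    private final AtomicInteger remaining;

    SharedLimiter(int limit) {
        this.remaining = new AtomicInteger(limit);
    }

    /** Tries to claim up to {@code rows} rows and returns how many were granted. */
    int tryAccept(int rows) {
        while (true) {
            int left = remaining.get();
            int granted = Math.min(left, rows);
            if (granted == 0) {
                return 0; // budget already used up (possibly by another operator)
            }
            if (remaining.compareAndSet(left, left - granted)) {
                return granted;
            }
        }
    }

    boolean exhausted() {
        return remaining.get() == 0;
    }
}

/** Hypothetical operator that draws from the shared budget. */
final class LimitOperatorSketch {
    private final SharedLimiter limiter;
    private int emitted = 0;

    LimitOperatorSketch(SharedLimiter limiter) {
        this.limiter = limiter;
    }

    /** Accepts as many of the offered rows as the shared budget allows. */
    int addInput(int rows) {
        int granted = limiter.tryAccept(rows);
        emitted += granted;
        return granted;
    }

    /** Finished as soon as the shared budget is gone, even if this operator emitted nothing. */
    boolean isFinished() {
        return limiter.exhausted();
    }

    int emitted() {
        return emitted;
    }
}
```

In this sketch, an operator can report `isFinished()` without having received any rows itself, which is the kind of state a strict assertion would reject.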
Closes #130228
Closes #130219
This does _not_ make the health indicator project-aware; it merely
avoids exceptions in case there are multiple projects in the cluster.
The health indicator would require significant refactoring to be made
project-aware, which is not worth it since ILM will not be running in a
multi-project context (i.e. serverless).
This brings in the fixes from #130020, with minor fixes to address review
nits from that PR.
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
* Use `throwInvalidIndexNameException()` to throw invalid ex after
dropping asterisk in `IdentifierBuilder#resolveAndValidateIndex()`
* Assert the message in test
* Refactor
* drop invalid chars from assertion string due to randomisation issue
* Re-assert invalid chars
* Update docs/changelog/130027.yaml
* Specific idx in testMultipleBatchesWithLookupJoin
Do not use `from *` in the test, be more specific - otherwise other tests
can affect the output if they leave indices behind, affecting the column
count.
* Remove column count validation
That doesn't really tell us much in this test, anyway.
For some tests, decreases the interval (from 5s to 100ms) at which enqueued merge tasks are checked to see whether there is sufficient available disk space to execute them or whether they have been aborted since being enqueued.
Fixes #130044
Prepares the `main` branch for the backport of #125631. Specifically,
this adds the version constant for 8.19 to main and the serialization
code that lets main talk to 8.19.
There was a bug in the code for deleting unused and orphan ML data. When deletion using DBQ occurred, the bug caused the request to time out. This PR resolves the issue.
Initial version of patterned_text mapper. Behaves similarly to match_only_text. This version uses a single SortedSetDocValues for a template and another for arguments. It splits the message by delimiters, then classifies a token as an argument if it contains a digit. All arguments are concatenated and inserted as a single doc value. A single inverted index is used, without positions. Phrase queries are still possible, using the SourceConfirmedTextQuery, but are not fast.
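
The template/argument split described above can be sketched roughly as follows. This is only an illustration: the single-space delimiter, the `<arg>` placeholder, and the class and method names are assumptions, not the mapper's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of splitting a message into a template and its arguments. */
final class PatternedTextSketch {
    /** Assumed placeholder left in the template where an argument was removed. */
    static final String PLACEHOLDER = "<arg>";

    /** Simple holder for the two values that would be stored as doc values. */
    static final class Parts {
        final String template;
        final String args;

        Parts(String template, String args) {
            this.template = template;
            this.args = args;
        }
    }

    /** Splits on single spaces (an assumed delimiter) and classifies each token. */
    static Parts split(String message) {
        List<String> templateTokens = new ArrayList<>();
        List<String> args = new ArrayList<>();
        for (String token : message.split(" ")) {
            if (containsDigit(token)) {
                // A token containing a digit is treated as an argument.
                args.add(token);
                templateTokens.add(PLACEHOLDER);
            } else {
                templateTokens.add(token);
            }
        }
        // All arguments are concatenated into one value.
        return new Parts(String.join(" ", templateTokens), String.join(" ", args));
    }

    static boolean containsDigit(String token) {
        for (int i = 0; i < token.length(); i++) {
            if (Character.isDigit(token.charAt(i))) {
                return true;
            }
        }
        return false;
    }
}
```

With these assumptions, `split("connected to 10.0.0.1 in 35ms")` yields the template `connected to <arg> in <arg>` and the argument string `10.0.0.1 35ms`.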
Keep better track of shard contexts using RefCounted, so they can be released more aggressively during operator processing. For example, during TopN, we can potentially release some contexts if they don't pass the limit filter.
This is done in preparation for the TopN fetch optimization, which will delay the fetching of additional columns to the data node coordinator, instead of doing it in each individual worker, thereby reducing IO. Since the node coordinator would need to maintain the shard contexts for a potentially longer duration, it is important we try to release what we can earlier.
An even more advanced optimization is to delay fetching to the main cluster coordinator, but that would be more involved, since we need to first figure out how to transport the shard contexts between nodes.
Summary of main changes:
* DocVector now maintains a RefCounted instance per shard.
* Things which can build or release DocVectors (e.g., LuceneSourceOperator, TopNOperator) can also hold RefCounted instances, so they can pass them to DocVector and also ensure contexts aren't released if they can still be potentially used later.
* Driver's main loop iteration (runSingleLoopIteration) now closes its operators even between the processing of different operators. This is extra aggressive, and was mostly done to improve testability.
* Added a couple of tests to TopNOperator and a new integration test, EsqlTopNShardManagementIT, which uses the pausable plugin framework to check that TopNOperator releases things as early as possible.
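
The reference-counting idea behind these changes can be sketched as below. `ShardContextRef` and `DocVectorSketch` are hypothetical simplifications written for this note; they are not the real `RefCounted` or `DocVector` classes, and the release callback is an assumption.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical reference-counted handle for one shard context. */
final class ShardContextRef {
    private final AtomicInteger refs = new AtomicInteger(1); // the creator holds one reference
    private final Runnable onRelease;

    ShardContextRef(Runnable onRelease) {
        this.onRelease = onRelease;
    }

    /** Takes an additional reference; fails if the context was already released. */
    void incRef() {
        int prev;
        do {
            prev = refs.get();
            if (prev <= 0) {
                throw new IllegalStateException("shard context already released");
            }
        } while (!refs.compareAndSet(prev, prev + 1));
    }

    /** Drops one reference; releases the context when the last one is dropped. */
    boolean decRef() {
        int now = refs.decrementAndGet();
        if (now < 0) {
            throw new IllegalStateException("too many decRef calls");
        }
        if (now == 0) {
            onRelease.run();
            return true;
        }
        return false;
    }

    boolean hasReferences() {
        return refs.get() > 0;
    }
}

/** Hypothetical vector of docs that keeps its shard context alive while open. */
final class DocVectorSketch implements AutoCloseable {
    private final ShardContextRef shard;

    DocVectorSketch(ShardContextRef shard) {
        shard.incRef();
        this.shard = shard;
    }

    @Override
    public void close() {
        shard.decRef();
    }
}
```

In this sketch, a vector that is discarded early (for example because its rows did not pass the limit filter) drops its reference on `close()`, and the context is released once the source and all remaining vectors have done the same.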
The local plan optimizer should not change the layout, as it has already
been agreed upon. However, CombineProjections can violate this when some
grouping elements refer to the same attribute. This occurs when
ReplaceFieldWithConstantOrNull replaces missing fields with the same
reference for a given data type.
Closes #128054
Closes #129811
ES|QL index patterns validation: Ensure that the patterns in the query are syntactically and semantically valid
---------
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>
Co-authored-by: Alexander Spies <alexander.spies@elastic.co>
Previously, a `KeywordFieldType` was created without setting the isSyntheticSource field, which caused the wrong block loader to be used: one that synthesizes the complete _source instead of just loading values from ignored source. This PR addresses this.
With the introduction of entitlements (#120243) and exclusive file
access (#123087) it is no longer safe to watch a whole directory.
In a lot of deployments, the parent directory for SSL config files
will be the main config directory, which also contains exclusive files
such as SAML realm metadata or File realm users. Watching that
directory will cause entitlement warnings because it is not
permissible for core/ssl-config to read files that are exclusively
owned by the security module (or other modules).
This PR makes RepositoriesService project aware so that the basic Put,
Get, Delete and Verify repository actions are now project scoped.
It intentionally leaves the following aspects out of scope for the
current changes:
* Repository stats reporting
* Repository clean-up, analysis and integrity verification
* Repository usages for searchable snapshots and CCR
They will be worked on separately. One main reason for leaving them out
is that they are not needed by OBS which is currently blocked by
repository/snapshot changes. They may also have their own complexities,
e.g. stats reporting.
Resolves: ES-10478
Introduces a new `RemoveBlock` API that complements the existing `AddBlock` API by allowing users to remove index blocks using `DELETE /{index}/_block/{block}`.
Resolves #128966
---------
Co-authored-by: Niels Bauman <nielsbauman@gmail.com>
#127318 changed the behaviour of `client()` to not start a node if there
is none found in the cluster. This also changed the `getMasterName()`
behaviour to simply fail if there are no nodes instead of starting one.
This is why the `getMasterName()` is failing now. There were no nodes
started because the test scope is set to manually manage master nodes
(`autoManageMasterNodes = false`) without data nodes (`numDataNodes = 0`).
The fix is to actually start the master node instead of attempting to get
the master node name from an empty cluster and depending on a side effect
to actually bootstrap a node.
Additionally, it waits for the master node to process all cluster state events
before proceeding, which should hopefully solve the original cause of failures.
Resolves #120964
Resolves #120923