It seems that the changes of https://github.com/elastic/ml-cpp/pull/2585
combined with the randomness of the test could cause it to fail
very occasionally, and by a tiny percentage over the expected
upper bound. This change reenables the test by very slightly
increasing the upper bound.
Fixes #105347
Submitting a task during shutdown is highly unreliable and in almost all cases the task
will be rejected (removed) anyway. Not forcing execution if the executor is already
shutting down leads to more deterministic behavior and fixes
EsExecutorsTests.testFixedBoundedRejectOnShutdown.
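The pattern can be sketched as follows. This is a hypothetical illustration using plain `java.util.concurrent` types, not the actual EsExecutors code: the rejection handler checks `isShutdown()` first, so a task submitted during shutdown is always rejected rather than sometimes forced through.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical sketch: only retry queueing a rejected task while the
// executor is still running. If it is shutting down, reject
// deterministically instead of sometimes letting the task slip through.
class ForceUnlessShutdownPolicy implements RejectedExecutionHandler {
    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        if (executor.isShutdown() || !executor.getQueue().offer(task)) {
            throw new RejectedExecutionException(
                "task rejected: executor shutting down or queue full");
        }
    }
}
```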
(cherry picked from commit 954c428cde)
Currently, there is a small chance that testStopAtCheckpoint will fail
to correctly count the number of times `doSaveState` is invoked:
```
Expected: <5>
but: was <4>
```
There are two potential issues:
1. The test thread starts the Transform thread, which starts a Search
thread. If the Search thread starts reading from the
`saveStateListeners` while the test thread writes to the
`saveStateListeners`, then there is a chance our testing logic will
not be able to count the number of times we read from
`saveStateListeners`.
2. The counter is a plain non-volatile integer, so a concurrent
   read-modify-write can interleave and lose an increment.
Two fixes:
1. The test thread blocks the Transform thread until after the test
thread writes all the listeners. The subsequent test will
continue to verify that we can safely interlace reading and
writing.
2. The counter is now an AtomicInteger to provide thread safety.
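The second fix can be sketched like this (class and method names are illustrative, not the actual test code): a plain `int` incremented from multiple threads can lose updates, while `AtomicInteger` makes the read-modify-write a single atomic operation.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch of the counter fix: incrementAndGet() is atomic,
// so concurrent invocations of doSaveState cannot lose counts.
class SaveStateCounter {
    private final AtomicInteger saveStateCalls = new AtomicInteger();

    void onSaveState() {
        saveStateCalls.incrementAndGet(); // atomic across threads
    }

    int count() {
        return saveStateCalls.get();
    }
}
```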
Fixes #90549
Today a node with a registered `URLRepository` will not shut down
cleanly because it never releases the last of the `activityRefs`. This
commit fixes that.
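The shape of the bug can be sketched as follows (names hypothetical, not the actual URLRepository code): shutdown can only complete once every acquired activity ref has been released, so a component that acquires a ref on registration must release it when closed.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: shutdown waits until activityRefs drops to zero,
// so the repository must release its ref in close().
class RefHoldingRepository implements AutoCloseable {
    private final AtomicInteger activityRefs;

    RefHoldingRepository(AtomicInteger activityRefs) {
        this.activityRefs = activityRefs;
        activityRefs.incrementAndGet(); // acquired on registration
    }

    @Override
    public void close() {
        activityRefs.decrementAndGet(); // the release this commit adds
    }
}
```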
We could still be manipulating a network message when the event loop
shuts down, causing us to close the message while it's still in use.
This is at best going to be a little surprising to the caller, and at
worst could be an outright use-after-free bug.
This commit moves the double-check for a leaked promise to happen
strictly after the event loop has fully terminated, so that we can be
sure we've finished using it by this point.
Relates #105306, #97301
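The ordering fix can be sketched with a plain `ExecutorService` standing in for the event loop (the real code uses Netty; all names here are illustrative): the leak double-check runs strictly after termination, so it cannot race with an in-flight use of the message.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch: check for a leaked/in-use message only after the
// "event loop" has fully terminated, never concurrently with its tasks.
class LeakCheckAfterTermination {
    static boolean checkAfterShutdown(ExecutorService eventLoop, AtomicBoolean messageInUse) {
        eventLoop.shutdown();
        try {
            if (!eventLoop.awaitTermination(10, TimeUnit.SECONDS)) {
                return false; // did not terminate in time; cannot check safely
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        // Strictly after full termination: no task can still touch the message.
        return !messageInUse.get();
    }
}
```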
* Fix bug in rule_query where text_expansion errored because it was not rewritten (#105365)
* Update 260_rule_query_search.yml
Update test skip version
This commit updates the documentation for FIPS support.
In addition to the changes for 8.x, it also provides more details on how to set up and configure FIPS mode.
Running heap-attack tests with multiple nodes can still lead to OOM
errors. This is because the transport response messages are not tracked
by the circuit breaker. In heap attack tests, pages can be very large
(30MB; I will chunk them later), and for each exchange, we use three
concurrent channels, resulting in 100MB of untracked memory. This pull
request reserves extra bytes for exchange messages. Although this check
doesn't fully prevent OOM errors, it makes them unlikely in such cases.
We should unwrap TransportException errors; otherwise, we can return
them to the caller instead of the actual underlying cause. This becomes
important when the underlying cause is a 4xx error, while
TransportException is a 5xx error. I found this when running the
heap-attack tests.
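The unwrapping idea can be modeled like this. `TransportException` below is a stand-in for the Elasticsearch class of the same name, not the real type: peeling off transport-layer wrappers lets the caller see the underlying (possibly 4xx) cause rather than a generic 5xx wrapper.

```java
// Hypothetical stand-in for the Elasticsearch TransportException.
class TransportException extends RuntimeException {
    TransportException(String message, Throwable cause) {
        super(message, cause);
    }
}

final class TransportErrors {
    private TransportErrors() {}

    // Walk through nested transport wrappers to the real cause.
    static Throwable unwrap(Throwable t) {
        while (t instanceof TransportException && t.getCause() != null) {
            t = t.getCause();
        }
        return t;
    }
}
```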
This change enables the following logging for the test:
* refreshed cluster info to ensure allocator is seeing correct data
* allocator trace logging to check the balance computation is correct
* reconciler debug logging to check if there is anything unexpected during reconciliation
This PR adds a new `exclude_roles` setting for the SAML realm.
This setting allows excluding certain roles from being mapped
to users that are authenticated via the SAML realm, regardless of
the configured role mappings.
The `exclude_roles` setting supports only explicit role names.
Regular expressions and wildcards are not supported.
The exclusion is possible only if the role mapping is handled
by the SAML realm. Hence, it is not possible to configure it
along with the `authorization_realms` setting.
Note: It is intentional that this setting is not registered in this PR.
The registration will be addressed in a separate PR.
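Once registered, usage might look like the following `elasticsearch.yml` fragment. The realm namespace and role names are illustrative assumptions; only the `exclude_roles` setting name itself comes from this PR:

```yaml
# Hypothetical example; the exact settings path is an assumption.
xpack.security.authc.realms.saml.saml1:
  exclude_roles: ["superuser", "kibana_admin"]  # explicit names only, no wildcards
```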
We have various automaton-based queries that build particular automatons
based on their usage. However, the input text isn't part of the
`toString` output, nor is the kind of query that produced it (wildcard,
prefix, etc.).
This commit adds a couple of simple queries to wrap some of our logic to
make profiling and other output more readable.
Here is an example without this change:
```
#(-(winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@2d13c057} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@28daf002} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@43c3d7f8} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@2f52905} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@31d75074})
```
We have 5 case-insensitive automatons, but we don't know which is which
in the profiling output. All we know is the originating field.
I don't think we can update `AutomatonQuery` directly, as sometimes the
created automaton mutates the term (prefix, for example) and we lose the
fact that we are searching for a prefix.
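A minimal sketch of the wrapper idea (class and field names are illustrative, not the actual Lucene or Elasticsearch types): keep the original field, input term, and usage alongside the automaton so `toString` is readable in profiling output.

```java
// Illustrative wrapper: remembers what built the automaton, so profiling
// prints something readable instead of an opaque Automaton@hash.
class ReadableAutomatonQuery {
    private final String field;
    private final String term;
    private final String usage; // e.g. "case_insensitive_term", "wildcard"

    ReadableAutomatonQuery(String field, String term, String usage) {
        this.field = field;
        this.term = term;
        this.usage = usage;
    }

    @Override
    public String toString() {
        return field + ":" + usage + "(" + term + ")";
    }
}
```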
* Move the expression translators into their own, dedicated class.
* Replace TrivialBinaryComparison and
ExpressionTranslators.BinaryComparisons with a single translator
handler.
The wording is a little awkward, and it'd be more helpful to know
_which_ snapshot it was that failed. Also we can use `map` rather than
`delegateFailure` here.
When an array is passed to `Objects.hash()`, it needs to be wrapped with `Arrays.hashCode()` so that the hash is calculated from the array's contents rather than from the array instance's identity hash code.
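The bug class can be demonstrated directly (the class and method names here are illustrative): when an array is one of several arguments to `Objects.hash()`, it contributes only its identity hash, so two arrays with equal contents hash differently.

```java
import java.util.Arrays;
import java.util.Objects;

// Demonstrates the difference between identity-based and content-based
// hashing of an array argument.
class HashingExample {
    static int brokenHash(String name, int[] values) {
        return Objects.hash(name, values); // array hashed by identity
    }

    static int fixedHash(String name, int[] values) {
        return Objects.hash(name, Arrays.hashCode(values)); // by content
    }
}
```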
Similar to #99392, #97879, etc., there is no need to have the
`NodePersistentTasksExecutor` look up the executor to use each time, nor
does it necessarily need to use a named executor from the `ThreadPool`.
This commit pulls the lookup earlier in initialization so we can just
use a bare `Executor` instead.
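The pattern looks roughly like this (names are illustrative, not the actual Elasticsearch classes): resolve the `Executor` once during initialization and hold the bare reference, instead of looking it up from a thread pool by name on every task.

```java
import java.util.concurrent.Executor;

// Illustrative sketch: the executor is resolved once, up front.
class TaskDispatcher {
    private final Executor executor;

    TaskDispatcher(Executor executor) {
        this.executor = executor;
    }

    void dispatch(Runnable task) {
        executor.execute(task); // no per-task name lookup
    }
}
```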
We see errors that we believe happen because `es` is already stopping
while the periodic health logger keeps querying the health API. Since
`es` is stopping, it makes sense to also stop the periodic health
logger.
Furthermore, we make the close method wait for the last run of the
periodic health logger to finish if it's still in progress.
This PR makes the `HealthPeriodicLogger` lifecycle aware and uses a
semaphore to block the `close()` method.
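The semaphore pattern can be sketched as follows (an illustrative model, not the actual `HealthPeriodicLogger` code): each run holds a single-permit semaphore, and `close()` acquires it, so closing blocks until any in-flight run finishes and no run can start afterwards.

```java
import java.util.concurrent.Semaphore;

// Illustrative sketch: close() waits for an in-progress run and then
// prevents any further runs.
class PeriodicLogger implements AutoCloseable {
    private final Semaphore running = new Semaphore(1);
    private volatile boolean closed = false;

    void runOnce(Runnable task) {
        if (closed || !running.tryAcquire()) {
            return; // skip: already closed, or a run is in progress
        }
        try {
            if (!closed) {
                task.run();
            }
        } finally {
            running.release();
        }
    }

    @Override
    public void close() {
        closed = true;
        running.acquireUninterruptibly(); // blocks until the last run finishes
        running.release();
    }
}
```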