We have instances where BWC tests configure old ES version nodes with
the integTest distribution. This isn't a valid configuration: while in
reality we resolve the default distribution artifact, we have other
configuration logic that behaves differently based on whether the
integTest distro was _requested_, specifically when deciding what to set
ES_JAVA_HOME to. This bug resulted in us attempting to run old nodes
with the current bundled JDK version, which may be incompatible with
that older version of Elasticsearch.
Closes #104858
Today `ThreadPool#scheduleWithFixedDelay` does not interact as expected
with `AbstractRunnable`: if the task fails or is rejected then this
isn't passed back to the relevant callback, and the task cannot specify
that it should be force-executed. This commit fixes that.
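As a rough sketch (illustrative only, not the actual production task), this is the `AbstractRunnable` contract that `scheduleWithFixedDelay` now honours:

```java
import org.elasticsearch.common.util.concurrent.AbstractRunnable;

// Illustrative periodic task; the callbacks below are now invoked as expected
// when the task is scheduled with a fixed delay.
class PeriodicTask extends AbstractRunnable {

    @Override
    protected void doRun() {
        // the periodic work; if this throws, onFailure is now notified
    }

    @Override
    public void onFailure(Exception e) {
        // previously failures of fixed-delay tasks were not routed here
    }

    @Override
    public void onRejection(Exception e) {
        // previously rejections were not routed here either
        onFailure(e);
    }

    @Override
    public boolean isForceExecution() {
        // now respected, so the task is not silently dropped when the queue is full
        return true;
    }
}
```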
If the search threadpool fills up then we may reject execution of
`SearchService.Reaper`, which means it stops retrying. We must instead
force its execution so that it keeps going.
With #106542, closes #106543
If we proceed without waiting for pages, we might cancel the main
request before starting the data-node request. As a result, the exchange
sinks on data nodes won't be removed until the `inactive_timeout` elapses,
which is longer than the `assertBusy` timeout.
Closes #106443
* (DOC+) Version API page for ES API Base URL (#105845)
* (DOC+) Version API page for ES API Base URL
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
(cherry picked from commit 157ce539aa)
* Fix URL syntax, copy nit
---------
Co-authored-by: Stef Nestor <26751266+stefnestor@users.noreply.github.com>
I investigated a heap attack test failure and found that an ESQL request
was stuck. This occurred in the following sequence:
1. The ExchangeSource on the coordinator was blocked on reading because
there were no available pages.
2. Meanwhile, the ExchangeSink on the data node had pages ready for
fetching.
3. When an exchange request tried to fetch pages, it failed due to a
CircuitBreakingException. Despite the failure, no cancellation was
triggered because the status of the ExchangeSource on the coordinator
remained unchanged.

To fix this issue, this PR introduces two changes:

* Resumes the ExchangeSourceOperator and Driver on the coordinator,
eventually allowing the coordinator to trigger cancellation of the
request when failing to fetch pages.
* Ensures that an exchange sink on the data nodes fails when a data-node
request is cancelled (see the sketch below). This callback was
inadvertently omitted when introducing the node-level reduction in
#106204 (Run empty reduction node level on data nodes).
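Roughly the wiring that was missing, as a sketch with hypothetical stand-in types (the real classes live in the compute/exchange service, not shown here):

```java
// Hypothetical stand-ins for illustration only; not the actual exchange/compute classes.
interface ExchangeSink {
    void fail(Exception cause); // release buffered pages and unblock any remote readers
}

interface CancellationHooks {
    void onCancelled(Runnable callback); // run the callback when the data-node request is cancelled
}

final class DataNodeRequestCancellation {
    /** Re-register the callback that was dropped when node-level reduction was introduced. */
    static void wire(CancellationHooks task, ExchangeSink sink) {
        task.onCancelled(() -> sink.fail(new RuntimeException("data-node request cancelled")));
    }
}
```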
I plan to spend some time hardening the exchange and compute services.
Closes #106262
The tests that assert that sorting on spatial types causes consistent error messages were also flaky in the non-error-message cases, under rare circumstances where the results were returned in a different order. We now sort those on a sortable field for deterministic behaviour.
The docs here are a little inaccurate, and link to several individual
settings (incorrectly in some cases) in a paragraph that's pretty hard
to read. This commit fixes the inaccuracies and replaces the links to
individual settings with one to all the docs about the disk-based shard
allocator.
Some index requests target specific shard IDs, which may not match the indices that the request reports via `IndicesRequest#indices()`. Such requests require a different interception strategy in order to make sure they are handled correctly in all cases and that any malformed messages are caught early to aid in troubleshooting.
This PR adds an interface allowing requests to report the shard IDs they target as well as the index names, and adjusts the interception of those requests as appropriate to handle those shard IDs in the cases where they are relevant.
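The shape of the interface is roughly as follows (the name and method here are illustrative, not necessarily the exact API the PR adds):

```java
import java.util.Collection;

import org.elasticsearch.index.shard.ShardId;

// Illustrative only: requests that target concrete shards expose them
// alongside what IndicesRequest#indices() reports.
interface ShardTargetingRequest {
    Collection<ShardId> shards();
}
```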
* Fix error on sorting unsortable geo_point and cartesian_point
Without a LIMIT the correct error was reported, but with a LIMIT it was not. This fix mimics the same error with LIMIT and adds tests for all three scenarios:
* Without limit
* With limit
* From row with limit
* Update docs/changelog/106351.yaml
* Add tests for geo_shape and cartesian_shape also
* Updated changelog
* Separate point and shape error messages
* Move the error to later so we get it only if the geo field is actually used in the sort.
* Implemented planner check in Verifier instead
This is a much better solution; see the sketch after this list.
* Revert previous solution
* Also check non-field attributes so the same error is provided for ROW
* Changed "can't" to "cannot"
* Add unit tests for verifier error
* Added sort limitations to documentation
* Added unit tests for spatial fields in VerifierTests
* Don't run the new yaml tests on older versions
These tests mostly test the validation errors, which were changed only in 8.14.0, so they should not be run against earlier versions.
* Simplify check based on code review, skip duplicate forEachDown
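A minimal sketch of this kind of Verifier check, with hypothetical stand-in types rather than the actual ESQL plan and attribute classes:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; the real check lives in the ESQL Verifier and walks the logical plan.
record SortKey(String name, String dataType) {}

final class SpatialSortCheck {
    /** Reject sort keys with spatial types, mirroring the "cannot sort on ..." verification error. */
    static List<String> verify(List<SortKey> orderBy) {
        List<String> failures = new ArrayList<>();
        for (SortKey key : orderBy) {
            switch (key.dataType()) {
                case "geo_point", "cartesian_point", "geo_shape", "cartesian_shape" ->
                    failures.add("cannot sort on " + key.dataType() + " field [" + key.name() + "]");
                default -> { /* sortable type, nothing to report */ }
            }
        }
        return failures;
    }
}
```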
* Add regression tests that cover ACS and entity ID mismatches, causing
us to go into the initCause branch
* Fix up exception creation: initCause is not
allowed because ElasticsearchException
already initialises the cause to `null` if
it isn't passed as a constructor param.
Signed-off-by: lloydmeta <lloydmeta@gmail.com>
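A minimal sketch of the resulting pattern, assuming the usual `ElasticsearchException` constructors (the helper name is illustrative):

```java
import org.elasticsearch.ElasticsearchException;

// Pass the cause through the constructor rather than calling initCause(...) afterwards,
// since ElasticsearchException already initialises the cause (to null when none is given).
final class SamlExceptionHelper {
    static ElasticsearchException samlException(String message, Exception cause) {
        return new ElasticsearchException(message, cause);
    }
}
```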
After tsid hashing was introduced (#98023), the time series aggregator generates the tsid (from all dimension fields) instead of using the value from the `_tsid` field directly. This generation of the tsid happens for every time series, parent bucket, and segment combination.
This change alters that by only generating the tsid once per time series and segment. This is done by just locally recording the current tsid.
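A rough sketch of the caching idea, with hypothetical names (the real change is in the time-series aggregator, which tracks the tsid alongside the current segment):

```java
import java.util.function.IntFunction;

import org.apache.lucene.util.BytesRef;

// Illustrative cache: regenerate the tsid only when the time series changes within a segment.
final class TsidCache {
    private BytesRef currentTsid;     // tsid generated for the current time series
    private int currentTsidOrd = -1;  // ordinal identifying that time series in the current segment

    BytesRef tsid(int tsidOrd, IntFunction<BytesRef> generateFromDimensions) {
        if (tsidOrd != currentTsidOrd) {
            currentTsid = generateFromDimensions.apply(tsidOrd);
            currentTsidOrd = tsidOrd;
        }
        return currentTsid;
    }
}
```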
`look_ahead_time` is set to 1 minute, so the `assertBusy` loop needs to
wait for longer than that to get a read-only backing index.
Note that this is only relevant when the `UpdateTimeSeriesRangeService`
kicks in to bump the end time of the head index. This is rare (it runs
every 10 minutes) but can happen.
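A minimal sketch of the adjusted wait, assuming the test extends `ESTestCase` (the class name, bound, and assertion body are illustrative):

```java
import java.util.concurrent.TimeUnit;

import org.elasticsearch.test.ESTestCase;

public class ReadonlyBackingIndexWaitIT extends ESTestCase {
    public void testReadonlyBackingIndexEventuallyAppears() throws Exception {
        // look_ahead_time is 1 minute, so give the busy-wait comfortably more than that
        assertBusy(() -> {
            // assert that the previous backing index of the data stream is now read-only
        }, 2, TimeUnit.MINUTES);
    }
}
```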
Fixes #101428
Today we do not say explicitly that `integer` response fields are really
arbitrarily large JSON integers and may not fit into a Java `int`. This
commit expands the docs to add this information.
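A small illustration of why this matters for clients (the value is made up):

```java
import java.math.BigInteger;

// A response field documented as "integer" may carry a value larger than Integer.MAX_VALUE,
// so clients should read it as a long (or BigInteger) rather than a Java int.
public class LargeIntegerFieldExample {
    public static void main(String[] args) {
        String docsCount = "3000000000"; // illustrative value, greater than 2^31 - 1
        long asLong = Long.parseLong(docsCount);
        BigInteger asBigInteger = new BigInteger(docsCount);
        System.out.println(asLong + " " + asBigInteger);
    }
}
```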