Some assertions on floating-point values are failing on serverless
because we run the tests there with three shards. This can cause
variations in precision because data may arrive in different orders: for
example, sum([a, b, c]) can yield a slightly different result than
sum([a, c, b]). This change introduces a tolerance of 1e-10 for such
precision differences, which should be acceptable for ESQL.
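For illustration only, a small standalone Java example of the effect and of a tolerance-based comparison (the values and the 1e-10 threshold here are just for the demo):

```java
// Floating-point addition is not associative, so summing the same values in a
// different order can produce slightly different results.
public class SumOrderDemo {
    public static void main(String[] args) {
        double sumAbc = (0.1 + 0.2) + 0.3; // 0.6000000000000001
        double sumCba = (0.3 + 0.2) + 0.1; // 0.6

        System.out.println(sumAbc == sumCba);                   // false: exact comparison fails
        System.out.println(Math.abs(sumAbc - sumCba) <= 1e-10); // true: comparison with tolerance passes
    }
}
```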
There were some optimizations that broke the automatic addition of
collapse fields to `docvalue_fields` during the fetch phase.
As a result, users could get confusing errors such as
`unsupported_operation_exception`. This commit restores the intended
behavior of automatically including the collapse field in the
`docvalue_fields` context during fetch if it isn't already included.
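A minimal sketch of that intended behavior, using hypothetical names rather than the actual fetch-phase classes: the collapse field is appended to the requested doc-value fields only when the user has not already asked for it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the real fix lives in the fetch phase and uses different types.
final class CollapseDocValueFields {
    static List<String> withCollapseField(List<String> requestedDocValueFields, String collapseField) {
        List<String> fields = new ArrayList<>(requestedDocValueFields);
        if (collapseField != null && fields.contains(collapseField) == false) {
            fields.add(collapseField); // make sure the collapse field can be read during fetch
        }
        return fields;
    }
}
```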
closes: https://github.com/elastic/elasticsearch/issues/96510
PR #99445 introduced automatic normalization of dense vectors with
cosine similarity. This adds a note about it to the documentation.
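For reference, normalizing a vector means scaling it so its magnitude (L2 norm) is 1; a small illustrative Java snippet of the math, not the actual indexing code:

```java
// Illustrative only: scale a dense vector to unit length, which is what the
// automatic normalization for cosine similarity amounts to.
final class VectorNormalization {
    static float[] normalize(float[] vector) {
        double squaredSum = 0.0;
        for (float value : vector) {
            squaredSum += (double) value * value;
        }
        double magnitude = Math.sqrt(squaredSum);
        float[] normalized = new float[vector.length];
        for (int i = 0; i < vector.length; i++) {
            normalized[i] = (float) (vector[i] / magnitude);
        }
        return normalized;
    }
}
```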
Relates to #99445
* (Doc+) Delineate Bootstrapping Data Stream from Alias
👋 howdy, team!
This is a follow-up to [elasticsearch#107327](https://github.com/elastic/elasticsearch/pull/107327). I realized my mistake was that we had duplicate sentences in different sections, so I edited the wrong area. However, it seemed like a good opportunity to clarify the page further by fixing the header links so that the sub-sections render as sub-headers instead of all being equal. Thoughts?
* Apply suggestions from code review
Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
(cherry picked from commit a3f3f59399)
Co-authored-by: Stef Nestor <26751266+stefnestor@users.noreply.github.com>
Let’s say we have a `my-metrics` data stream that is receiving a lot of
indexing requests. The following scenario can result in multiple
unnecessary rollovers:
1. We update the mapping and mark the data stream to be lazily rolled over.
2. We receive 5 bulk index requests that all contain a write request for this data stream.
3. Each of these requests is picked up “at the same time”; each sees that the data stream needs to be rolled over and issues a lazy rollover request.
4. Data stream my-metrics now has 5 tasks executing an unconditional rollover.
5. The data stream gets rolled over 5 times instead of once.
This scenario is captured in `LazyRolloverDuringDisruptionIT`.
We have also witnessed it in the wild, where a data stream was rolled
over 281 extra times, resulting in 281 empty indices.
This PR proposes:
- Creating a new task queue with a more efficient executor that further batches/deduplicates the requests.
- Adding two safeguards: the first ensures we do not enqueue the rollover task if we see that a rollover has already occurred; the second, applied during task execution, skips the rollover if the data stream does not have the `rolloverOnWrite` flag set to `true` (see the sketch after the response example below).
- Returning the following response when we skip the rollover:
```
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "old_index": ".ds-my-data-stream-2099.05.07-000002",
  "new_index": ".ds-my-data-stream-2099.05.07-000002",
  "rolled_over": false,
  "dry_run": false,
  "lazy": false
}
```
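A minimal sketch of the two safeguards described above, with hypothetical names standing in for the real cluster-state and task types:

```java
// Hypothetical sketch; the actual implementation lives in the lazy rollover task executor.
final class LazyRolloverSafeguards {

    // Safeguard 1: don't enqueue another rollover task if the data stream has already
    // rolled over since the lazy rollover was requested.
    static boolean shouldEnqueue(long generationWhenRequested, long currentGeneration) {
        return currentGeneration == generationWhenRequested;
    }

    // Safeguard 2: during task execution, skip the rollover unless the data stream still
    // has the rolloverOnWrite flag set, i.e. no earlier batched task already handled it.
    static boolean shouldRollOver(boolean rolloverOnWrite) {
        return rolloverOnWrite;
    }
}
```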
* Fix `TasksIT#testTasksCancellation` (#109929)
The tasks are removed from the task manager _after_ the response is
sent, so we cannot reliably assert that they're done. With this commit we
wait for them to complete properly first.
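A rough sketch of the shape of that wait, with a hypothetical `pendingTaskCount` supplier standing in for the real task-manager lookup: poll until the tasks disappear instead of asserting right after the response arrives.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch; the real test uses the test framework's own wait helpers.
final class AwaitTaskCompletion {
    static void awaitNoPendingTasks(LongSupplier pendingTaskCount, long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (pendingTaskCount.getAsLong() > 0) {
            if (System.currentTimeMillis() > deadline) {
                throw new AssertionError("tasks did not complete within " + timeoutMillis + "ms");
            }
            Thread.sleep(100);
        }
    }
}
```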
Closes #109686
* Introduce safeGet
* ESQL: Warn about division (#109716)
When you divide two integers or two longs we round towards zero, like
Postgres or Java or Rust or C. Other systems, like MySQL or SPL or
JavaScript or Python, always produce a floating-point number. We should
warn folks about this; it's genuinely unexpected for some folks. OTOH,
converting to a floating-point number would be unexpected for other
folks. Oh well, let's document what we've got.
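For illustration, the behavior being documented, shown with plain Java (ESQL rounds integer and long division the same way):

```java
// Integer division truncates toward zero; it does not produce a floating-point result.
public class DivisionDemo {
    public static void main(String[] args) {
        System.out.println(7 / 2);   // 3, not 3.5
        System.out.println(-7 / 2);  // -3 (rounded toward zero), not -4
        System.out.println(7.0 / 2); // 3.5 once either operand is a floating-point number
    }
}
```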
* Fix link for 8.14
When accessing array elements from a script, if the backing array has enough items (meaning a
previous doc had enough values), we let the request go through and end up returning items from
that previous doc for positions where the current doc does not have a value.
We should instead validate the requested index against the number of values in the current doc
and throw an error if it goes over that count.
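A minimal sketch of the intended check, with illustrative names (the real change is in the doc-values field classes): the bound is the current doc's value count, not the length of the shared backing array.

```java
// Hypothetical sketch: the backing array is reused across docs and may hold stale
// values, so reads must be bounded by the current doc's value count.
final class DocValuesArraySketch {
    private final long[] backingArray;
    private final int countForCurrentDoc;

    DocValuesArraySketch(long[] backingArray, int countForCurrentDoc) {
        this.backingArray = backingArray;
        this.countForCurrentDoc = countForCurrentDoc;
    }

    long get(int index) {
        if (index < 0 || index >= countForCurrentDoc) {
            throw new IndexOutOfBoundsException(
                "position [" + index + "] is out of bounds; the current doc has [" + countForCurrentDoc + "] values"
            );
        }
        return backingArray[index];
    }
}
```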
Closes #104998
Handle the "output memory allocator bytes" field if and only if it is present in the model size stats, as reported by the C++ backend.
This PR must be merged prior to the corresponding ml-cpp one, to keep CI tests happy.
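A small sketch of the "if and only if present" handling, with illustrative names (the snake_case field name and the parser shape here are assumptions, not the actual code):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the value is kept nullable and only emitted when the
// C++ backend actually reported it.
final class ModelSizeStatsSketch {
    private final Long outputMemoryAllocatorBytes; // null when not reported by the backend

    ModelSizeStatsSketch(Long outputMemoryAllocatorBytes) {
        this.outputMemoryAllocatorBytes = outputMemoryAllocatorBytes;
    }

    Map<String, Object> toMap() {
        Map<String, Object> map = new HashMap<>();
        if (outputMemoryAllocatorBytes != null) {
            map.put("output_memory_allocator_bytes", outputMemoryAllocatorBytes); // assumed field name
        }
        return map;
    }
}
```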
Backports #109653
This PR fixes two test failures,
https://github.com/elastic/elasticsearch/issues/103981 and
https://github.com/elastic/elasticsearch/issues/105437, and refactors the
code a bit to make things more explicit.
**What was the issue** These tests were creating an index that referred to a policy
before that policy was created. This could cause an issue if ILM
ran after the index was created but before the policy was created.
When ILM runs before the policy is added, the following happens:
- The index encounters an error and the ILM state records the current step as `null`, which makes sense since there is no policy to retrieve a step from.
- A `null` step does not qualify to be executed periodically, which also makes sense because probably nothing changed, so chances are the index will remain in this state.
- The test keeps waiting for something to happen, but nothing does, because no cluster state updates are coming like they would in a "real" cluster.
- This lasts until the test teardown starts, when the index finally gets updated with the ILM policy, but by then it's a bit too late.
The previous scenario is confirmed by the logging too.
```
----> The index gets created referring a policy that does not exist yet, ILM runs at least twice before the policy is there
[2024-06-12T20:14:28,857][....] [index-sanohmhwxl] creating index, ......
[2024-06-12T20:14:28,870][....] [index-sanohmhwxl] retrieved current step key: null
[2024-06-12T20:14:28,871][....] unable to retrieve policy [policy-tohpA] for index [index-sanohmhwxl], recording this in step_info for this index java.lang.IllegalArgumentException: policy [policy-tohpA] does not exist
-----> Only now the policy is added
[2024-06-12T20:14:29,024][....] adding index lifecycle policy [policy-tohpA]
-----> ILM is running periodically but because the current step is null it ignores it
[2024-06-12T20:15:23,791][....] job triggered: ilm, 1718223323790, 1718223323790
[2024-06-12T20:15:23,791][....] retrieved current step key: null
[2024-06-12T20:15:23,791][....] maybe running periodic step (InitializePolicyContextStep) with current step {"phase":"new","action":"init","name":"init"}
```
This can also be reproduced locally by adding a 5s thread sleep before
adding the policy.
**The fix** Adding a non-existing policy to an index is not a
supported path. For this reason, we refactored the test to reflect a
more realistic scenario.
- We add the policy as an argument to `private void createIndex(String index, String alias, String policy, boolean isTimeSeries)`. This way it's clear that a policy can be provided.
- We create the policy before creating the index (see the sketch after this list); it does not appear that adding the policy later is crucial for the test, so simplifying it sounded like a good idea.
- We simplified `testRollupIndexInTheHotPhaseWithoutRollover`, which ensures that a downsampling action cannot be added in the hot phase without rollover. An index is not necessary for this test, so again simplifying it makes the purpose of the test clearer.
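A condensed, hypothetical sketch of the new ordering (the interface and method names here are stand-ins, not the real test helpers):

```java
// Hypothetical sketch: the policy exists before any index refers to it, so ILM
// never runs against an index that points at a missing policy.
final class IlmTestSetupSketch {
    interface IlmTestClient {
        void createPolicy(String policy);
        void createIndex(String index, String alias, String policy, boolean isTimeSeries);
    }

    static void setUp(IlmTestClient client, String index, String alias, String policy) {
        client.createPolicy(policy);                     // 1. create the policy first
        client.createIndex(index, alias, policy, false); // 2. then the index that references it
    }
}
```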
Fixes: https://github.com/elastic/elasticsearch/issues/103981 Fixes:
https://github.com/elastic/elasticsearch/issues/105437
Currently, we do not register task cancellations for exchange requests,
which leads to a long delay in failing the main request when a data-node
request is rejected.
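A rough, hypothetical sketch of the idea (the types here are stand-ins for the real transport and task classes): register each exchange request as a cancellable child of the main request, so that when the main request fails, the exchanges are cancelled promptly instead of lingering.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch; the real code hooks into the task manager's cancellation support.
final class ExchangeCancellationSketch {
    interface CancellableRequest {
        void cancel(String reason);
    }

    private final List<CancellableRequest> childExchangeRequests = new CopyOnWriteArrayList<>();

    // Register every outgoing exchange request so it is tied to the main request's lifecycle.
    void registerChild(CancellableRequest request) {
        childExchangeRequests.add(request);
    }

    // Called when the main request fails, e.g. because a data-node request was rejected.
    void cancelChildren(String reason) {
        for (CancellableRequest child : childExchangeRequests) {
            child.cancel(reason);
        }
    }
}
```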