We already disable inter-segment concurrency in SearchSourceBuilder whenever
the top-level sort provided is not _score. We should apply the same rules
in top_hits. We recently stumbled upon non-deterministic behaviour caused by
script sorting defined within top_hits. That is to be expected given that
script sorting does not support search concurrency.
The sort script can be replaced with a runtime field, either defined in the
mapping or in the search request, which does support concurrency and guarantees
predictable behaviour.
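
As a rough illustration of that replacement, the request below defines a runtime field in the search request and sorts `top_hits` on it instead of on a script. The field names `price`, `tax`, `category` and `price_with_tax` are hypothetical and only serve the example.

```python
import json

# Hypothetical fields: "price", "tax" and "category" are assumed to exist in the index.
search_request = {
    "size": 0,
    # Runtime field defined in the search request; it takes the place of the sort script.
    "runtime_mappings": {
        "price_with_tax": {
            "type": "double",
            "script": {"source": "emit(doc['price'].value * (1 + doc['tax'].value))"},
        }
    },
    "aggs": {
        "by_category": {
            "terms": {"field": "category"},
            "aggs": {
                "top": {
                    "top_hits": {
                        "size": 3,
                        # Sort on the runtime field rather than on a _script sort.
                        "sort": [{"price_with_tax": {"order": "desc"}}],
                    }
                }
            },
        }
    },
}

print(json.dumps(search_request, indent=2))
```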
When IndicesService is closed, pending deletions may still be in
progress for indices that were removed before IndicesService was closed.
If a deletion gets stuck for some reason, it can stall the node shutdown.
This PR aborts pending deletions more promptly by not retrying after
IndicesService is stopped.
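
A minimal sketch of the idea, not the actual IndicesService code: a retry loop that checks a stop flag before each attempt and while backing off. The names `process_pending_deletes`, `try_delete` and `stopped` are made up for this example.

```python
import threading

def process_pending_deletes(try_delete, stopped, max_attempts=10, backoff_seconds=0.01):
    """Retry try_delete() until it succeeds, attempts run out, or `stopped` is set."""
    for _ in range(max_attempts):
        if stopped.is_set():
            return "aborted"  # service stopped: do not start another attempt
        if try_delete():
            return "deleted"
        # Event.wait returns True as soon as the stop flag is set,
        # so a stop request also interrupts the backoff sleep.
        if stopped.wait(backoff_seconds):
            return "aborted"
    return "gave_up"

# Usage: a deletion that never succeeds, with the service stopping during attempt 2.
stopped = threading.Event()
attempts = []

def stuck_delete():
    attempts.append(1)
    if len(attempts) == 2:
        stopped.set()
    return False

result = process_pending_deletes(stuck_delete, stopped)
print(result, len(attempts))
```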
Resolves: #121717
Resolves: #121716
Resolves: #122119
Fixes https://github.com/elastic/elasticsearch/issues/123430
There were 2 problems here:
- We were filling a static field (used to auto-cast string literals) within a constructor, which is also called in multiple places
- The field was only filled with non-snapshot functions, so snapshot function auto-casting wasn't possible
Fixed both bugs by making the field non-static and by using the snapshot registry (if available) in the string casting rule.
Rather than checking the license (updating the usage map) on every
single shard, just do it once at the start of a computation that needs
to forecast write loads.
Closes #123247
The original work at https://github.com/elastic/elasticsearch/pull/106065 did not support geospatial types with this comment:
> I made this work for everything but geo_point and cartesian_point because I'm not 100% sure how to integrate with those. We can grab those in a follow up.
The geospatial types should be possible to collect using the VALUES aggregation with similar behavior to the `ST_COLLECT` OGC function, based on the Elasticsearch convention that treats multi-value geospatial fields as behaving similarly to any geometry collection. So this implementation is a trivial addition to the existing values types support.
* Updating error message when index field type is unknown
* Fix style issue
* Add yaml test for invalid field type error message
* Update docs/changelog/122860.yaml
* Updating error message for runtime and multi field type parser
* add and fix yaml tests
* Fix code styles by running spotlessApply
* Update changelog
* Updating the test in yml
* Updating error message for runtime
* Fix failing yaml tests
* Update error message to Fix unit tests
* fix serverless qa test
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
The LuceneSourceOperator is supposed to terminate when it reaches the
limit; unfortunately, we don't have a test to cover this. Due to this
bug, we continue scanning all segments, even though we discard the
results as the limit was reached. This can cause performance issues for
simple queries like FROM .. | LIMIT 10, when Lucene indices are on the
warm or cold tier. I will submit a follow-up PR to ensure we only
collect up to the limit across multiple drivers.
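
The intended behaviour can be sketched as follows. This is a simplified single-driver model with segments as plain lists, not the LuceneSourceOperator itself; `scan_segments` is a hypothetical helper.

```python
def scan_segments(segments, limit):
    """Collect at most `limit` docs and stop opening segments once the limit is hit."""
    collected = []
    segments_visited = 0
    for segment in segments:
        if len(collected) >= limit:
            break  # limit reached: remaining segments are never scanned
        segments_visited += 1
        remaining = limit - len(collected)
        collected.extend(segment[:remaining])
    return collected, segments_visited

segments = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11, 12], [13, 14], [15, 16]]
docs, visited = scan_segments(segments, limit=10)
print(len(docs), visited)
```

With the bug described above, the loop would have visited all four segments and thrown away everything past the tenth document.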
When fetching documents, sometimes we need to load the entire source of
search hits. Document sources can be large, and with support for up to
10k hits per search request, this creates a significant untracked
memory load on Elasticsearch that can potentially cause out-of-memory
errors.
This PR adds memory checking for hits source in the fetch phase. We
check with the parent (the real memory) circuit breaker every 1MiB of
loaded source and when fetching the last document of every segment. This
gives the real memory breaker a chance to interrupt running operations
when we're running low on memory, and prevent potential OOMs.
The amount of local accounting to buffer is controlled by the
`search.memory_accounting_buffer_size` dynamic setting and defaults to
`1MiB`.
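
A minimal sketch of the buffering scheme described above, assuming the breaker is a callable that receives the number of bytes to account for. The class and parameter names are illustrative and are not the ones used in the PR.

```python
class MemoryAccountingBuffer:
    """Accumulates loaded source sizes locally and reports them in batches."""

    def __init__(self, check_breaker, buffer_size=1024 * 1024):
        self.check_breaker = check_breaker  # stand-in for the real memory circuit breaker
        self.buffer_size = buffer_size      # plays the role of search.memory_accounting_buffer_size
        self.local_bytes = 0

    def add(self, source_bytes):
        self.local_bytes += source_bytes
        if self.local_bytes >= self.buffer_size:
            self.flush()

    def flush(self):
        if self.local_bytes > 0:
            self.check_breaker(self.local_bytes)  # may raise when memory is low
            self.local_bytes = 0

# Usage: two segments; the buffer is flushed after the last document of each segment.
reported = []
buffer = MemoryAccountingBuffer(reported.append)
for segment_doc_sizes in [[600_000, 600_000, 100_000], [200_000]]:
    for size in segment_doc_sizes:
        buffer.add(size)
    buffer.flush()
print(reported)
```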
Fixes #89656
Fixes https://github.com/elastic/elasticsearch/issues/122588
- Replaced `Source.EMPTY.writeTo(out)` with `source().writeTo(out)` in functions emitting warnings
- Did the same on all aggs, as Top emits an error on type resolution. This is not a bug, as type resolution errors should only happen in the coordinator. Another option would be to change Top so that it does not generate that error there, and have it implement `PostAnalysisVerificationAware` instead
- In some cases, we don't even serialize an empty source. So I had to add a new `TransportVersion` to do so
- As a special case, `ToLower` and `ToUpper` weren't serializing a source, but they don't emit warnings. As they were the only remaining functions not serializing the source, I added it there too
These things can be quite expensive and there's no need to recompute
them in parallel across all management threads as done today. This
commit adds a deduplicator to avoid redundant work.
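
One way such a deduplicator can work is sketched below: callers that arrive while a computation is in flight share its result instead of starting their own. This is an assumption about the general technique, not the class added by this commit; `Deduplicator` and `start_computation` are hypothetical names.

```python
import threading
from concurrent.futures import Future

class Deduplicator:
    """Shares one in-flight computation among all callers that arrive while it runs."""

    def __init__(self, start_computation):
        self._start = start_computation  # callable(future) that eventually completes the future
        self._lock = threading.Lock()
        self._in_flight = None

    def execute(self):
        with self._lock:
            if self._in_flight is not None:
                return self._in_flight  # join the computation already in progress
            future = Future()
            self._in_flight = future
        future.add_done_callback(self._clear)
        self._start(future)
        return future

    def _clear(self, future):
        with self._lock:
            if self._in_flight is future:
                self._in_flight = None

# Usage: the "computation" just records that it was started; we complete it by hand.
started = []
dedup = Deduplicator(started.append)
first = dedup.execute()
second = dedup.execute()           # arrives while the first is still running
started[0].set_result("stats")     # the expensive work finishes once
third = dedup.execute()            # a later call starts a fresh computation
print(len(started), second.result())
```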
Speeds up the VALUES agg when collecting from many buckets.
Specifically, this speeds up the algorithm used to `finish` the
aggregation. Most specifically, this makes the algorithm more tolerant
to large numbers of groups being collected. The old algorithm was
`O(n^2)` with the number of groups. The new one is `O(n)`.
```
(groups)
1 219.683 ± 1.069 -> 223.477 ± 1.990 ms/op
1000 426.323 ± 75.963 -> 463.670 ± 7.275 ms/op
100000 36690.871 ± 4656.350 -> 7800.332 ± 2775.869 ms/op
200000 89422.113 ± 2972.606 -> 21920.288 ± 3427.962 ms/op
400000 timed out at 10 minutes -> 40051.524 ± 2011.706 ms/op
```
The `1` group version was not changed at all. That's just noise in the
measurement. The small bump in the `1000` case is almost certainly worth
it and real. The huge drop in the `100000` case is quite real.
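
The description does not spell out either algorithm, so the following only illustrates the kind of complexity difference involved, assuming collected values are stored as `(group, value)` pairs. Rescanning every pair for each group is quadratic when the number of pairs grows with the number of groups; a single bucketing pass is linear.

```python
def finish_quadratic(pairs, group_count):
    # For every group, rescan all collected pairs: O(groups * pairs).
    return [[value for group, value in pairs if group == g] for g in range(group_count)]

def finish_linear(pairs, group_count):
    # One pass over the pairs, appending each value to its group's bucket: O(groups + pairs).
    buckets = [[] for _ in range(group_count)]
    for group, value in pairs:
        buckets[group].append(value)
    return buckets

pairs = [(0, "a"), (2, "b"), (0, "c"), (1, "d"), (2, "e")]
print(finish_linear(pairs, 3))
```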
This method gets called from `InternalEngine#resolveDocVersion(...)`, which gets called during indexing (via `InternalEngine.index(...)`).
When `InternalEngine.index(...)` gets invoked, the InternalEngine only ensures that it holds a ref to the engine via Engine#acquireEnsureOpenRef(), but this doesn't ensure that it holds a reference to the store.
Closes #122974
* Update docs/changelog/123010.yaml
* Add enterprise license check to inference action for semantic text fields
* Update docs/changelog/122293.yaml
* Set license to trial in ShardBulkInferenceActionFilterIT
* Move license check to only block semantic_text fields that require inference call
* Cleaning up tests
* Add parameterization on useLegacyFormat back in ShardBulkInferenceActionFilterBasicLicenseIT
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
* Allow data stream reindex tasks to be re-run after completion
* Docs update
* Update docs/reference/migration/apis/data-stream-reindex.asciidoc
Co-authored-by: Keith Massey <keith.massey@elastic.co>
---------
Co-authored-by: Keith Massey <keith.massey@elastic.co>
The keyword doc values field gets an extra sorted doc values field that encodes the order in which array values were specified at index time. This also captures duplicate values. The order is stored as an offset-to-ordinal array that gets zigzag vint encoded into a sorted doc values field.
For example, the string array ["c", "b", "a", "c"] for a keyword field yields sorted set doc values ["a", "b", "c"] with ordinals 0, 1 and 2. The offset array will be: [2, 1, 0, 2]
Null values are also supported. For example ["c", "b", null, "c"] results in sorted set doc values: ["b", "c"] with ordinals: 0 and 1. The offset array will be: [1, 0, -1, 1]
Empty arrays are also supported by encoding a zigzag vint array of zero elements.
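
The examples above can be reproduced with a small sketch. It computes the offset array and the zigzag-mapped values only; the vint byte encoding and the actual doc values writer are left out, and the function names are hypothetical.

```python
def encode_offsets(values):
    """Return (sorted unique non-null values, offset array of ordinals, -1 for null)."""
    sorted_unique = sorted({v for v in values if v is not None})
    ordinals = {v: i for i, v in enumerate(sorted_unique)}
    offsets = [ordinals[v] if v is not None else -1 for v in values]
    return sorted_unique, offsets

def zigzag(n):
    # Standard 32-bit zigzag mapping: small negative numbers become small positive ones.
    return (n << 1) ^ (n >> 31)

print(encode_offsets(["c", "b", "a", "c"]))   # (['a', 'b', 'c'], [2, 1, 0, 2])
print(encode_offsets(["c", "b", None, "c"]))  # (['b', 'c'], [1, 0, -1, 1])
print([zigzag(o) for o in [1, 0, -1, 1]])     # [2, 0, 1, 2]
```

Zigzag encoding is what lets the -1 used for null be stored as a small unsigned value.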
Limitations:
- Currently only doc values based array support for the keyword field mapper.
- Multi-level leaf arrays are flattened. For example: [[b], [c]] -> [b, c]
- Arrays are always synthesized as one type. In case of a keyword field, [1, 2] gets synthesized as ["1", "2"].
These limitations can be addressed, but some require more complexity and/or additional storage.
With this PR, keyword field arrays will no longer be stored in ignored source; instead, array offsets are tracked in an adjacent sorted doc values field. This only applies if index.mapping.synthetic_source_keep is set to arrays (default for logsdb).
Ensures that the DesiredBalanceReconciler always returns a non-empty
AllocationStats object, eliminating edge cases where the stats
available to DesiredBalanceMetrics may not be updated due to some
kind of throttling or the balancer being disabled via cluster
settings.
Adds documentation around
AllocationDecider#canRebalance(RoutingAllocation)
Closes ES-10581
We shouldn't run the post-snapshot-delete cleanup work on the master
thread, since it can be quite expensive and need not block subsequent
cluster state updates. This commit forks it onto a `SNAPSHOT` thread.
This action solely needs the cluster state, so it can run on any node.
Additionally, it needs to be cancellable to avoid doing unnecessary work
after a client failure or timeout.
Relates #101805
* Allow setting the `type` in the reroute processor
This allows configuring the `type` from within the ingest `reroute` processor. Similar to `dataset`
and `namespace`, the type defaults to the value extracted from the index name. This means that
documents sent to `logs-mysql.access-default` will have a default value of `logs` for the type.
Resolves #121553
* Update docs/changelog/122409.yaml
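
To illustrate how the reroute processor's defaults fall out of the index name, here is a rough sketch that assumes the `<type>-<dataset>-<namespace>` data stream naming scheme. `split_data_stream_name` is a hypothetical helper, not part of the processor.

```python
def split_data_stream_name(index_name):
    """Split '<type>-<dataset>-<namespace>' into its three parts."""
    parts = index_name.split("-", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a <type>-<dataset>-<namespace> name: {index_name!r}")
    type_, dataset, namespace = parts
    return {"type": type_, "dataset": dataset, "namespace": namespace}

defaults = split_data_stream_name("logs-mysql.access-default")
print(defaults)
```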
This is a high-level overview of the main rebalancing components and
how they interact to move shards around the cluster, and decide where
shards should go.
Relates ES-10423
If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them to be duplicates.
The _metric_names_hash field will be set by the OTel ES exporter.
As it's mapped as a time_series_dimension, it creates a different _tsid for documents with different sets of metrics.
The tradeoff is that if the composition of the metrics grouping changes over time, a different _tsid will be created.
That has an impact on the rate aggregation for counters.
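
As a sketch of why the hash separates documents: the exporter's actual hash function is not stated here, so SHA-256 over the sorted metric names is used purely as a stand-in, and `metric_names_hash` is a hypothetical helper.

```python
import hashlib

def metric_names_hash(metric_names):
    # Sort so the hash depends on the set of names, not on their order.
    joined = "\x00".join(sorted(metric_names))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()[:16]

same_a = metric_names_hash(["cpu.usage", "memory.usage"])
same_b = metric_names_hash(["memory.usage", "cpu.usage"])
other = metric_names_hash(["cpu.usage", "memory.usage", "disk.io"])
print(same_a == same_b, same_a == other)
```

Documents with the same metric set share a hash and therefore a _tsid, while adding or removing a metric yields a new hash, which is the tradeoff noted above.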