When a DLS query references a runtime field loaded from _source, creating the collector manager retrieves numDocs, which goes through all segments and executes the script for each document. StoredFieldSourceProvider relies on leaf ordinals to build an array, but those ordinals are not populated when numDocs is computed via BaseCompositeReader, because that goes through the sub-reader contexts rather than the context leaves (a subtle difference that bites us here).
Fixes #111637
* Fix template alias parsing livelock
This commit fixes an issue where parsing alias definitions in templates can cause the ES thread to
hang indefinitely. A malformed alias definition sends the parser into a loop that never
exits. This commit adds a null check to both the component template and alias parsing code,
which prevents the loop.
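A minimal sketch of the failure mode, assuming a parser that returns null once its input is exhausted. `TokenStream`, `Token` and `consumeObject` are hypothetical names for illustration, not the actual Elasticsearch parsing code, and throwing on null is an assumption about how the check exits the loop:

```java
import java.util.Iterator;
import java.util.List;

final class AliasParsingSketch {
    enum Token { FIELD_NAME, VALUE, END_OBJECT }

    /** Hypothetical parser: yields the given tokens, then null forever once input is exhausted. */
    static final class TokenStream {
        private final Iterator<Token> it;

        TokenStream(List<Token> tokens) {
            this.it = tokens.iterator();
        }

        Token nextToken() {
            return it.hasNext() ? it.next() : null;
        }
    }

    /** Counts the tokens seen before END_OBJECT. */
    static int consumeObject(TokenStream parser) {
        int consumed = 0;
        Token token;
        while ((token = parser.nextToken()) != Token.END_OBJECT) {
            // Without this check, a stream that ends before END_OBJECT keeps
            // returning null, null != END_OBJECT stays true, and the loop never exits.
            if (token == null) {
                throw new IllegalArgumentException("unexpected end of input while parsing alias");
            }
            consumed++;
        }
        return consumed;
    }

    public static void main(String[] args) {
        TokenStream wellFormed = new TokenStream(List.of(Token.FIELD_NAME, Token.VALUE, Token.END_OBJECT));
        System.out.println(consumeObject(wellFormed));
    }
}
```

The point of the sketch is only that the loop condition alone cannot terminate on truncated input, so the null case has to be handled explicitly.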
* Semantic reranking should fail whenever inference ID does not exist
* Short circuit text similarity reranking on empty result set
* Update tests
* Remove test - it doesn't do anything useful
* Update docs/changelog/112038.yaml
* No error for store_array_source in standard mode
* Update docs/changelog/111966.yaml
* nested object test
* restore noop tests
* spotless fix
(cherry picked from commit 9ab8665235)
#111943 unveiled a bug in `collectChildren` where we attempt to collect
the previous doc of the parent, even when the parent doc has no previous
doc.
Fixes #111990, #111991, #111992, #111993
* Explain Function Score Query (#111807)
Allows a custom explanation to be passed through, to support building a plugin with a custom script score; previously this threw an NPE.
* updated test for 8.15.1
This change ensures that we don't try to compute stats on mappings that don't have dense or sparse vector fields. We don't need to go through all the fields on every segment; instead we can extract the vector fields upfront and limit the work to only the indices that define these types.
Closes #111715
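A rough sketch of the idea, not the actual stats code: the mapping is modelled as a hypothetical field-name-to-type map and each segment as a map of per-field value counts. All names here (`VectorStatsSketch`, `vectorFields`, `countVectorValues`) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class VectorStatsSketch {
    static final Set<String> VECTOR_TYPES = Set.of("dense_vector", "sparse_vector");

    /** Extracts, once per index, the names of fields mapped as a vector type (sorted for determinism). */
    static List<String> vectorFields(Map<String, String> mapping) {
        List<String> fields = new ArrayList<>();
        for (Map.Entry<String, String> entry : mapping.entrySet()) {
            if (VECTOR_TYPES.contains(entry.getValue())) {
                fields.add(entry.getKey());
            }
        }
        Collections.sort(fields);
        return fields;
    }

    /** Sums value counts over segments, visiting only the vector fields found upfront. */
    static long countVectorValues(Map<String, String> mapping, List<Map<String, Integer>> segments) {
        List<String> fields = vectorFields(mapping);
        if (fields.isEmpty()) {
            return 0L; // no vector fields in the mapping: skip all per-segment work
        }
        long total = 0;
        for (Map<String, Integer> segment : segments) {
            for (String field : fields) {
                total += segment.getOrDefault(field, 0);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> mapping = Map.of("title", "text", "emb", "dense_vector");
        List<Map<String, Integer>> segments = List.of(Map.of("emb", 3, "title", 10), Map.of("emb", 2));
        System.out.println(countVectorValues(mapping, segments));
    }
}
```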
* Fix NullPointerException when doing knn search on empty index without dims (#111756)
* Fix NullPointerException when doing knn search on empty index without dims
* Update docs/changelog/111756.yaml
* Fix typo in yaml test
---------
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
(cherry picked from commit 4e26114764)
# Conflicts:
# rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/40_knn_search.yml
* Update 40_knn_search.yml
* Yaml
This reverts #110261 which we can't land until #111757 - we need to be
sure that the `equals` implementations on subclasses of
`InternalAggregations` are correct before this optimization is safe.
Closes #111679
We need to wait a little longer to deal with the case that closing the
`PeerFinder` on the master triggers a disconnect, removing the third
node from the cluster, and requiring another round of discovery to
recover.
Closes #111155
* Ensure vector similarity correctly limits inner_hits returned for nested kNN (#111363)
For nested kNN we support not only similarity thresholds, but also
multi-passage search while retrieving more than one nearest passage.
However, the inner_hits retrieved for the kNN search would ignore the
restricted similarity. As a result, the inner hits would return all
passages, not just the ones within the limited similarity, which is
confusing.
closes: https://github.com/elastic/elasticsearch/issues/111093
(cherry picked from commit 69c96974de)
* fixing for backport
* adj for backport
* fix compilation for tests
* Replace model_id with inference_id in inference API except when storing ModelConfigs
* Update docs/changelog/111366.yaml
* replace missed literals in tests
Make it clear that this API should be used only if the detailed shard
info is needed and only on ongoing snapshots. Remove incorrectly
mentioned `STATE` value.
* Inject `host.name` field without relying on (component) templates (#110938)
We do not want to rely on templates or component templates to include
the host.name field in indices using LogsDB. The host.name field is a field
we sort on by default when LogsDB is used. As a result, we just inject it
by default, the same way we do for the @timestamp field. This prevents
sorting errors due to missing host.name field in mappings.
The host.name field is a keyword field and, depending on the value of subobjects, it will
be mapped either as a name keyword nested inside a host object or as a flat host.name keyword.
We also include ignore_above as we normally do for keywords in observability mappings.
* Enable missing hostname test
Non-master-eligible nodes that are already part of a cluster when the master is upgraded don't re-join the cluster, so their cluster features never get updated. This adds a cluster listener that spots this occurring, and manually gets the node's features with a new transport action and updates the cluster state after the fact.
Native preallocation has several issues, introduced in a refactoring for
8.13. First, the native allocator is never even tried; it always decides
to fall back to the Java setLength method. Second, the stat method did
not work correctly on all systems, see #110807. This commit fixes
native preallocation to properly execute on Linux as well as macOS. It
also adds direct tests of preallocation.
Note that this is meant as a bugfix for 8.15, so as minimal a change as
possible is made here. The code has completely changed in main. Some
things like the new test and fixes for macOS will be forward ported to
main, but I did not want to make larger changes in a bugfix.
So that the only expected disk write at the point of the assertion is from the bulk request, and not from the asynchronous runnable of updateDanglingIndicesInfo().
Fixes #110551
Min/max range for the event.ingested timestamp field (part of Elastic Common
Schema) was added to IndexMetadata in cluster state for searchable snapshots
in #106252.
This commit modifies the search coordinator to rewrite searches to MatchNone
if the query searches a range of event.ingested that, from the min/max range
in cluster state, is known to not overlap. This is the same behavior we currently
have for the @timestamp field.
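The overlap check behind this rewrite can be sketched as follows. This is an illustration only: `RangeRewriteSketch` and its `rewrite` method are hypothetical names, bounds are assumed inclusive epoch-millisecond values, and the real coordinator logic is not shown in this message.

```java
final class RangeRewriteSketch {
    enum Rewrite { MATCH_NONE, KEEP_QUERY }

    /** Two inclusive ranges overlap when each one starts no later than the other ends. */
    static boolean overlaps(long queryMin, long queryMax, long indexMin, long indexMax) {
        return queryMin <= indexMax && indexMin <= queryMax;
    }

    /**
     * Decides whether a range query on a timestamp field can be rewritten,
     * given the min/max range recorded for the index in cluster state.
     */
    static Rewrite rewrite(long queryMin, long queryMax, long indexMin, long indexMax) {
        return overlaps(queryMin, queryMax, indexMin, indexMax) ? Rewrite.KEEP_QUERY : Rewrite.MATCH_NONE;
    }

    public static void main(String[] args) {
        // Query range entirely before the index's recorded range: no document can match.
        System.out.println(rewrite(0L, 10L, 20L, 30L));
        // Query range intersects the recorded range: the query must still run.
        System.out.println(rewrite(0L, 25L, 20L, 30L));
    }
}
```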
closes https://github.com/elastic/elasticsearch/issues/110357
With the loosening of what is considered a unit vector, we need to
ensure we only normalize for equality checking if the query vector is
indeed not a unit vector.
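A small sketch of that guard, under stated assumptions: the tolerance `EPS` is a made-up value (the actual threshold is not given here), and `UnitVectorSketch` and `normalizeIfNeeded` are hypothetical names rather than the real implementation.

```java
final class UnitVectorSketch {
    // Assumed tolerance for treating a vector as unit length; not taken from the source.
    static final float EPS = 1e-4f;

    static float squaredMagnitude(float[] v) {
        float sum = 0f;
        for (float x : v) {
            sum += x * x;
        }
        return sum;
    }

    static boolean isUnit(float[] v) {
        return Math.abs(squaredMagnitude(v) - 1f) <= EPS;
    }

    /** Returns the same array if it is already unit length within EPS, else a normalized copy. */
    static float[] normalizeIfNeeded(float[] v) {
        float sq = squaredMagnitude(v);
        if (Math.abs(sq - 1f) <= EPS || sq == 0f) {
            return v; // already a unit vector, or a zero vector that cannot be normalized
        }
        float norm = (float) Math.sqrt(sq);
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / norm;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] normalized = normalizeIfNeeded(new float[] { 3f, 4f });
        System.out.println(normalized[0] + ", " + normalized[1]);
    }
}
```

Returning the original array untouched when it is already within tolerance means equality checks compare the caller's exact values instead of a re-normalized copy.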
(cherry picked from commit fd790ff351)
* Initial commit; setup Gradle; start service
* initial commit
* minor cleanups, builds green; needs tests
* bug fixes; tested working embeddings & completion
* use custom json builder for embeddings request
* Ensure auto-close; fix forbidden API
* start of adding unit tests; abstraction layers
* adding additional tests; cleanups
* add requests unit tests
* all tests created
* fix cohere embeddings response
* fix cohere embeddings response
* fix lint
* better test coverage for secrets; inference client
* update thread-safe syncs; make dims/tokens + int
* add tests for dims and max tokens positive integer
* use requireNonNull;override settings type;cleanups
* use r/w lock for client cache
* remove client reference counting
* update locking in cache; client errors; noop doc
* remove extra block in internalGetOrCreateClient
* remove duplicate dependencies; cleanup
* add fxn to get default embeddings similarity
* use async calls to Amazon Bedrock; cleanups
* use Clock in cache; simplify locking; cleanups
* cleanups around executor; remove some instanceof
* cleanups; use EmbeddingRequestChunker
* move max chunk size to constants
* oof - swapped transport vers w/ master node req
* use XContent instead of Jackson JsonFactory
* remove gradle versions; do not allow dimensions