Support k parameter for knn query (#110233)

Introduce an optional k param for knn query

If k is not set, knn query has the previous behaviour:
- `num_candidates` docs  is collected from each shard. This `num_candidates` docs
are used for combining with results with other queries and aggregations on each shard.
- docs from all shards are merged to produce the top global `size` results

If k is set, the behaviour instead is following:
- `k` docs is collected from each shard. This `k` docs are used for
combining results with other queries and aggregations on each shard.
- similarly, docs from all shards are merged to produce the top global `size`
results.

Having `k` param makes it more intuitive for users to address their needs.
They also don't need to care and can skip `num_candidates` param for this query
as it is of more internal details to tune how knn search operates.

Closes #108473
This commit is contained in:
Mayya Sharipova 2024-06-28 09:59:28 -04:00 committed by GitHub
parent 9938c4adfd
commit 405e39660b
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
29 changed files with 523 additions and 116 deletions

View file

@ -431,6 +431,7 @@ module org.elasticsearch.server {
org.elasticsearch.indices.IndicesFeatures,
org.elasticsearch.action.admin.cluster.allocation.AllocationStatsFeatures,
org.elasticsearch.index.mapper.MapperFeatures,
org.elasticsearch.search.SearchFeatures,
org.elasticsearch.script.ScriptFeatures,
org.elasticsearch.search.retriever.RetrieversFeatures,
org.elasticsearch.reservedstate.service.FileSettingsFeatures;