Adding hamming distance function to painless for dense_vector fields (#109359)

This adds `hamming` distance, the pop-count of the `xor` of two byte
vectors, as a first-class citizen in painless.

For byte vectors, this means that we can compute hamming distances via
`script_score` (i.e., brute force).
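
As an illustration of the pop-count-of-`xor` definition above, here is a
minimal standalone sketch. The class name `HammingSketch` is hypothetical,
and this is not the code added by this commit or the Lucene implementation;
it only shows the arithmetic on plain `byte[]` values.

```java
public final class HammingSketch {
    private HammingSketch() {}

    /** Counts the bit positions at which two equal-length byte vectors differ. */
    public static int hamming(byte[] a, byte[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException(
                "vector dimensions differ: " + a.length + " != " + b.length);
        }
        int distance = 0;
        for (int i = 0; i < a.length; i++) {
            // XOR leaves a 1 bit wherever the inputs differ. The mask keeps only
            // the low 8 bits so that sign extension of negative bytes is not counted.
            distance += Integer.bitCount((a[i] ^ b[i]) & 0xFF);
        }
        return distance;
    }

    public static void main(String[] args) {
        byte[] query = {0b0000_1111, (byte) 0xFF};
        byte[] doc = {0b0000_0101, (byte) 0x0F};
        // 0x0F ^ 0x05 = 0x0A (2 bits), 0xFF ^ 0x0F = 0xF0 (4 bits)
        System.out.println(hamming(query, doc)); // prints 6
    }
}
```

Smaller values mean more similar vectors, so a score script would typically
invert or subtract the distance before using it as a relevance score.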

The implementation of `hamming` is the same as the one available in
Lucene. Once Lucene 9.11 is merged, we should update our logic to use
the Lucene implementation where applicable.

NOTE: this does not yet add hamming distance as a metric for indexed
vectors. This will be a future PR after the Lucene 9.11 upgrade.
Benjamin Trent 2024-06-17 13:41:20 -04:00 committed by GitHub
parent bbcf73028e
commit acc99302c6
18 changed files with 438 additions and 7 deletions

@@ -431,6 +431,7 @@ module org.elasticsearch.server {
org.elasticsearch.indices.IndicesFeatures,
org.elasticsearch.action.admin.cluster.allocation.AllocationStatsFeatures,
org.elasticsearch.index.mapper.MapperFeatures,
org.elasticsearch.script.ScriptFeatures,
org.elasticsearch.search.retriever.RetrieversFeatures,
org.elasticsearch.reservedstate.service.FileSettingsFeatures;