mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-04-24 15:17:30 -04:00
`cosine` is our default similarity, so it should be fast out of the box. `dot_product` is faster than `cosine` because it doesn't need to calculate vector magnitudes inside the similarity comparison loop. Instead, it can assume vectors have a magnitude of `1` and use an optimized `dot_product` calculation. However, `cosine` as it exists today accepts vectors of any magnitude and cannot take advantage of this.

This commit addresses this by:

- Normalizing all vectors passed when indexing via `cosine`
- Storing the calculated magnitude in an additional field (only if it is `!= 1`)
- Using the `dot_product` Lucene calculation
- Normalizing query vectors when used against these new `cosine` fields
- De-normalizing vectors when accessed via scripts
- Allowing scripts to access these stored magnitudes
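The following is a minimal, standalone sketch of the idea behind this change, not the code from the commit. The class name `CosineNormalizationSketch` and all of its helper methods are hypothetical and exist only for illustration. It shows that normalizing both vectors up front lets a plain dot product stand in for cosine, and that keeping the original magnitude allows the original vector to be recovered. Rejecting zero-magnitude vectors is an assumption of the sketch, made because cosine is undefined for them.

```java
public final class CosineNormalizationSketch {

    /** Euclidean magnitude (L2 norm) of a vector. */
    public static float magnitude(float[] v) {
        double sum = 0.0;
        for (float x : v) {
            sum += (double) x * x;
        }
        return (float) Math.sqrt(sum);
    }

    /** Returns a copy of v scaled to magnitude 1. */
    public static float[] normalize(float[] v) {
        float mag = magnitude(v);
        if (mag == 0f) {
            // Assumption of this sketch: cosine is undefined for zero vectors.
            throw new IllegalArgumentException("zero-magnitude vector");
        }
        float[] out = new float[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i] / mag;
        }
        return out;
    }

    /** Recovers the original vector from its unit form and stored magnitude. */
    public static float[] denormalize(float[] unit, float storedMagnitude) {
        float[] out = new float[unit.length];
        for (int i = 0; i < unit.length; i++) {
            out[i] = unit[i] * storedMagnitude;
        }
        return out;
    }

    /** Plain dot product; equals cosine when both inputs are unit vectors. */
    public static float dotProduct(float[] a, float[] b) {
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Classic cosine: computes both magnitudes on every comparison. */
    public static float cosine(float[] a, float[] b) {
        return dotProduct(a, b) / (magnitude(a) * magnitude(b));
    }

    public static void main(String[] args) {
        float[] doc = {3f, 4f};
        float[] query = {1f, 0f};

        // "Index time": normalize once and keep the magnitude alongside.
        float storedMagnitude = magnitude(doc);
        float[] storedUnit = normalize(doc);

        // "Query time": normalize the query, then use the cheaper dot product.
        float fast = dotProduct(storedUnit, normalize(query));
        float slow = cosine(doc, query);

        System.out.println("dot product on unit vectors: " + fast);
        System.out.println("classic cosine:              " + slow);
        System.out.println("recovered first component:   "
            + denormalize(storedUnit, storedMagnitude)[0]);
    }
}
```

The cost of the square roots moves from every comparison to a single pass per vector at index time and per query, which is where the speedup described above comes from.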
src/test
build.gradle