elasticsearch/docs/reference/mapping/params/similarity.asciidoc
Julie Tibshirani 075d08eb64
Update dense_vector docs with kNN indexing options (#80306)
This commit updates the `dense_vector` docs to include information on the new
`index`, `similarity`, and `index_options` parameters. It also tries to clarify
the difference between `similarity` and `index_options` with the existing
parameters that have the same name.

Relates to #78473.
2021-11-04 11:44:13 -07:00

[[similarity]]
=== `similarity`
{es} allows you to configure a text scoring algorithm, or _similarity_,
per field. The `similarity` setting provides a simple way of choosing a
text similarity algorithm other than the default `BM25`, such as `boolean`.

Only text-based field types like <<text,`text`>> and <<keyword,`keyword`>>
support this configuration.

Custom similarities can be configured by tuning the parameters of the built-in
similarities. For more details about these expert options, see the
<<index-modules-similarity,similarity module>>.

The only similarities that can be used out of the box, without any further
configuration, are:

`BM25`::
The {wikipedia}/Okapi_BM25[Okapi BM25 algorithm]. This is the
algorithm used by default in {es} and Lucene.

`boolean`::
A simple boolean similarity, which is used when full-text ranking is not needed
and the score should be based only on whether the query terms match.
Boolean similarity gives terms a score equal to their query boost.

The `similarity` can be set at the field level when a field is first created,
as follows:

[source,console]
--------------------------------------------------
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "default_field": { <1>
        "type": "text"
      },
      "boolean_sim_field": {
        "type": "text",
        "similarity": "boolean" <2>
      }
    }
  }
}
--------------------------------------------------
<1> The `default_field` uses the `BM25` similarity.
<2> The `boolean_sim_field` uses the `boolean` similarity.
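
A custom similarity, as mentioned earlier, is defined in the index settings and
then referenced by name from the `similarity` mapping parameter. The following
is a minimal sketch rather than a recommended configuration: the index name
`my-index-000002`, the similarity name `my_bm25`, and the field name
`tuned_field` are hypothetical, and the `k1` and `b` values are illustrative
only. See the <<index-modules-similarity,similarity module>> for the supported
types and parameters.

[source,console]
--------------------------------------------------
PUT my-index-000002
{
  "settings": {
    "index": {
      "similarity": {
        "my_bm25": { <1>
          "type": "BM25",
          "k1": 1.3, <2>
          "b": 0.6
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "tuned_field": {
        "type": "text",
        "similarity": "my_bm25" <3>
      }
    }
  }
}
--------------------------------------------------
<1> `my_bm25` is a hypothetical name for a custom similarity based on the built-in `BM25` type.
<2> Illustrative values for the `k1` and `b` parameters; tune them for your own data.
<3> The hypothetical `tuned_field` refers to the custom similarity by name.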