mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-06-29 01:44:36 -04:00
[[query-dsl-text-expansion-query]]
== Text expansion query
++++
<titleabbrev>Text expansion</titleabbrev>
++++

The text expansion query uses a {nlp} model to convert the query text into a
list of token-weight pairs, which are then used in a query against a
<<sparse-vector,sparse vector>> or <<rank-features,rank features>> field.

[discrete]
[[text-expansion-query-ex-request]]
=== Example request

[source,console]
----
GET _search
{
   "query":{
      "text_expansion":{
         "<sparse_vector_field>":{
            "model_id":"the model to produce the token weights",
            "model_text":"the query string"
         }
      }
   }
}
----
// TEST[skip: TBD]

[discrete]
[[text-expansion-query-params]]
=== Top level parameters for `text_expansion`

`<sparse_vector_field>`:::
(Required, object)
The name of the field that contains the token-weight pairs the NLP model created
based on the input text.

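As a minimal sketch, the mapping below creates a field that a `text_expansion`
query could target. The index name `my-index` and the field name `ml.tokens` are
assumptions borrowed from the example later on this page. Depending on your {es}
version and how the tokens were ingested, the field type may need to be
`rank_features` instead of `sparse_vector`.

[source,console]
----
PUT my-index
{
   "mappings":{
      "properties":{
         "ml.tokens":{
            "type":"sparse_vector"
         }
      }
   }
}
----
// TEST[skip: TBD]
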
[discrete]
[[text-expansion-rank-feature-field-params]]
=== Top level parameters for `<sparse_vector_field>`

`model_id`::::
(Required, string)
The ID of the model to use to convert the query text into token-weight pairs. It
must be the same model ID that was used to create the tokens from the input
text.

`model_text`::::
(Required, string)
The query text you want to use for search.

[discrete]
[[text-expansion-query-example]]
=== Example

The following is an example of the `text_expansion` query that references the
ELSER model to perform semantic search. For a more detailed description of how
to perform semantic search by using ELSER and the `text_expansion` query, refer
to <<semantic-search-elser,this tutorial>>.

[source,console]
----
GET my-index/_search
{
   "query":{
      "text_expansion":{
         "ml.tokens":{
            "model_id":".elser_model_1",
            "model_text":"How is the weather in Jamaica?"
         }
      }
   }
}
----
// TEST[skip: TBD]

[discrete]
[[optimizing-text-expansion]]
=== Optimizing the search performance of the `text_expansion` query

https://www.elastic.co/blog/faster-retrieval-of-top-hits-in-elasticsearch-with-block-max-wand[Max WAND]
is an optimization technique used by {es} to skip documents that cannot score
competitively against the current best matching documents. However, the tokens
generated by the ELSER model don't work well with the Max WAND optimization.
Consequently, enabling Max WAND can actually increase query latency for
`text_expansion` queries. For large datasets, disabling Max WAND leads to lower
query latencies.

Max WAND is controlled by the
<<track-total-hits, track_total_hits>> query parameter. Setting
`track_total_hits` to `true` forces {es} to consider all documents, which
disables Max WAND and results in lower query latencies for the `text_expansion`
query. However, other {es} queries run slower when Max WAND is disabled.

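For illustration, the request below sets `track_total_hits` to `true` alongside
a `text_expansion` query. It reuses the index, field, and model names from the
example above, which are placeholders for your own. Whether this setting lowers
latency depends on your data, so treat it as a starting point for measurement
rather than a recommended default.

[source,console]
----
GET my-index/_search
{
   "track_total_hits":true,
   "query":{
      "text_expansion":{
         "ml.tokens":{
            "model_id":".elser_model_1",
            "model_text":"How is the weather in Jamaica?"
         }
      }
   }
}
----
// TEST[skip: TBD]
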
If you combine the `text_expansion` query with standard text queries in a
compound search, measure the query performance with each setting before
deciding which one to use.

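A compound search of this kind might look like the following sketch, which
places a `text_expansion` clause and a `match` clause in the same `bool` query.
The `text` field and its query string are hypothetical; the other names come
from the example above. One way to compare the settings is to run the request
once as shown and once without `track_total_hits`, then compare the `took`
values in the responses.

[source,console]
----
GET my-index/_search
{
   "track_total_hits":true,
   "query":{
      "bool":{
         "should":[
            {
               "text_expansion":{
                  "ml.tokens":{
                     "model_id":".elser_model_1",
                     "model_text":"How is the weather in Jamaica?"
                  }
               }
            },
            {
               "match":{
                  "text":"weather Jamaica"
               }
            }
         ]
      }
   }
}
----
// TEST[skip: TBD]
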
NOTE: The `track_total_hits` option applies to all queries in the search request
and may be optimal for some queries but not for others. Take into account the
characteristics of all your queries to determine the most suitable
configuration.