elasticsearch/docs/reference/query-dsl/semantic-query.asciidoc
Carlos Delgado 4d3f9f2fb9
Fix RRF example for semantic query (#109516)
Follow up to https://github.com/elastic/elasticsearch/pull/109433, fix
appropriately this time the semantic query example with RRF.
2024-06-10 17:59:13 +10:00

191 lines
5.2 KiB
Text

[[query-dsl-semantic-query]]
=== Semantic query
++++
<titleabbrev>Semantic</titleabbrev>
++++
beta[]
The `semantic` query type enables you to perform <<semantic-search,semantic search>> on data stored in a <<semantic-text,`semantic_text`>> field.
[discrete]
[[semantic-query-example]]
==== Example request
[source,console]
------------------------------------------------------------
GET my-index-000001/_search
{
"query": {
"semantic": {
"field": "inference_field",
"query": "Best surfing places"
}
}
}
------------------------------------------------------------
// TEST[skip:TBD]
[discrete]
[[semantic-query-params]]
==== Top-level parameters for `semantic`
field::
(Required, string)
The `semantic_text` field to perform the query on.
query::
(Required, string)
The query text to be searched for on the field.
Refer to <<semantic-search-semantic-text,this tutorial>> to learn more about semantic search using `semantic_text` and `semantic` query.
[discrete]
[[hybrid-search-semantic]]
==== Hybrid search with the `semantic` query
The `semantic` query can be used as a part of a hybrid search where the `semantic` query is combined with lexical queries.
For example, the query below finds documents with the `title` field matching "mountain lake", and combines them with results from a semantic search on the field `title_semantic`, that is a `semantic_text` field.
The combined documents are then scored, and the top 3 top scored documents are returned.
[source,console]
------------------------------------------------------------
POST my-index/_search
{
"size" : 3,
"query": {
"bool": {
"should": [
{
"match": {
"title": {
"query": "mountain lake",
"boost": 1
}
}
},
{
"semantic": {
"field": "title_semantic",
"query": "mountain lake",
"boost": 2
}
}
]
}
}
}
------------------------------------------------------------
// TEST[skip:TBD]
You can also use semantic_text as part of <<rrf,Reciprocal Rank Fusion>> to make ranking relevant results easier:
[source,console]
------------------------------------------------------------
GET my-index/_search
{
"retriever": {
"rrf": {
"retrievers": [
{
"standard": {
"query": {
"term": {
"text": "shoes"
}
}
}
},
{
"standard": {
"query": {
"semantic": {
"field": "semantic_field",
"query": "shoes"
}
}
}
}
],
"rank_window_size": 50,
"rank_constant": 20
}
}
}
------------------------------------------------------------
// TEST[skip:TBD]
[discrete]
[[advanced-search]]
=== Advanced search on `semantic_text` fields
The `semantic` query uses default settings for searching on `semantic_text` fields for ease of use.
If you want to fine-tune a search on a `semantic_text` field, you need to know the task type used by the `inference_id` configured in `semantic_text`.
You can find the task type using the <<get-inference-api>>, and check the `task_type` associated with the {infer} service.
Depending on the `task_type`, use either the <<query-dsl-sparse-vector-query,`sparse_vector`>> or the <<query-dsl-knn-query,`knn`>> query for greater flexibility and customization.
[discrete]
[[search-sparse-inference]]
==== Search with `sparse_embedding` inference
When the {infer} endpoint uses a `sparse_embedding` model, you can use a <<query-dsl-sparse-vector-query,`sparse_vector` query>> on a <<semantic-text,`semantic_text`>> field in the following way:
[source,console]
------------------------------------------------------------
GET test-index/_search
{
"query": {
"nested": {
"path": "inference_field.inference.chunks",
"query": {
"sparse_vector": {
"field": "inference_field.inference.chunks.embeddings",
"inference_id": "my-inference-id",
"query": "mountain lake"
}
}
}
}
}
------------------------------------------------------------
// TEST[skip:TBD]
You can customize the `sparse_vector` query to include specific settings, like <<sparse-vector-query-with-pruning-config-and-rescore-example,pruning configuration>>.
[discrete]
[[search-text-inferece]]
==== Search with `text_embedding` inference
When the {infer} endpoint uses a `text_embedding` model, you can use a <<query-dsl-knn-query,`knn` query>> on a `semantic_text` field in the following way:
[source,console]
------------------------------------------------------------
GET test-index/_search
{
"query": {
"nested": {
"path": "inference_field.inference.chunks",
"query": {
"knn": {
"field": "inference_field.inference.chunks.embeddings",
"query_vector_builder": {
"text_embedding": {
"model_id": "my_inference_id",
"model_text": "mountain lake"
}
}
}
}
}
}
}
------------------------------------------------------------
// TEST[skip:TBD]
You can customize the `knn` query to include specific settings, like `num_candidates` and `k`.