mirror of
https://github.com/elastic/elasticsearch.git
synced 2025-06-28 09:28:55 -04:00
Add documentation for query rules retriever (#115696)
* Add initial query rules retriever docs
* Add docs tests
* Apply suggestions from code review
  Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
* PR feedback
* Make query rules guide retriever-first
* Add warning to DSL doc
* Update docs/reference/search/retriever.asciidoc
  Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>
* Update docs/reference/search/retriever.asciidoc
  Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>
* Apply suggestions from code review
  Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>
* Give parameters subheading an explicit id
* Fix formatting

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>
parent 8db918110c
commit 14a7b8fe67
4 changed files with 303 additions and 75 deletions
|
@ -12,6 +12,12 @@
|
|||
The old syntax using `rule_query` and `ruleset_id` is deprecated and will be removed in a future release, so it is strongly advised to migrate existing rule queries to the new API structure.
|
||||
====
|
||||
|
||||
[TIP]
|
||||
====
|
||||
The rule query is not supported for use alongside reranking.
|
||||
If you want to use query rules in conjunction with reranking, use the <<rule-retriever, rule retriever>> instead.
|
||||
====
|
||||
|
||||
Applies <<query-rules-apis,query rules>> to the query before returning results.
|
||||
Query rules can be used to promote documents in the manner of a <<query-dsl-pinned-query>> based on matching defined rules, or to identify specific documents to exclude from a contextual result set.
|
||||
If no matching query rules are defined, the "organic" matches for the query are returned.
|
||||
|
|
|
@ -1,14 +1,12 @@
|
|||
[[retriever]]
|
||||
=== Retriever
|
||||
|
||||
A retriever is a specification to describe top documents returned from a
|
||||
search. A retriever replaces other elements of the <<search-search, search API>>
|
||||
A retriever is a specification to describe top documents returned from a search.
|
||||
A retriever replaces other elements of the <<search-search, search API>>
|
||||
that also return top documents such as <<query-dsl, `query`>> and
|
||||
<<search-api-knn, `knn`>>. A retriever may have child retrievers where a
|
||||
retriever with two or more children is considered a compound retriever. This
|
||||
allows for complex behavior to be depicted in a tree-like structure, called
|
||||
the retriever tree, to better clarify the order of operations that occur
|
||||
during a search.
|
||||
<<search-api-knn, `knn`>>.
|
||||
A retriever may have child retrievers; a retriever with two or more children is considered a compound retriever.
|
||||
This allows for complex behavior to be depicted in a tree-like structure, called the retriever tree, which clarifies the order of operations that occur during a search.
|
||||
|
||||
[TIP]
|
||||
====
|
||||
|
@ -29,6 +27,9 @@ A <<rrf-retriever, retriever>> that produces top documents from <<rrf, reciproca
|
|||
`text_similarity_reranker`::
|
||||
A <<text-similarity-reranker-retriever, retriever>> that enhances search results by re-ranking documents based on semantic similarity to a specified inference text, using a machine learning model.
|
||||
|
||||
`rule`::
|
||||
A <<rule-retriever, retriever>> that applies contextual <<query-rules>> to pin or exclude documents for specific queries.
|
||||
|
||||
[[standard-retriever]]
|
||||
==== Standard Retriever
|
||||
|
||||
|
@ -44,8 +45,7 @@ Defines a query to retrieve a set of top documents.
|
|||
`filter`::
|
||||
(Optional, <<query-dsl, query object or list of query objects>>)
|
||||
+
|
||||
Applies a <<query-dsl-bool-query, boolean query filter>> to this retriever
|
||||
where all documents must match this query but do not contribute to the score.
|
||||
Applies a <<query-dsl-bool-query, boolean query filter>> to this retriever, where all documents must match this query but do not contribute to the score.
|
||||
|
||||
`search_after`::
|
||||
(Optional, <<search-after, search after object>>)
|
||||
|
@ -56,14 +56,13 @@ include::{es-ref-dir}/rest-api/common-parms.asciidoc[tag=terminate_after]
|
|||
|
||||
`sort`::
|
||||
+
|
||||
(Optional, <<sort-search-results, sort object>>)
|
||||
A sort object that that specifies the order of matching documents.
|
||||
(Optional, <<sort-search-results, sort object>>) A sort object that specifies the order of matching documents.
|
||||
|
||||
`min_score`::
|
||||
(Optional, `float`)
|
||||
+
|
||||
Minimum <<relevance-scores, `_score`>> for matching documents. Documents with a
|
||||
lower `_score` are not included in the top documents.
|
||||
Minimum <<relevance-scores, `_score`>> for matching documents.
|
||||
Documents with a lower `_score` are not included in the top documents.
|
||||
|
||||
`collapse`::
|
||||
(Optional, <<collapse-search-results, collapse object>>)
|
||||
|
@ -72,8 +71,7 @@ Collapses the top documents by a specified key into a single top document per ke
|
|||
|
||||
===== Restrictions
|
||||
|
||||
When a retriever tree contains a compound retriever (a retriever with two or more child
|
||||
retrievers) the <<search-after, search after>> parameter is not supported.
|
||||
When a retriever tree contains a compound retriever (a retriever with two or more child retrievers) the <<search-after, search after>> parameter is not supported.
|
||||
|
||||
[discrete]
|
||||
[[standard-retriever-example]]
|
||||
|
@ -105,12 +103,39 @@ POST /restaurants/_bulk?refresh
|
|||
{"region": "Austria", "year": "2020", "vector": [10, 22, 79]}
|
||||
{"index":{}}
|
||||
{"region": "France", "year": "2020", "vector": [10, 22, 80]}
|
||||
|
||||
PUT /movies
|
||||
|
||||
PUT _query_rules/my-ruleset
|
||||
{
|
||||
"rules": [
|
||||
{
|
||||
"rule_id": "my-rule1",
|
||||
"type": "pinned",
|
||||
"criteria": [
|
||||
{
|
||||
"type": "exact",
|
||||
"metadata": "query_string",
|
||||
"values": [ "pugs" ]
|
||||
}
|
||||
],
|
||||
"actions": {
|
||||
"ids": [
|
||||
"id1"
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
----
|
||||
// TESTSETUP
|
||||
|
||||
[source,console]
|
||||
--------------------------------------------------
|
||||
DELETE /restaurants
|
||||
|
||||
DELETE /movies
|
||||
--------------------------------------------------
|
||||
// TEARDOWN
|
||||
////
|
||||
|
@ -143,11 +168,13 @@ GET /restaurants/_search
|
|||
}
|
||||
}
|
||||
----
|
||||
|
||||
<1> Opens the `retriever` object.
|
||||
<2> The `standard` retriever is used for defining traditional {es} queries.
|
||||
<3> The entry point for defining the search query.
|
||||
<4> The `bool` object allows for combining multiple query clauses logically.
|
||||
<5> The `should` array indicates conditions under which a document will match. Documents matching these conditions will increase their relevancy score.
|
||||
<5> The `should` array indicates conditions under which a document will match.
|
||||
Documents matching these conditions will have increased relevancy scores.
|
||||
<6> The `match` object finds documents where the `region` field contains the word "Austria."
|
||||
<7> The `filter` array provides filtering conditions that must be met but do not contribute to the relevancy score.
|
||||
<8> The `term` object is used for exact matches, in this case, filtering documents by the `year` field.
|
||||
|
@ -178,8 +205,8 @@ Defines a <<knn-semantic-search, model>> to build a query vector.
|
|||
`k`::
|
||||
(Required, integer)
|
||||
+
|
||||
Number of nearest neighbors to return as top hits. This value must be fewer than
|
||||
or equal to `num_candidates`.
|
||||
Number of nearest neighbors to return as top hits.
|
||||
This value must be fewer than or equal to `num_candidates`.
|
||||
|
||||
`num_candidates`::
|
||||
(Required, integer)
|
||||
|
@ -222,16 +249,15 @@ GET /restaurants/_search
|
|||
<1> Configuration for k-nearest neighbor (knn) search, which is based on vector similarity.
|
||||
<2> Specifies the field name that contains the vectors.
|
||||
<3> The query vector against which document vectors are compared in the `knn` search.
|
||||
<4> The number of nearest neighbors to return as top hits. This value must be fewer than or equal to `num_candidates`.
|
||||
<4> The number of nearest neighbors to return as top hits.
|
||||
This value must be fewer than or equal to `num_candidates`.
|
||||
<5> The size of the initial candidate set from which the final `k` nearest neighbors are selected.
|
||||
|
||||
[[rrf-retriever]]
|
||||
==== RRF Retriever
|
||||
|
||||
An <<rrf, RRF>> retriever returns top documents based on the RRF formula,
|
||||
equally weighting two or more child retrievers.
|
||||
Reciprocal rank fusion (RRF) is a method for combining multiple result
|
||||
sets with different relevance indicators into a single result set.
|
||||
An <<rrf, RRF>> retriever returns top documents based on the RRF formula, equally weighting two or more child retrievers.
|
||||
Reciprocal rank fusion (RRF) is a method for combining multiple result sets with different relevance indicators into a single result set.
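
To make the fusion step concrete, here is a minimal Python sketch of reciprocal rank fusion over lists of document IDs. It is an illustration only, not the {es} implementation: the function name `rrf_fuse`, the sample IDs, and the tie-breaking rule are assumptions, and the rank constant of 60 is assumed to mirror the retriever's default.

```python
from collections import defaultdict


def rrf_fuse(result_lists, rank_constant=60):
    """Fuse ranked lists of document IDs with reciprocal rank fusion.

    Each list contributes 1 / (rank_constant + rank) to a document's score,
    where rank starts at 1 for the first hit in that list.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (rank_constant + rank)
    # Highest fused score first; ties are broken by document ID (an assumption
    # made here only to keep the output deterministic).
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result sets from two child retrievers.
lexical = ["a", "b", "c"]
vector = ["c", "a", "d"]
fused = rrf_fuse([lexical, vector])
print(fused)  # -> ['a', 'c', 'b', 'd']
```

Documents that rank well in several lists accumulate the highest scores, which is why `a` and `c` come ahead of documents returned by only one child retriever.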
|
||||
|
||||
===== Parameters
|
||||
|
||||
|
@ -357,7 +383,8 @@ Refer to <<semantic-reranking>> for a high level overview of semantic re-ranking
|
|||
===== Prerequisites
|
||||
|
||||
To use `text_similarity_reranker` you must first set up a `rerank` task using the <<put-inference-api, Create {infer} API>>.
|
||||
The `rerank` task should be set up with a machine learning model that can compute text similarity. Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third-party text similarity models supported by {es}.
|
||||
The `rerank` task should be set up with a machine learning model that can compute text similarity.
|
||||
Refer to {ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-similarity[the Elastic NLP model reference] for a list of third-party text similarity models supported by {es}.
|
||||
|
||||
Currently you can:
|
||||
|
||||
|
@ -368,6 +395,7 @@ Currently you can:
|
|||
** Refer to the <<text-similarity-reranker-retriever-example-eland,example>> on this page for a step-by-step guide.
|
||||
|
||||
===== Parameters
|
||||
|
||||
`retriever`::
|
||||
(Required, <<retriever, retriever>>)
|
||||
+
|
||||
|
@ -376,7 +404,8 @@ The child retriever that generates the initial set of top documents to be re-ran
|
|||
`field`::
|
||||
(Required, `string`)
|
||||
+
|
||||
The document field to be used for text similarity comparisons. This field should contain the text that will be evaluated against the `inferenceText`.
|
||||
The document field to be used for text similarity comparisons.
|
||||
This field should contain the text that will be evaluated against the `inferenceText`.
|
||||
|
||||
`inference_id`::
|
||||
(Required, `string`)
|
||||
|
@ -391,25 +420,28 @@ The text snippet used as the basis for similarity comparison.
|
|||
`rank_window_size`::
|
||||
(Optional, `int`)
|
||||
+
|
||||
The number of top documents to consider in the re-ranking process. Defaults to `10`.
|
||||
The number of top documents to consider in the re-ranking process.
|
||||
Defaults to `10`.
|
||||
|
||||
`min_score`::
|
||||
(Optional, `float`)
|
||||
+
|
||||
Sets a minimum threshold score for including documents in the re-ranked results. Documents with similarity scores below this threshold will be excluded. Note that score calculations vary depending on the model used.
|
||||
Sets a minimum threshold score for including documents in the re-ranked results.
|
||||
Documents with similarity scores below this threshold will be excluded.
|
||||
Note that score calculations vary depending on the model used.
|
||||
|
||||
`filter`::
|
||||
(Optional, <<query-dsl, query object or list of query objects>>)
|
||||
+
|
||||
Applies the specified <<query-dsl-bool-query, boolean query filter>> to the child <<retriever, retriever>>.
|
||||
If the child retriever already specifies any filters, then this top-level filter is applied in conjuction
|
||||
with the filter defined in the child retriever.
|
||||
If the child retriever already specifies any filters, then this top-level filter is applied in conjunction with the filter defined in the child retriever.
|
||||
|
||||
[discrete]
|
||||
[[text-similarity-reranker-retriever-example-cohere]]
|
||||
==== Example: Cohere Rerank
|
||||
|
||||
This example enables out-of-the-box semantic search by re-ranking top documents using the Cohere Rerank API. This approach eliminate the need to generate and store embeddings for all indexed documents.
|
||||
This example enables out-of-the-box semantic search by re-ranking top documents using the Cohere Rerank API.
|
||||
This approach eliminates the need to generate and store embeddings for all indexed documents.
|
||||
This requires a <<infer-service-cohere,Cohere Rerank inference endpoint>> using the `rerank` task type.
|
||||
|
||||
[source,console]
|
||||
|
@ -459,7 +491,9 @@ Follow these steps to load the model and create a semantic re-ranker.
|
|||
python -m pip install eland[pytorch]
|
||||
----
|
||||
+
|
||||
. Upload the model to {es} using Eland. This example assumes you have an Elastic Cloud deployment and an API key. Refer to the https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch-auth[Eland documentation] for more authentication options.
|
||||
. Upload the model to {es} using Eland.
|
||||
This example assumes you have an Elastic Cloud deployment and an API key.
|
||||
Refer to the https://www.elastic.co/guide/en/elasticsearch/client/eland/current/machine-learning.html#ml-nlp-pytorch-auth[Eland documentation] for more authentication options.
|
||||
+
|
||||
[source,sh]
|
||||
----
|
||||
|
@ -517,14 +551,142 @@ POST movies/_search
|
|||
This retriever uses a standard `match` query to search the `movies` index for films tagged with the genre "drama".
|
||||
It then re-ranks the results based on semantic similarity to the text in the `inference_text` parameter, using the model we uploaded to {es}.
|
||||
|
||||
[[rule-retriever]]
|
||||
==== Query Rules Retriever
|
||||
|
||||
The `rule` retriever enables fine-grained control over search results by applying contextual <<query-rules>> to pin or exclude documents for specific queries.
|
||||
This retriever has similar functionality to the <<query-dsl-rule-query>>, but works out of the box with other retrievers.
|
||||
|
||||
===== Prerequisites
|
||||
|
||||
To use the `rule` retriever you must first create one or more query rulesets using the <<query-rules-apis, query rules management APIs>>.
|
||||
|
||||
[discrete]
|
||||
[[rule-retriever-parameters]]
|
||||
===== Parameters
|
||||
|
||||
`retriever`::
|
||||
(Required, <<retriever, retriever>>)
|
||||
+
|
||||
The child retriever that returns the results to apply query rules on top of.
|
||||
This can be a standalone retriever such as the <<standard-retriever, standard>> or <<knn-retriever, knn>> retriever, or it can be a compound retriever.
|
||||
|
||||
`ruleset_ids`::
|
||||
(Required, `array`)
|
||||
+
|
||||
An array of one or more unique <<query-rules-apis, query ruleset>> IDs with query-based rules to match and apply as applicable.
|
||||
Rulesets and their associated rules are evaluated in the order in which they are specified in the query and ruleset.
|
||||
The maximum number of rulesets to specify is 10.
|
||||
|
||||
`match_criteria`::
|
||||
(Required, `object`)
|
||||
+
|
||||
Defines the match criteria to apply to rules in the given query ruleset(s).
|
||||
Match criteria should match the keys defined in the `criteria.metadata` field of the rule.
|
||||
|
||||
`rank_window_size`::
|
||||
(Optional, `int`)
|
||||
+
|
||||
The number of top documents to return from the `rule` retriever.
|
||||
Defaults to `10`.
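
The parameters above can be assembled programmatically. The following Python sketch builds a search request body with a `rule` retriever and checks the constraints stated in the parameter list (one to ten unique ruleset IDs, non-empty match criteria). The helper name `build_rule_retriever` and its error messages are hypothetical; only the JSON keys come from this page.

```python
import json

MAX_RULESETS = 10  # limit stated in the `ruleset_ids` description


def build_rule_retriever(child, ruleset_ids, match_criteria, rank_window_size=None):
    """Return a search request body whose outermost retriever is `rule`."""
    if not 1 <= len(ruleset_ids) <= MAX_RULESETS:
        raise ValueError(f"expected 1 to {MAX_RULESETS} ruleset IDs")
    if len(set(ruleset_ids)) != len(ruleset_ids):
        raise ValueError("ruleset IDs must be unique")
    if not match_criteria:
        raise ValueError("match_criteria is required")

    rule = {
        "match_criteria": dict(match_criteria),
        "ruleset_ids": list(ruleset_ids),
        "retriever": child,
    }
    # Only send rank_window_size when the caller overrides the default.
    if rank_window_size is not None:
        rule["rank_window_size"] = rank_window_size
    return {"retriever": {"rule": rule}}


# Child retriever: a standard retriever wrapping a query_string query.
child = {"standard": {"query": {"query_string": {"query": "harry potter"}}}}
body = build_rule_retriever(child, ["my-ruleset"], {"query_string": "harry potter"})
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of a `_search` request; validating on the client side simply surfaces mistakes before the request is made.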
|
||||
|
||||
[discrete]
|
||||
[[rule-retriever-example]]
|
||||
==== Example: Rule retriever
|
||||
|
||||
This example shows the rule retriever executed without any additional retrievers.
|
||||
It runs the query defined by the `retriever` and applies the rules from `my-ruleset` on top of the returned results.
|
||||
|
||||
[source,console]
|
||||
----
|
||||
GET movies/_search
|
||||
{
|
||||
"retriever": {
|
||||
"rule": {
|
||||
"match_criteria": {
|
||||
"query_string": "harry potter"
|
||||
},
|
||||
"ruleset_ids": [
|
||||
"my-ruleset"
|
||||
],
|
||||
"retriever": {
|
||||
"standard": {
|
||||
"query": {
|
||||
"query_string": {
|
||||
"query": "harry potter"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
----
|
||||
|
||||
[discrete]
|
||||
[[rule-retriever-example-rrf]]
|
||||
==== Example: Rule retriever combined with RRF
|
||||
|
||||
This example shows how to combine the `rule` retriever with other rerank retrievers such as <<rrf-retriever, rrf>> or <<text-similarity-reranker-retriever, text_similarity_reranker>>.
|
||||
|
||||
[WARNING]
|
||||
====
|
||||
The `rule` retriever will apply rules to any documents returned from its defined `retriever` or any of its sub-retrievers.
|
||||
This means that for the best results, the `rule` retriever should be the outermost defined retriever.
|
||||
Nesting a `rule` retriever as a sub-retriever under a reranker such as `rrf` or `text_similarity_reranker` may not produce the expected results.
|
||||
====
|
||||
|
||||
[source,console]
|
||||
----
|
||||
GET movies/_search
|
||||
{
|
||||
"retriever": {
|
||||
"rule": { <1>
|
||||
"match_criteria": {
|
||||
"query_string": "harry potter"
|
||||
},
|
||||
"ruleset_ids": [
|
||||
"my-ruleset"
|
||||
],
|
||||
"retriever": {
|
||||
"rrf": { <2>
|
||||
"retrievers": [
|
||||
{
|
||||
"standard": {
|
||||
"query": {
|
||||
"query_string": {
|
||||
"query": "sorcerer's stone"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"standard": {
|
||||
"query": {
|
||||
"query_string": {
|
||||
"query": "chamber of secrets"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
----
|
||||
|
||||
<1> The `rule` retriever is the outermost retriever, applying rules to the search results that were previously reranked using the `rrf` retriever.
|
||||
<2> The `rrf` retriever returns results from all of its sub-retrievers, and the output of the `rrf` retriever is used as input to the `rule` retriever.
|
||||
|
||||
==== Using `from` and `size` with a retriever tree
|
||||
|
||||
The <<search-from-param, `from`>> and <<search-size-param, `size`>>
|
||||
parameters are provided globally as part of the general
|
||||
<<search-search, search API>>. They are applied to all retrievers in a
|
||||
retriever tree unless a specific retriever overrides the `size` parameter
|
||||
using a different parameter such as `rank_window_size`. Though, the final
|
||||
search hits are always limited to `size`.
|
||||
<<search-search, search API>>.
|
||||
They are applied to all retrievers in a retriever tree, unless a specific retriever overrides the `size` parameter using a different parameter such as `rank_window_size`.
|
||||
However, the final search hits are always limited to `size`.
|
||||
|
||||
==== Using aggregations with a retriever tree
|
||||
|
||||
|
@ -534,8 +696,8 @@ clauses in a <<query-dsl-bool-query, boolean query>>.
|
|||
|
||||
==== Restrictions on search parameters when specifying a retriever
|
||||
|
||||
When a retriever is specified as part of a search the following elements are not allowed
|
||||
at the top-level and instead are only allowed as elements of specific retrievers:
|
||||
When a retriever is specified as part of a search, the following elements are not allowed at the top level.
|
||||
Instead they are only allowed as elements of specific retrievers:
|
||||
|
||||
* <<request-body-search-query, `query`>>
|
||||
* <<search-api-knn, `knn`>>
|
||||
|
@ -543,3 +705,4 @@ at the top-level and instead are only allowed as elements of specific retrievers
|
|||
* <<request-body-search-terminate-after, `terminate_after`>>
|
||||
* <<search-sort-param, `sort`>>
|
||||
* <<rescore, `rescore`>>
|
||||
|
||||
|
|
|
@ -16,22 +16,21 @@ For implementation details, including notable restrictions, check out the
|
|||
Retrievers come in various types, each tailored for different search operations.
|
||||
The following retrievers are currently available:
|
||||
|
||||
* <<standard-retriever,*Standard Retriever*>>. Returns top documents from a
|
||||
traditional https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl.html[query].
|
||||
Mimics a traditional query but in the context of a retriever framework. This
|
||||
ensures backward compatibility as existing `_search` requests remain supported.
|
||||
That way you can transition to the new abstraction at your own pace without
|
||||
mixing syntaxes.
|
||||
* <<knn-retriever,*kNN Retriever*>>. Returns top documents from a <<search-api-knn,knn search>>,
|
||||
in the context of a retriever framework.
|
||||
* <<rrf-retriever,*RRF Retriever*>>. Combines and ranks multiple first-stage retrievers using
|
||||
the reciprocal rank fusion (RRF) algorithm. Allows you to combine multiple result sets
|
||||
with different relevance indicators into a single result set.
|
||||
An RRF retriever is a *compound retriever*, where its `filter` element is
|
||||
propagated to its sub retrievers.
|
||||
+
|
||||
|
||||
* <<text-similarity-reranker-retriever,*Text Similarity Re-ranker Retriever*>>. Used for <<semantic-reranking,semantic reranking>>.
|
||||
* <<standard-retriever,*Standard Retriever*>>.
|
||||
Returns top documents from a traditional https://www.elastic.co/guide/en/elasticsearch/reference/master/query-dsl.html[query].
|
||||
Mimics a traditional query but in the context of a retriever framework.
|
||||
This ensures backward compatibility as existing `_search` requests remain supported.
|
||||
That way you can transition to the new abstraction at your own pace without mixing syntaxes.
|
||||
* <<knn-retriever,*kNN Retriever*>>.
|
||||
Returns top documents from a <<search-api-knn,knn search>>, in the context of a retriever framework.
|
||||
* <<rrf-retriever,*RRF Retriever*>>.
|
||||
Combines and ranks multiple first-stage retrievers using the reciprocal rank fusion (RRF) algorithm.
|
||||
Allows you to combine multiple result sets with different relevance indicators into a single result set.
|
||||
An RRF retriever is a *compound retriever*, where its `filter` element is propagated to its sub retrievers.
|
||||
* <<rule-retriever,*Rule Retriever*>>.
|
||||
Applies <<query-rules,query rules>> to the query before returning results.
|
||||
* <<text-similarity-reranker-retriever,*Text Similarity Re-ranker Retriever*>>.
|
||||
Used for <<semantic-reranking,semantic reranking>>.
|
||||
Requires first creating a `rerank` task using the <<put-inference-api,{es} Inference API>>.
|
||||
|
||||
[discrete]
|
||||
|
@ -69,8 +68,11 @@ When using compound retrievers, only the query element is allowed, which enforce
|
|||
[[retrievers-overview-example]]
|
||||
==== Example
|
||||
|
||||
The following example demonstrates the powerful queries that we can now compose, and how retrievers simplify this process. We can use any combination of retrievers we want, propagating the
|
||||
results of a nested retriever to its parent. In this scenario, we'll make use of all 4 (currently) available retrievers, i.e. `standard`, `knn`, `text_similarity_reranker` and `rrf`.
|
||||
The following example demonstrates the powerful queries that we can now compose, and how retrievers simplify this process.
|
||||
We can use any combination of retrievers we want, propagating the results of a nested retriever to its parent.
|
||||
In this scenario, we'll make use of 4 of our currently available retrievers, i.e. `standard`, `knn`, `text_similarity_reranker` and `rrf`.
|
||||
See <<retrievers-overview-types>> for the complete list of available retrievers.
|
||||
|
||||
We'll first combine the results of a `semantic` query using the `standard` retriever, and that of a `knn` search on a dense vector field, using `rrf` to get the top 100 results.
|
||||
Finally, we'll rerank the top 50 results of `rrf` using the `text_similarity_reranker`.
|
||||
|
||||
|
@ -126,15 +128,18 @@ GET example-index/_search
|
|||
|
||||
Here are some important terms:
|
||||
|
||||
* *Retrieval Pipeline*. Defines the entire retrieval and ranking logic to
|
||||
produce top hits.
|
||||
* *Retriever Tree*. A hierarchical structure that defines how retrievers interact.
|
||||
* *First-stage Retriever*. Returns an initial set of candidate documents.
|
||||
* *Compound Retriever*. Builds on one or more retrievers,
|
||||
enhancing document retrieval and ranking logic.
|
||||
* *Combiners*. Compound retrievers that merge top hits
|
||||
from multiple sub-retrievers.
|
||||
* *Rerankers*. Special compound retrievers that reorder hits and may adjust the number of hits, with distinctions between first-stage and second-stage rerankers.
|
||||
* *Retrieval Pipeline*.
|
||||
Defines the entire retrieval and ranking logic to produce top hits.
|
||||
* *Retriever Tree*.
|
||||
A hierarchical structure that defines how retrievers interact.
|
||||
* *First-stage Retriever*.
|
||||
Returns an initial set of candidate documents.
|
||||
* *Compound Retriever*.
|
||||
Builds on one or more retrievers, enhancing document retrieval and ranking logic.
|
||||
* *Combiners*.
|
||||
Compound retrievers that merge top hits from multiple sub-retrievers.
|
||||
* *Rerankers*.
|
||||
Special compound retrievers that reorder hits and may adjust the number of hits, with distinctions between first-stage and second-stage rerankers.
|
||||
|
||||
[discrete]
|
||||
[[retrievers-overview-play-in-search]]
|
||||
|
|
|
@ -10,7 +10,7 @@ _Query rules_ allow customization of search results for queries that match speci
|
|||
This allows for more control over results, for example ensuring that promoted documents that match defined criteria are returned at the top of the result list.
|
||||
Metadata is defined in the query rule, and is matched against the query criteria.
|
||||
Query rules use metadata to match a query.
|
||||
Metadata is provided as part of the <<query-dsl-rule-query, rule query>> as an object and can be anything that helps differentiate the query, for example:
|
||||
Metadata is provided as part of the search request as an object and can be anything that helps differentiate the query, for example:
|
||||
|
||||
* A user-entered query string
|
||||
* Personalized metadata about users (e.g. country, language, etc)
|
||||
|
@ -18,13 +18,13 @@ Metadata is provided as part of the <<query-dsl-rule-query, rule query>> as an o
|
|||
* A referring site
|
||||
* etc.
|
||||
|
||||
Query rules define a metadata key that will be used to match the metadata provided in the <<query-dsl-rule-query, rule query>> with the criteria specified in the rule.
|
||||
Query rules define a metadata key that will be used to match the metadata provided in the <<rule-retriever, rule retriever>> with the criteria specified in the rule.
|
||||
|
||||
When a query rule matches the <<query-dsl-rule-query, rule query>> metadata according to its defined criteria, the query rule action is applied to the underlying `organic` query.
|
||||
When a query rule matches the rule metadata according to its defined criteria, the query rule action is applied to the underlying `organic` query.
|
||||
|
||||
For example, a query rule could be defined to match a user-entered query string of `pugs` and a country `us` and promote adoptable shelter dogs if the rule query met both criteria.
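
The matching logic in this example can be sketched in a few lines of Python. This is a simplified model, not the {es} implementation: it handles only `exact` criteria with plain string equality, and the rule ID, document ID, and function name `rule_matches` are hypothetical.

```python
def rule_matches(rule, match_criteria):
    """Return True when every criterion of the rule is satisfied.

    Simplification: only the `exact` criteria type is modelled here.
    """
    for criterion in rule["criteria"]:
        if criterion["type"] != "exact":
            raise NotImplementedError("only 'exact' criteria are sketched")
        # Look up the metadata key named by the rule in the search metadata.
        value = match_criteria.get(criterion["metadata"])
        if value not in criterion["values"]:
            return False
    return True


# A rule that fires for the query string "pugs" from users in the "us".
rule = {
    "rule_id": "promote-shelter-dogs",
    "type": "pinned",
    "criteria": [
        {"type": "exact", "metadata": "query_string", "values": ["pugs"]},
        {"type": "exact", "metadata": "user_country", "values": ["us"]},
    ],
    "actions": {"ids": ["adoptable-dogs"]},
}

print(rule_matches(rule, {"query_string": "pugs", "user_country": "us"}))
print(rule_matches(rule, {"query_string": "pugs", "user_country": "ca"}))
```

Because every criterion must hold, the second call does not match: the query string is right but the country is not.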
|
||||
|
||||
Rules are defined using the <<query-rules-apis, query rules API>> and searched using the <<query-dsl-rule-query,rule query>>.
|
||||
Rules are defined using the <<query-rules-apis, query rules API>> and searched using the <<rule-retriever, rule retriever>> or the <<query-dsl-rule-query,rule query>>.
|
||||
|
||||
[discrete]
|
||||
[[query-rule-definition]]
|
||||
|
@ -189,9 +189,11 @@ You can use the <<get-query-ruleset>> call to retrieve the ruleset you just crea
|
|||
|
||||
[discrete]
|
||||
[[rule-query-search]]
|
||||
==== Perform a rule query
|
||||
==== Search using query rules
|
||||
|
||||
Once you have defined one or more query rulesets, you can search with these rulesets using the <<rule-retriever, rule retriever>> or the <<query-dsl-rule-query, rule query>>.
|
||||
Retrievers are the recommended way to use rule queries, as they will work out of the box with other reranking retrievers such as <<rrf>>.
|
||||
|
||||
Once you have defined one or more query rulesets, you can search these rulesets using the <<query-dsl-rule-query>> query.
|
||||
Rulesets are evaluated in order, so rules in the first ruleset you specify will be applied before any subsequent rulesets.
|
||||
|
||||
An example query for the `my-ruleset` defined above is:
|
||||
|
@ -200,18 +202,22 @@ An example query for the `my-ruleset` defined above is:
|
|||
----
|
||||
GET /my-index-000001/_search
|
||||
{
|
||||
"query": {
|
||||
"retriever": {
|
||||
"rule": {
|
||||
"organic": {
|
||||
"query_string": {
|
||||
"query": "puggles"
|
||||
"retriever": {
|
||||
"standard": {
|
||||
"query": {
|
||||
"query_string": {
|
||||
"query": "puggles"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"match_criteria": {
|
||||
"query_string": "puggles",
|
||||
"user_country": "us"
|
||||
},
|
||||
"ruleset_ids": ["my-ruleset"]
|
||||
"ruleset_ids": [ "my-ruleset" ]
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -227,3 +233,51 @@ In this case, the rules are applied in the following order:
|
|||
- Where the matching rule appears in the ruleset
|
||||
- If multiple documents are specified in a single rule, in the order they are specified
|
||||
- If a document is matched by both a `pinned` rule and an `exclude` rule, the `exclude` rule will take precedence
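
As an illustration of the ordering described in the list above, the following Python sketch applies pinned and excluded document IDs to a list of organic hits. It is a toy model under stated assumptions, not how {es} executes rules: the function name `apply_rule_actions` and the sample IDs are hypothetical, and pinned documents are assumed to be promoted even when they are absent from the organic hits.

```python
def apply_rule_actions(organic_ids, pinned_ids, excluded_ids):
    """Order hits as: pinned documents first, then the remaining organic hits.

    Excluded documents are dropped everywhere, so an `exclude` action wins
    over a `pinned` action for the same document.
    """
    excluded = set(excluded_ids)
    seen = set()
    ordered = []
    # Pinned IDs keep the order in which the rules specified them.
    for doc_id in list(pinned_ids) + list(organic_ids):
        if doc_id in excluded or doc_id in seen:
            continue
        seen.add(doc_id)
        ordered.append(doc_id)
    return ordered


organic = ["d1", "d2", "d3"]
pinned = ["d3", "d9"]    # collected from matching `pinned` rules, in rule order
excluded = ["d2", "d9"]  # collected from matching `exclude` rules
print(apply_rule_actions(organic, pinned, excluded))  # -> ['d3', 'd1']
```

Here `d9` is both pinned and excluded, so it is dropped, while `d3` is promoted ahead of the remaining organic hit `d1`.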
|
||||
|
||||
You can nest reranking retrievers such as <<rrf-retriever, rrf>> or <<text-similarity-reranker-retriever, text_similarity_reranker>> inside the rule retriever to apply query rules on already-reranked results.
|
||||
Here is an example:
|
||||
|
||||
[source,console]
|
||||
----
|
||||
GET my-index-000001/_search
|
||||
{
|
||||
"retriever": {
|
||||
"rule": {
|
||||
"match_criteria": {
|
||||
"query_string": "puggles",
|
||||
"user_country": "us"
|
||||
},
|
||||
"ruleset_ids": [
|
||||
"my-ruleset"
|
||||
],
|
||||
"retriever": {
|
||||
"rrf": {
|
||||
"retrievers": [
|
||||
{
|
||||
"standard": {
|
||||
"query": {
|
||||
"query_string": {
|
||||
"query": "pugs"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
{
|
||||
"standard": {
|
||||
"query": {
|
||||
"query_string": {
|
||||
"query": "puggles"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
----
|
||||
// TEST[continued]
|
||||
|
||||
This will apply pinned and excluded query rules on top of the content that was reranked by RRF.
|
||||
|
|