elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-25 07:37:19 -04:00

Author	SHA1	Message	Date
Jim Ferenczi	17faa89bcc	[8.18] Refactor semantic text field to align with text field behaviour (#119339 ) * Refactor semantic text field to align with text field behaviour (#119183) Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> * fix compil after backport * fix compil after backport (bis) --------- Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co>	2024-12-30 21:23:19 +11:00
Liam Thompson	b24151a3cd	Add documentation for query rules retriever (#115696 ) (#116401 )	2024-11-07 15:28:50 +01:00
Liam Thompson	7b39d3db52	Term Stats documentation (#115933 ) (#116167 ) * Term Stats documentation * Update docs/reference/reranking/learning-to-rank-model-training.asciidoc Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co> * Fix query example. --------- Co-authored-by: István Zoltán Szabó <istvan.szabo@elastic.co> (cherry picked from commit `0416812456`) Co-authored-by: Aurélien FOUCRET <aurelien.foucret@gmail.com>	2024-11-04 23:28:12 +11:00
Liam Thompson	8135f95869	[DOCS] Add search and filtering tutorial/quickstart, edit filtering page (#114353 ) (#115738 ) (cherry picked from commit `0d8d8bd392`)	2024-10-28 21:33:32 +11:00
Liam Thompson	bd8b55cc5b	[DOCS] Add text_expansion deprecation usage note (#115529 ) (#115537 ) (cherry picked from commit `6980fc6253`)	2024-10-25 00:39:53 +11:00
Liam Thompson	1883db7f92	Add documentation for minimum_should_match (#113043 ) (#115530 ) (cherry picked from commit `28715b791a`) Co-authored-by: mspielberg <9729801+mspielberg@users.noreply.github.com>	2024-10-25 00:35:31 +11:00
Mike Pellegrini	d4746b50f6	Revert semantic query passage ranking documentation (#113982 ) (#113984 )	2024-10-03 07:44:04 +10:00
Chris Hegarty	45a08b94b3	Upgrade to Lucene 9.12.0 (#113333 ) (#113835 ) This commit upgrades to Lucene 9.12.0. Co-authored-by: Adrien Grand <jpountz@gmail.com> Co-authored-by: Armin Braun <me@obrown.io> Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com> Co-authored-by: John Wagster <john.wagster@elastic.co> Co-authored-by: Luca Cavanna <javanna@apache.org> Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>	2024-10-01 13:55:02 +01:00
Mike Pellegrini	8ae094fe0e	Add inner hits support to semantic query (#111834 ) (#113693 ) Adds inner hits support to the semantic query through a restricted inner_hits parameter, which exposes from and size from the inner_hits options	2024-09-28 02:20:11 +10:00
Iraklis Psaroudakis	6f63a4e08b	fix a couple of docs typos (#112901 ) (#113283 ) Co-authored-by: Pm Ching <41728178+pionCham@users.noreply.github.com>	2024-09-21 01:59:14 +10:00
Benjamin Trent	96cc923dcf	Update knn-query.asciidoc (#112833 ) (#112868 )	2024-09-13 21:40:59 +10:00
Jim Ferenczi	6ee9801a99	Update the intervals query docs (#111808 ) Since https://github.com/apache/lucene-solr/pull/620, intervals disjunctions are automatically rewritten to handle cases where minimizations can miss valid matches. This change updates the documentation to take this behaviour into account (users don't need to manually pull intervals disjunctions to the top anymore).	2024-08-13 13:39:55 +09:00
Kathleen DeRusso	02c494963a	[Query rules] Add `exclude` query rule type (#111420 ) * Cleanup: Remove pinned IDs from applied rules in favor of single applied docs * Add support for query rules of type exclude, to exclude specified documents from result sets * Support excluded documents that specify the _index as well as the _id * Cleanup * Update docs/changelog/111420.yaml * Update docs * Spotless * PR feedback - docs updates * Apply PR feedback * PR feedback --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2024-08-02 08:03:54 -04:00
Mayya Sharipova	7de305c4ec	Remove 4096 bool query max limit from docs (#111421 ) indices.query.bool.max_clause_count is set automatically and does not default to 4096 as before. This removes mentions of 4096 from the query documentation. Relates to PR#91811	2024-07-29 15:20:39 -04:00
István Zoltán Szabó	1a5b008921	[DOCS] Clarifies semantic query behavior on sparse and dense vector fields (#111339 ) * [DOCS] Clarifies semantic query behavior on sparse and dense vector fields. * [DOCS] Adds a NOTE to the semantic query docs.	2024-07-26 16:53:38 +02:00
Carlos Delgado	f29b92cb07	Group vector queries into new section (#110722 )	2024-07-11 14:45:35 +02:00
Kathleen DeRusso	7a1d532ffb	Pass over Sparse Vector docs for correctness (#110282 ) * Remove legacy mentions of text expansion queries * Add missing query_vector param to sparse_vector query docs * Fix formatting errors in sparse vector query dsl doc * Remove unnecessary test setup block	2024-07-02 13:37:25 -04:00
Mike Pellegrini	d288dbf94e	Fix Semantic Query Parameter Formatting (#110355 )	2024-07-02 08:07:35 -04:00
Mayya Sharipova	405e39660b	Support k parameter for knn query (#110233 ) Introduce an optional k param for knn query. If k is not set, knn query has the previous behaviour: - `num_candidates` docs are collected from each shard. These `num_candidates` docs are used for combining results with other queries and aggregations on each shard. - docs from all shards are merged to produce the top global `size` results. If k is set, the behaviour is instead as follows: - `k` docs are collected from each shard. These `k` docs are used for combining results with other queries and aggregations on each shard. - similarly, docs from all shards are merged to produce the top global `size` results. Having the `k` param makes it more intuitive for users to address their needs. They also don't need to care about the `num_candidates` param and can skip it for this query, as it is more of an internal detail for tuning how knn search operates. Closes #108473	2024-06-28 09:59:28 -04:00
Kathleen DeRusso	19fc0d9cad	Deprecate text_expansion and weighted_tokens queries (#109880 )	2024-06-27 13:24:57 -04:00
Kathleen DeRusso	41a61b069b	Mark Query Rules as GA (#110004 ) * Mark query rules APIs as stable * Remove preview label from docs * Update docs/changelog/110004.yaml	2024-06-21 15:26:51 -04:00
Carlos Delgado	4d3f9f2fb9	Fix RRF example for semantic query (#109516 ) Follow up to https://github.com/elastic/elasticsearch/pull/109433, fix appropriately this time the semantic query example with RRF.	2024-06-10 17:59:13 +10:00
Carlos Delgado	d4d5d9320c	Fix semantic_text retrievers docs example (#109433 )	2024-06-06 16:31:12 +02:00
István Zoltán Szabó	95ce898436	[DOCS] Adds docs to semantic text (#108311 ) Co-authored-by: Carlos Delgado <6339205+carlosdelest@users.noreply.github.com> Co-authored-by: Mike Pellegrini <mike.pellegrini@elastic.co> Co-authored-by: Kathleen DeRusso <kathleen.derusso@elastic.co>	2024-05-31 16:56:07 +02:00
István Zoltán Szabó	1e58f3a485	[DOCS] Fixes sparse vector query docs. (#109153 )	2024-05-29 14:56:59 +02:00
Kathleen DeRusso	7f35f1bed0	Add sparse_vector query (#108254 ) --------- Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>	2024-05-22 17:06:57 -04:00
Kathleen DeRusso	74d7010a8f	Rename rule query and add support for multiple rulesets (#108831 )	2024-05-22 15:20:34 -04:00
Kathleen DeRusso	a809641179	Fix typo in text_expansion query docs example (#107572 ) * Fix typo in docs example * fix indentation	2024-04-17 14:40:53 -04:00
Liam Thompson	33a71e3289	[DOCS] Refactor book-scoped variables in `docs/reference/index.asciidoc` (#107413 ) * Remove `es-test-dir` book-scoped variable * Remove `plugins-examples-dir` book-scoped variable * Remove `:dependencies-dir:` and `:xes-repo-dir:` book-scoped variables - In `index.asciidoc`, two variables (`:dependencies-dir:` and `:xes-repo-dir:`) were removed. - In `sql/index.asciidoc`, the `:sql-tests:` path was updated to fuller path - In `esql/index.asciidoc`, the `:esql-tests:` path was updated idem * Replace `es-repo-dir` with `es-ref-dir` * Move `:include-xpack: true` to few files that use it, remove from index.asciidoc	2024-04-17 14:37:07 +02:00
Tommaso Teofili	7bff3b3bec	Add modelId and modelText to KnnVectorQueryBuilder (#106068 ) * Add modelId and modelText to KnnVectorQueryBuilder Use QueryVectorBuilder within KnnVectorQueryBuilder to make it possible to perform knn queries also when a query vector is not immediately available. Supplying a text_embedding query_vector_builder with model_text and model_id instead of the query_vector will result in the generation of a query_vector by calling inference on the specified model_id with the supplied model_text (during query rewrite). This is consistent with the way query vectors are built from model_id / model_text in KnnSearchBuilder (DFS phase).	2024-03-18 16:13:38 +01:00
Panagiotis Bailis	d471ccb5bb	Adding support for hex-encoded byte vectors on knn-search (#105393 )	2024-03-13 09:24:51 +02:00
Kathleen DeRusso	bef6363649	Fix typo in text_expansion example (#106265 )	2024-03-12 15:19:21 -04:00
Jack Conradson	68b0acac8f	Add retrievers using the parser-only approach (#105470 ) This enhancement adds a new abstraction to the _search API called "retriever." A retriever is something that returns top hits. This adds three initial retrievers called "standard", "knn", and "rrf". The retrievers use a parser-only approach where they are parsed and then translated into a SearchSourceBuilder to execute the actual search. --------- Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>	2024-03-12 10:11:55 -07:00
Panagiotis Bailis	7ce8d76559	Making k and num_candidates optional for knn search (#101209 )	2024-02-01 15:43:09 +02:00
Mayya Sharipova	669d4ae9b9	Add hybrid search to knn query documentation (#104562 ) Relates to PR #98916 Closes elastic/search-docs-team#39	2024-01-18 15:53:48 -05:00
Kathleen DeRusso	0570b0baaa	Update text expansion/weighted tokens documentation make examples consistent with clients (#103663 ) * Update text expansion docs and clarify int/float for token pruning config * Fix formatting * Fix tests * Fix tests	2024-01-02 14:21:45 -05:00
Daniel Mitterdorfer	26115fc151	Exists query also works with only doc_values (#103647 ) With this commit we amend the docs for the `exists` query to clarify that it works with either `index` or `doc_values` set to `true` in the mapping. Only if both are disabled, the `exists` query won't work.	2023-12-21 16:33:42 +01:00
Mayya Sharipova	d6c53e03d2	Improve span queries documentation (#103490 ) Improvement includes: 1. Remove reference to Lucene queries (this information is not necessary for Elastic users, and can be outdated) 2. For `span_field_masking` include a note to use "require_field_match" : false parameter for highlighters to work. Closes #101804	2023-12-19 14:51:19 -05:00
Kathleen DeRusso	3520584aac	Add optional pruning config (weighted terms scoring) to text expansion query (#102862 ) Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2023-12-13 14:53:13 -05:00
Mayya Sharipova	b014843078	Return matched_queries in Percolator (#103084 ) Return matched_queries for named queries in Percolator. In a response, each hit together with a `_percolator_document_slot` field will contain `_percolator_document_slot_<slotNumber>_matched_queries` fields that will show which sub-queries matched each percolated document. Closes #10163	2023-12-11 09:07:26 -05:00
Kathleen DeRusso	4dd9e2a772	[Query Rules] Add some usability clarifications to docs (#102990 ) * [Query Rules] Add some usability clarifications to docs * Fix typo	2023-12-06 17:16:56 -05:00
Riahiamirreza	f99b4459d7	Remove redundant character in mlt-query.asciidoc (#102945 )	2023-12-04 14:44:12 -06:00
Kathleen DeRusso	4567d397fa	Clarify text expansion query docs to not suggest enabling track_total_hits for performance (#102102 )	2023-11-20 08:56:26 -05:00
Mayya Sharipova	61c7483fc9	Make knn search a query (#98916 ) This introduced a new knn query: - knn query is executed during the Query phase similar to all other queries. - No k parameter, k defaults to size - num_candidates is the size of the queue of candidates to consider while searching the graph on each shard - For aggregations: "size" results are collected with total = size * shards. Aggregations will see size * shards results. - All filters from DSL are applied as post-filters, except: 1) alias filter is applied as pre-filter or 2) a filter provided as a parameter inside knn query.	2023-11-01 14:21:40 -04:00
Benjamin Trent	79c0bd277f	Clarify that duplicate _name values for queries in the same request is undefined (#101523 ) relates to: #101480	2023-10-30 14:58:20 -04:00
Mayya Sharipova	e2920cfbb0	Add docs on constant_score_blended rewrite (#101494 ) PR #94494 introduced a new rewrite method from Lucene from 8.8, but no documentation changes were added. This adds the new method to the documentation.	2023-10-30 14:42:37 -04:00
Benjamin Trent	d3e9bf02f8	Updating percolate query docs to account for custom similarity limitation (#101386 )	2023-10-27 06:47:13 -04:00
Carlos Delgado	f2dfbfe8c4	[DOCS] Add sparse-vector field type to docs, changed references (#100348 )	2023-10-06 14:25:27 +02:00
Ioana Tagirta	7cd1987e5d	Make _index optional for pinned query docs (#97450 ) Currently pinned queries require either the `ids` or `docs` parameter. `docs` allows pinning documents from specific indices. However for `docs` the `_index` field is always required: ``` GET test/_search { "query": { "pinned": { "organic": { "query_string": { "query": "something" } }, "docs": [ { "_id": "1" } ] } } } ``` returns an error: ``` { "error": { "root_cause": [ { "type": "parsing_exception", "reason": "[10:22] [pinned] failed to parse field [docs]", "line": 10, "col": 22 } ], "type": "parsing_exception", "reason": "[10:22] [pinned] failed to parse field [docs]", "line": 10, "col": 22, "caused_by": { "type": "x_content_parse_exception", "reason": "[10:22] [pinned] failed to parse field [docs]", "caused_by": { "type": "illegal_argument_exception", "reason": "Required [_index]" } } }, "status": 400 } ``` The proposal here is to make `_index` optional. I don't think we have a strong requirement for making `_index` required, when it was initially introduced in https://github.com/elastic/elasticsearch/pull/74873, we mostly wanted the ability to pin docs from specific indices. Making `_index` optional can give more flexibility to use a combination of pinned documents from specific indices or just document ids. This change can also help with pinned query rules. Currently pinned query rules can accept either `ids` or `docs`. If multiple pinned query rules match and they use a combination of `ids` and `docs`, we cannot build a pinned query and we would need to return an error. This is because a pinned query cannot accept both `ids` and `docs`. By making `_index` optional we would no longer need to return an error when pinned query rules use a combination of `ids` and `docs`, because we can easily translate `ids` in `docs`. 
The following pinned queries would be equivalent: ``` GET test/_search { "query": { "pinned": { "organic": { "query_string": { "query": "something" } }, "docs": [ { "_id": "1" } ] } } } GET test/_search { "query": { "pinned": { "organic": { "query_string": { "query": "something" } }, "ids": [1] } } } ``` The scores should be consistent when using a combination of _docs that might use _index or not - see example <details> <summary>Example </summary> ``` PUT test-1/_doc/1 { "title": "doc 1" } PUT test-1/_doc/2 { "title": "doc 2" } PUT test-2/_doc/1 { "title": "doc 1" } PUT test-2/_doc/3 { "title": "lalala" } POST test-1,test-2/_search { "query": { "pinned": { "organic": { "query_string": { "query": "lalala" } }, "docs": [ { "_id": "2", "_index": "test-1" }, { "_id": "1" } ] } } } ``` response: ``` { "took": 1, "timed_out": false, "_shards": { "total": 2, "successful": 2, "skipped": 0, "failed": 0 }, "hits": { "total": { "value": 4, "relation": "eq" }, "max_score": 1.7014124e+38, "hits": [ { "_index": "test-1", "_id": "2", "_score": 1.7014124e+38, "_source": { "title": "doc 2" } }, { "_index": "test-1", "_id": "1", "_score": 1.7014122e+38, // same score as doc with id 1 from test-2 "_source": { "title": "doc 1" } }, { "_index": "test-2", "_id": "1", "_score": 1.7014122e+38, // same score as doc with id 1 from test-1 "_source": { "title": "doc 1" } }, { "_index": "test-2", "_id": "3", "_score": 0.8025915, // organic result "_source": { "title": "lalala" } } ] } } ``` </details> For query rules, if we have two query rules that both match and use a combination of `ids` and `pinned`: ``` PUT _query_rules/test-ruleset { "ruleset_id": "test-ruleset", "rules": [ { "rule_id": "1", "type": "pinned", "criteria": [ { "type": "exact", "metadata": "query_string", "value": "country" } ], "actions": { "docs": [ { "_index": "singers", "_id": "1" } ] } }, { "rule_id": "2", "type": "pinned", "criteria": [ { "type": "exact", "metadata": "query_string", "value": "country" } ], "actions": 
{ "ids": [ 2 ] } } ] } ``` and the following query: ``` POST singers/_search { "query": { "rule_query": { "organic": { "query_string": { "default_field": "name", "query": "country" } }, "match_criteria": { "query_string": "country" }, "ruleset_id": "test-ruleset" } } } ``` then this would get translated into the following pinned query: ``` POST singers/_search { "query": { "pinned": { "organic": { "query_string": { "default_field": "name", "query": "country" } }, "docs": [ { "_index": "singers", "_id": "1" }, {"_id": 2 } ] } } } ``` I think we can also simplify the pinned query rule so that it only receives `docs`: ``` PUT _query_rules/test-ruleset { "ruleset_id": "test-ruleset", "rules": [ { "rule_id": "1", "type": "pinned", "criteria": [ { "type": "exact", "metadata": "query_string", "value": "country" } ], "actions": { "docs": [ { "_id": "1" }, { "_id": "2", "_index": "singers" } ] } } ] } ```	2023-09-07 04:39:56 -04:00
Abdon Pijpelink	8ac9fef3b7	[DOCS] Add 'boost' parameter to match query (#98108 )	2023-08-09 14:28:27 +02:00
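
The commit message for #97450 in the list above argues that once `_index` is optional, pinned `ids` can be translated into `docs` entries, so rules mixing `ids` and `docs` can be merged into a single pinned query. A minimal Python sketch of that translation follows. The helper names (`ids_to_docs`, `merge_pinned_actions`, `build_pinned_query`) are hypothetical and are not Elasticsearch code, and converting each ID to a string is an assumption made here for uniformity.

```python
from typing import Any, Dict, List


def ids_to_docs(ids: List[Any]) -> List[Dict[str, str]]:
    """Turn a pinned `ids` list into `docs` entries that omit `_index`.

    Assumption: IDs are normalised to strings; the commit message itself
    shows a numeric `_id` in its translated example.
    """
    return [{"_id": str(doc_id)} for doc_id in ids]


def merge_pinned_actions(actions: List[Dict[str, Any]]) -> List[Dict[str, str]]:
    """Merge the actions of several matching pinned rules into one `docs` list.

    Each action carries either `docs` (optionally with `_index`) or `ids`.
    Rule order is preserved.
    """
    merged: List[Dict[str, str]] = []
    for action in actions:
        if "docs" in action:
            # Copy each entry so the caller's rule definitions are not mutated.
            merged.extend(dict(doc) for doc in action["docs"])
        elif "ids" in action:
            merged.extend(ids_to_docs(action["ids"]))
    return merged


def build_pinned_query(organic: Dict[str, Any],
                       actions: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a search body whose pinned query uses only `docs`."""
    return {
        "query": {
            "pinned": {
                "organic": organic,
                "docs": merge_pinned_actions(actions),
            }
        }
    }


if __name__ == "__main__":
    # Actions taken from the two example rules in the commit message.
    rule_actions = [
        {"docs": [{"_index": "singers", "_id": "1"}]},
        {"ids": [2]},
    ]
    organic_query = {"query_string": {"default_field": "name", "query": "country"}}
    print(build_pinned_query(organic_query, rule_actions))
```

The sketch only builds the request body; it does not send anything to a cluster.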

770 commits