elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-29 09:54:06 -04:00

Author	SHA1	Message	Date
Michael Peterson	eaa86796a7	Add completion_time time field to async_search get and status response (#97700 ) The completion_time is set as the start_time (already present) plus the 'took' time that is set in the SearchResponse object and only if the isRunning status == false since took is set even for in-progress searches. We use the 'took' field because it is based on relative time, not absolute wall clock time which can go backwards due to NTP issues. See the comments in TransportSearchAction about the SearchTimeProvider for details. Closes #88640	2023-07-17 09:13:15 -04:00
Mayya Sharipova	f8c626f792	Track max_score in collapse when requested (#97703 ) Before we used to track max_score in collapse when requested (track_scores=true) or when there is no sort in collapse (see PR#27122). But this feature was lost through refactoring and changes. This PR restores this feature. Closes #97653	2023-07-17 06:48:00 -04:00
Abdon Pijpelink	0f810b19e9	[DOCS] Clarify that dense vectors can be created with ES (#97636 ) * [DOCS] Clarify that dense vectors can be created with ES * Fix rendering issue * Break up long sentence	2023-07-13 14:04:32 +02:00
István Zoltán Szabó	9cd609f22c	[DOCS] Adds deployment_id as an option to query_vector_builder (#97576 )	2023-07-12 09:35:36 +02:00
Jack Conradson	f2b0434ee2	Mark rank and sub_searches as tech preview (#97573 ) rank and sub_searches are in tech preview. This adds the tech preview text that is required in the docs for these features.	2023-07-11 09:28:46 -07:00
Marc-Antoine Leclercq	b1d150babf	Fix typo on semantic-search-elser.asciidoc (#97551 ) MACRO => MARCO	2023-07-11 11:52:26 +02:00
Luca Cavanna	7df388df64	Make terminate_after early termination friendly (#97540 ) There are situations in which the terminate_after functionality causes the collection to keep on going although there is nothing to collect, with the only goal of incrementing the counter of collected docs and eventually early terminating which sets the `terminated_early` flag in the search response to true. When docs collection early terminates, we should rather honor the corresponding `CollectionTerminatedException` that is thrown, and adjust expectations around the fact that `terminate_after` affects actual collection of documents, meaning that it can't be honored if the threshold has not been reached by the time the collection early terminates for other reasons. This commit adjusts the QueryPhaseCollector behavior to do that, which allows for some additional simplifications. Closes #97269	2023-07-11 10:14:12 +02:00
Michael Peterson	6dd1841dbc	Allow users to run the painless execute API on a remote cluster shard (#97335 ) Added a clusterAlias to the Painless execute Request object, so that index expressions in the request of the form "myremote:myindex" will be parsed to set clusterAlias to "myremote" and the index to "myindex". If clusterAlias is null, then it is executed against a shard on the local cluster, as before. If clusterAlias is non-null, then the SingleShardTransportAction is sent to the remote cluster, where it will run the full request (doing remote coordination). Note that the new clusterAlias field is not Writeable so that when it is sent to the remote cluster it will only see the index name, not the clusterAlias (which it wouldn't know how to handle correctly). Added PainlessExecuteIT test that tests cross-cluster calls Updated painless-execute-script end user docs to indicate support for cross-cluster executions	2023-07-10 12:27:00 -04:00
Christoph Büscher	192597d795	Limit _terms_enum prefix size (#97488 ) Currently the prefix size of the _terms_enum endpoint is not limited. Since prefixes run against a keyword field and build automata, this can lead to high memory consumption and the danger of running OOM. This change checks the size of the prefix early in the rest request and throws a validation error in case it exceeds IndexWriter.MAX_TERM_LENGTH, which is the same limit we apply to the length of keyword field values anyway, so this comes at no loss in functionality. Closes #96572	2023-07-10 12:21:07 +02:00
Luca Cavanna	f5a2af6c71	Query phase: fold collector wrappers into a single top level collector (#97030 ) The query phase uses a number of different collectors and combines them together, pretty much one per feature that the search API exposes: there is a collector for post_filter, one for min_score, one for terminate_after, one for aggs. While this is very flexible, we always combine such collectors together in the same way (e.g. terminate_after must be the first one, post_filter is only applied to top docs collection, min score is applied to both aggs and top docs). This means that although we could flexibly compose collectors, we need to apply each feature predictably which makes the composability not needed. Furthermore, composability causes complexity. The terminate_after functionality is a clear example of complexity introduced as a consequence of having a complex collector tree: it relies on a multi collector, and throws an exception to force terminating the collection for all other collectors in the tree. If there was a single collector aware of post_filter, min_score and terminate_after at the same time, we could simply reuse Lucene mechanisms to early terminate the collection (CollectionTerminatedException) instead of forcing the termination throwing an exception that Lucene does not handle. Furthermore, MultiCollector is a complex and generic collector to combine multiple collectors together, while we only ever combine at most two collectors with it, which are more or less fixed (e.g. top docs and aggs). This PR introduces a new top-level collector that is inspired by MultiCollector in that it holds the top docs and the optional aggs collector and applies post_filter, min_score as well as terminate_after as part of its execution. This allows us to have a specialized collector for our needs, less flexibility and more control. This surfaced some strange behaviour that we may want to change as a follow-up in how terminate_after makes us collect docs even when all possible collections have been early terminated. The goal of this PR though is to have feature parity with query phase before the refactoring, without any change of behaviour. A nice benefit of this work is that it allows us to rely on CollectionTerminatedException for the terminate_after functionality. This simplifies the introduction of multi-threaded collector managers when it comes to handling exceptions.	2023-06-30 12:48:13 +02:00
James Rodewig	ff84ad1469	[DOCS] Note license requirements for CCS (#97252 ) Notes that CCS requires both clusters to use the same license level for full capabilities.	2023-06-29 16:55:10 -04:00
Jack Conradson	bca4995fc8	Add basic documentation for sub searches (#97025 ) This adds basic documentation for the sub_searches top-level element in the search API. (#96224).	2023-06-28 07:02:38 -07:00
István Zoltán Szabó	a62402ce96	[DOCS] Adjusts the note about minimum recommended node size on the ELSER tutorial page (#97083 ) Co-authored-by: David Roberts <dave.roberts@elastic.co>	2023-06-26 11:09:18 +02:00
Michael Peterson	afbf1f5ca1	Profile API should show node details as well as shard details (#96396 ) Added additional fields to SearchProfileResults for XContent output: node_id, cluster, index, shard_id. It parses the existing composite ID using the new parseProfileShardId method, which reverses the SearchShardTarget.toString method. No new information is added here, merely the splitting out of the four pieces of information in the profile shards "composite" id that is created by the SearchShardTarget.toString method. Profile/shards output now has the form: ``` "profile": { "shards": [ { "id": "[2m7SW9oIRrirdrwirM1mwQ][blogs][0]", "node_id": "2m7SW9oIRrirdrwirM1mwQ", "shard_id": "0", "index": "blogs", "cluster": "(local)", "searches": [ ... ] ... }, { "id": "[UngEVXTBQL-7w5j_tftGAQ][remote1:blogs][2]", "node_id": "UngEVXTBQL-7w5j_tftGAQ", "shard_id": "2", "index": "blogs", "cluster": "remote1", "searches": [ ... ] ... ``` where the latter is on a remote cluster and you can see that as the prefix on the index name. Partially addresses #25896 Added yamlRestTest for the new fields in the profile response.	2023-06-24 14:12:25 -04:00
István Zoltán Szabó	27dec1a605	[DOCS] Adds note to the tutorial about the recommended ML node size for ELSER. (#96880 )	2023-06-15 18:03:41 +02:00
István Zoltán Szabó	0469fe5f3e	[DOCS] Makes ELSER mapping requirements clearer (#96854 ) Makes ELSER mapping requirements clearer.	2023-06-15 11:27:45 +02:00
István Zoltán Szabó	80bc048aaf	[DOCS] Adds size parameter to reindex call in ELSER tutorial. (#96820 )	2023-06-14 13:55:46 +02:00
István Zoltán Szabó	656d367e8d	[DOCS] Removes the technical preview admonition from query_vector_builder docs. (#96735 )	2023-06-12 09:55:39 +02:00
Michael Peterson	110b1a686e	Add end-user documentation for CCS using async-search (#96507 ) Added documentation to search-across-clusters.asciidoc showing that async-search can now support the ccs_minimize_roundtrips=true flag and how it behaves relative to async CCS when ccs_minimize_roundtrips=true. I also updated the "Don't minimize network roundtrips" section to reflect the fact that the REST based Search Shards API is no longer called but rather an internal transport-layer only version of search_shards.	2023-06-09 08:55:38 -04:00
István Zoltán Szabó	53c082b5aa	[DOCS] Fixes field name in text_expansion query. (#96724 )	2023-06-09 11:43:46 +02:00
Luca Cavanna	2b67a45fc2	[DOCS] Remove leftover experimental tag for knn search (#96722 ) Knn search was made GA in Elasticsearch 8.5, see #91065 . This commit removes a leftover experimental marking from the search docs.	2023-06-09 11:10:03 +02:00
István Zoltán Szabó	890dd08df0	[DOCS] Adds a compound query example to the ELSER semantic search tutorial (#96460 ) Co-authored-by: David Kyle <david.kyle@elastic.co>	2023-06-07 10:19:24 +02:00
Ignacio Vera	15a6aca060	update docs for vector tile and geohex (#96595 ) Geohex aggregation is now supported since Elasticsearch 8.7 for geo_shape fields so update docs accordingly.	2023-06-06 11:39:56 +02:00
Liam Thompson	5f09aa0d4e	[DOCS] Reverse order of approximate and exact NN search instructions (#96517 )	2023-06-02 15:41:26 +02:00
Michael Peterson	8b1cd47455	Support CCS minimize round trips in async search (#96012 ) * Support CCS minimize round trips in async search This commit makes the smallest set of changes to allow async-search based cross-cluster search to work with the CCS minimize_round_trips feature without changing the internals/architecture of the search action. When ccsMinimizeRoundtrips is set to true on SubmitAsyncSearchRequest, the AsyncSearchTask on the primary CCS coordinator sends a synchronous SearchRequest to all clusters for a remote coordinator to orchestrate and return the entire result set to the CCS coordinator as a single response. This is the same functionality provided by synchronous CCS search using minimize_roundtrips. Since this is an async search, it means that the async search coordinator has no visibility into search progress on the remote clusters while they are running the search, thus losing one of the key features of async search. However, this is a good first approach for improving overall search latency for cross cluster searches that query a large number of shards on remote clusters, since Kibana does not currently expose incremental progress of an async search to users. Relates #73971	2023-06-01 10:34:16 -04:00
debadair	777598d602	[DOCS] Remove redirect pages (#88738 ) * [DOCS] Remove manual redirects * [DOCS] Removed refs to modules-discovery-hosts-providers * [DOCS] Fixed broken internal refs * Fixing bad cross links in ES book, and adding redirects.asciidoc[] back into docs/reference/index.asciidoc. * Update docs/reference/search/point-in-time-api.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/setup/restart-cluster.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/sql/endpoints/translate.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update docs/reference/snapshot-restore/restore-snapshot.asciidoc Co-authored-by: James Rodewig <james.rodewig@elastic.co> * Update repository-azure.asciidoc * Update node-tool.asciidoc * Update repository-azure.asciidoc --------- Co-authored-by: amyjtechwriter <61687663+amyjtechwriter@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Amy Jonsson <amy.jonsson@elastic.co> Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2023-05-24 12:32:46 +01:00
Mayya Sharipova	d5895087ef	Docs: Knn doesn't support ccs_minimize_roundtrips (#96222 ) Add a note that approximate knn search doesn't support the parameter ccs_minimize_roundtrips in CCS search. Relates to #88694	2023-05-23 12:59:55 -04:00
Abdon Pijpelink	32d764cce7	[DOCS] Fix small typo (#96242 )	2023-05-23 15:47:43 +02:00
Abdon Pijpelink	44796f7be0	[DOCS] Update CCS compatibility matrix for 8.9 (#96277 )	2023-05-23 15:46:48 +02:00
István Zoltán Szabó	59ee140d17	[DOCS] Removes metadata tags from ELSER tutorial. (#96200 )	2023-05-17 16:25:58 +02:00
István Zoltán Szabó	e0a4edc46d	[DOCS] Adds example of semantic search with ELSER (#95992 ) Co-authored-by: David Roberts <dave.roberts@elastic.co> Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2023-05-15 14:18:33 +02:00
Saarika Bhasi	7d418ef61a	[Docs] Adds mustache dependency search templates (#95118 ) * Adds Mustache dependency in Search template * Adds more Mustache examples with Search template Co-authored-by: T. Scot Clausing <tsclausing@gmail.com> * Update docs/reference/search/search-your-data/search-template.asciidoc Co-authored-by: T. Scot Clausing <tsclausing@gmail.com> * Adds examples for search templates * Copy changes from PR suggestions and modify examples * Minor edits * Re-wording * Added Suggestion: re-structure docs to align with mustache manual * Fix CI checks * Feedback: Remove gradle dependency * Remove params from examples, removed pretty, minor refactoring * Minor rewording variable description --------- Co-authored-by: T. Scot Clausing <tsclausing@gmail.com> Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2023-05-12 10:57:52 -04:00
Jack Conradson	24c600748a	Add initial documentation for RRF (#95687 ) This is a follow up to (#93396) that adds documentation for RRF including an example with a breakdown of the RRF formula.	2023-05-11 14:51:49 -07:00
Akib Rhast	5a148d3d3f	Update term-suggest.asciidoc (#86780 ) * Update term-suggest.asciidoc It is really easy to miss the fact that that's the default setting, since it is not highlighted or called out in any way * Apply review suggestion --------- Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2023-05-09 12:23:45 +02:00
Abdon Pijpelink	f2d9a3fbca	[DOCS] Fix 'shards_size' typo (#95696 ) * [DOCS] Fix 'shards_size' typo * Second occurrence of 'shards_size'	2023-05-01 15:24:43 +02:00
Martijn van Groningen	49e8ee4269	Remove remaining tsdb tech preview labels (#95563 ) Remove tech preview label from a number of tsdb settings and mapping attributes.	2023-04-26 12:11:03 +02:00
Jack Conradson	5314e5dd55	Add support for Reciprocal Rank Fusion to the search API (#93396 ) This change at a high level adds global ranking on the coordinating node at the end of query reduction prior to the fetch phase. Individual rank methods are defined in plugins. The first rank plugin added as part of this change is reciprocal rank fusion (RRF). RRF uses a relatively simple formula for merging 1...n result sets together with sum(1/(k+d)) where k is a ranking constant and d is a document's scored position within a result set from a query.	2023-04-24 15:07:34 -07:00
Matthias Wilhelm	9fdb857010	Update field-caps.asciidoc to add information about `include_unmapped` (#94888 ) - Adding context why `include_unmapped` doesn't return results for fields that are not mapped in any index	2023-04-11 09:34:49 +02:00
István Zoltán Szabó	f350159e32	[DOCS] Creates page for ELSER semantic search docs (#95072 ) * [DOCS] Creates page for ELSER semantic search docs.	2023-04-06 13:43:16 +02:00
Ten Bradley	d5a33e439c	Update search-your-data.asciidoc (#94878 ) Minor typo `hit.hits` => `hits.hits`	2023-03-31 10:49:26 +01:00
Benjamin Trent	f23b906891	Add new `similarity` field to `knn` clause in `_search` (#94828 ) This adds a new parameter to `knn` that allows filtering nearest neighbor results that are outside a given similarity. `num_candidates` and `k` are still required as this controls the nearest-neighbor vector search accuracy and exploration. For each shard the query will search `num_candidates` and only keep those that are within the provided `similarity` boundary, and then finally reduce to only the global top `k` as normal. For example, when using the `l2_norm` indexed similarity value, this could be considered a `radius` post-filter on `knn`. relates to: https://github.com/elastic/elasticsearch/issues/84929 && https://github.com/elastic/elasticsearch/pull/93574	2023-03-28 15:29:01 -04:00
Abdon Pijpelink	4a93dba806	[DOCS] Fix figure references (#94583 )	2023-03-21 14:32:33 +01:00
Christoph Büscher	d8021360ff	Enable `_terms_enum` on `ip` fields (#94322 ) The _terms_enum API currently does not support ip fields. However, type-ahead-like completion is useful for UI purposes. This change adds the ability to query ip fields via the _terms_enum API by leveraging the terms enumeration available when doc_values are enabled on the field, which is the default. In order to make prefix filtering fast, we internally create a fast prefix automaton from the user-supplied prefix that gets intersected with the shards terms enumeration, similar to what we do for keyword fields already. Closes #89933	2023-03-07 19:26:20 +01:00
iamthinh	a4ee8f4a34	Update profile.asciidoc (#92656 ) * Update profile.asciidoc Fix small typo * Update docs/reference/search/profile.asciidoc Co-authored-by: Felix Stürmer <weltenwort@users.noreply.github.com> --------- Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co> Co-authored-by: Felix Stürmer <weltenwort@users.noreply.github.com>	2023-02-27 09:54:03 +01:00
David Roberts	ee3f51a7bb	[ML] Make text_embedding query vector builder experimental for first release (#93979 ) The text_embedding query vector builder that can be used with KNN search to deliver a semantic search solution will be experimental for its first release.	2023-02-22 09:29:23 +00:00
Christoph Büscher	edc7a6171c	Enable _terms_enum API for version fields (#93839 ) The _terms_enum API currently only supports the keyword, constant_keyword and flattened field type. This change adds support for the `version` field type that sorts according to the semantic versioning definition. Closes #83403	2023-02-21 14:03:12 +01:00
Alan Woodward	639eab0549	Remove force_source option for highlighting (#93193 ) This was only needed because the percolator uses a MemoryIndex which did not support stored fields, and so when it ran a highlighting phase it needed to force it to read from source. MemoryIndex added stored fields support in lucene 9.5, so we can remove this internal parameter. The parameter remains available, but deprecated, via the rest layer, and no longer has any effect.	2023-02-21 09:51:28 +00:00
István Zoltán Szabó	4d117c5add	[DOCS] Adds semantic search section to kNN search page (#93782 ) Co-authored-by: Abdon Pijpelink <abdon.pijpelink@elastic.co>	2023-02-15 11:44:41 +01:00
Abdon Pijpelink	1fca7f6ab9	[DOCS] Mention search_after in PIT docs (#93627 )	2023-02-10 10:40:03 +01:00
István Zoltán Szabó	c08c16e311	[DOCS] Removes semantic search reference docs (#93500 )	2023-02-06 11:00:25 +01:00
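
Note on commit 5314e5dd55: the RRF formula quoted in that message, sum(1/(k+d)), can be illustrated with a small Python sketch. This is not the Elasticsearch implementation. The function names, the default k=60, and the choice to count positions from 1 are assumptions made for illustration only.

```python
from collections import defaultdict


def rrf_scores(result_sets, k=60):
    """Sum 1/(k + d) per document over all result sets.

    k is the ranking constant (60 is an assumed default here) and d is the
    document's position in a result set, counted from 1 (also an assumption).
    """
    scores = defaultdict(float)
    for results in result_sets:
        for position, doc_id in enumerate(results, start=1):
            scores[doc_id] += 1.0 / (k + position)
    return dict(scores)


def rrf_rank(result_sets, k=60):
    """Return document ids ordered by descending RRF score, ties broken by id."""
    scores = rrf_scores(result_sets, k)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    # Two hypothetical ranked lists, e.g. one from a text query and one from knn.
    print(rrf_rank([["a", "b", "c"], ["b", "c", "d"]]))
```

Documents that appear in several result sets accumulate one term per set, so a document ranked moderately well everywhere can outrank one that tops a single list.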

1217 commits
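
Note on commit afbf1f5ca1: the message describes splitting the composite profile shard id into node_id, cluster, index and shard_id. A rough Python sketch of that split follows, based only on the two example ids shown in the message. The real parseProfileShardId method is Java and may differ. The function name, the regular expression, and the error handling below are hypothetical.

```python
import re

# Assumed shape, taken from the examples in the commit message:
#   [node_id][index][shard]  or  [node_id][cluster:index][shard]
_COMPOSITE_ID = re.compile(r"^\[([^\]]+)\]\[([^\]]+)\]\[(\d+)\]$")


def parse_profile_shard_id(composite_id):
    """Split a composite profile shard id into its four parts (illustrative only)."""
    match = _COMPOSITE_ID.match(composite_id)
    if match is None:
        raise ValueError(f"unexpected composite id: {composite_id!r}")
    node_id, index_expr, shard_id = match.groups()
    cluster, sep, index = index_expr.partition(":")
    if not sep:
        # No "cluster:" prefix; the commit's example output labels this "(local)".
        cluster, index = "(local)", index_expr
    # shard_id stays a string, as in the example output quoted in the commit.
    return {"node_id": node_id, "cluster": cluster, "index": index, "shard_id": shard_id}


if __name__ == "__main__":
    print(parse_profile_shard_id("[UngEVXTBQL-7w5j_tftGAQ][remote1:blogs][2]"))
```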