elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-06-28 17:34:17 -04:00

Luca Cavanna f5a2af6c71 Query phase: fold collector wrappers into a single top level collector (#97030)

The query phase uses a number of different collectors and combines them, roughly one per feature that the search API exposes: there is a collector for post_filter, one for min_score, one for terminate_after, and one for aggs. While this is very flexible, we always combine these collectors in the same way (e.g. terminate_after must be the first one, post_filter is only applied to top docs collection, min_score is applied to both aggs and top docs). This means that although we could compose collectors flexibly, we need to apply each feature predictably, which makes the composability unnecessary.

Furthermore, composability causes complexity. The terminate_after functionality is a clear example of complexity introduced by a complex collector tree: it relies on a multi collector, and throws an exception to force termination of the collection for all other collectors in the tree. If there were a single collector aware of post_filter, min_score and terminate_after at the same time, we could simply reuse Lucene's mechanism for early termination (CollectionTerminatedException) instead of forcing termination by throwing an exception that Lucene does not handle. Furthermore, MultiCollector is a complex and generic collector for combining multiple collectors, while we only ever combine at most two collectors with it, which are more or less fixed (e.g. top docs and aggs).

This PR introduces a new top-level collector that is inspired by MultiCollector in that it holds the top docs collector and the optional aggs collector, and applies post_filter, min_score and terminate_after as part of its execution. This gives us a specialized collector for our needs, with less flexibility and more control.

This surfaced some strange behaviour that we may want to change as a follow-up: terminate_after makes us keep collecting docs even when all possible collections have been early terminated. The goal of this PR, though, is feature parity with the query phase before the refactoring, without any change of behaviour. A nice benefit of this work is that it allows us to rely on CollectionTerminatedException for the terminate_after functionality. This simplifies exception handling when multi-threaded collector managers are introduced.		2023-06-30 12:48:13 +02:00
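To make the idea in the commit message concrete, here is a minimal, self-contained sketch of a single collector that applies terminate_after, min_score and post_filter in a fixed order. It does not use Lucene or Elasticsearch classes: `SingleCollectorSketch`, `Doc`, `Collector`, `run` and the local `CollectionTerminated` exception are all hypothetical names, the last one standing in for Lucene's CollectionTerminatedException. Counting every visited document towards terminate_after is an assumption made for brevity, not a claim about the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class SingleCollectorSketch {

    /** Hypothetical stand-in for Lucene's CollectionTerminatedException. */
    public static final class CollectionTerminated extends RuntimeException {}

    /** A scored hit; the post_filter outcome is precomputed to keep the sketch small. */
    public static final class Doc {
        public final int id;
        public final float score;
        public final boolean passesPostFilter;

        public Doc(int id, float score, boolean passesPostFilter) {
            this.id = id;
            this.score = score;
            this.passesPostFilter = passesPostFilter;
        }
    }

    /** One collector that knows about all three features at once. */
    public static final class Collector {
        public final List<Integer> topDocs = new ArrayList<>();
        public final List<Integer> aggDocs = new ArrayList<>();
        private final float minScore;
        private final int terminateAfter; // 0 means "no limit" (assumption)
        private int visited;

        public Collector(float minScore, int terminateAfter) {
            this.minScore = minScore;
            this.terminateAfter = terminateAfter;
        }

        public void collect(Doc doc) {
            // terminate_after is checked first and stops collection for everyone.
            if (terminateAfter > 0 && visited >= terminateAfter) {
                throw new CollectionTerminated();
            }
            visited++;
            // min_score applies to both aggs and top docs.
            if (doc.score < minScore) {
                return;
            }
            aggDocs.add(doc.id);
            // post_filter applies to top docs collection only.
            if (doc.passesPostFilter) {
                topDocs.add(doc.id);
            }
        }
    }

    /** Feeds docs to the collector and treats early termination as a normal outcome. */
    public static Collector run(List<Doc> docs, float minScore, int terminateAfter) {
        Collector collector = new Collector(minScore, terminateAfter);
        try {
            for (Doc doc : docs) {
                collector.collect(doc);
            }
        } catch (CollectionTerminated e) {
            // Expected: the caller simply stops iterating.
        }
        return collector;
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
            new Doc(1, 2.0f, true),
            new Doc(2, 0.1f, true),
            new Doc(3, 1.5f, false),
            new Doc(4, 3.0f, true));
        Collector collector = run(docs, 1.0f, 3);
        System.out.println("top docs: " + collector.topDocs); // top docs: [1]
        System.out.println("agg docs: " + collector.aggDocs); // agg docs: [1, 3]
    }
}
```

With a single collector, the termination signal is an exception the iteration loop already expects, so no separate mechanism is needed to stop the other collectors in a tree.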
aggregations	Update histogram-aggregation docs (#96974)	2023-06-22 11:16:39 +02:00
analysis	Add `trim` filter to allowed normalizer filters in docs (#96739)	2023-06-14 15:52:26 +02:00
autoscaling	[DOCS] Updates ML decider docs by mentioning CPU as scaling criterion (#92018)	2022-11-30 13:37:20 +01:00
behavioral-analytics/apis	Add beta label to Behavioral Analytics API reference (#96657)	2023-06-07 14:45:36 +02:00
cat	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
ccr	[DOCS] CCR disaster recovery (#91491)	2023-04-21 10:02:54 +01:00
cluster	Fix delete-desired-balance doc (#96978)	2023-06-27 10:12:15 +02:00
commands	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
data-management	[DOC] auto migrate only for default template (#82043)	2022-05-10 11:35:19 -04:00
data-streams	Start with data stream lifecycle documentation (#95326)	2023-06-28 16:18:05 +03:00
docs	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
eql	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
features/apis	Fix typo (#91894)	2022-11-24 14:40:43 +01:00
fleet	Fix some typos in plugins & reference docs (#84667)	2022-03-07 12:29:58 -05:00
graph	Fix typo in Graph Explore API docs (#95907)	2023-05-08 15:38:35 +02:00
health	Document the enhancements to ILM Health Indicator (#96980)	2023-06-27 10:54:36 +02:00
high-availability	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
how-to	Add file extensions for vector search for preload (#96955)	2023-06-20 13:52:51 -04:00
ilm	Start with data stream lifecycle documentation (#95326)	2023-06-28 16:18:05 +03:00
images	Add Geospatial analysis overview documentation (#94486)	2023-03-20 10:01:13 -06:00
index-modules	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
indices	Fix format of DiscoveryNode xcontent index version fields (#97223)	2023-06-29 16:53:38 +01:00
ingest	Enable analytics geoip in behavioral analytics. (#96624)	2023-06-15 23:42:10 +02:00
licensing	[DOCS] Remove `testenv` annotations from doc snippet tests (#80023)	2021-11-05 18:38:50 -04:00
mapping	[DOCS] Make 2028 dims 'experimental' warning inline (#96369)	2023-05-30 10:13:38 +02:00
migration	Migrate IndexMetadata.getCreationVersion to IndexVersion (#97139)	2023-06-29 08:38:50 +01:00
ml	[DOCS] Adds API docs for bert_ja text embedding tokenizer option (#96873)	2023-06-26 11:36:08 +02:00
modules	[DOCS] Note license requirements for CCS (#97252)	2023-06-29 16:55:10 -04:00
monitoring	[DOCS] Update default monitoring method on Elastic Cloud (#95662)	2023-05-02 11:31:33 +02:00
query-dsl	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
release-notes	Bump to version 8.10.0	2023-06-22 10:35:12 +01:00
repositories-metering-api	[DOCS] Remove `testenv` annotations from doc snippet tests (#80023)	2021-11-05 18:38:50 -04:00
rest-api	Start with data stream lifecycle documentation (#95326)	2023-06-28 16:18:05 +03:00
rollup	[DOCS] Add downsampling reference to rollup docs (#91295)	2022-11-08 10:02:17 -05:00
scripting	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
search	Query phase: fold collector wrappers into a single top level collector (#97030)	2023-06-30 12:48:13 +02:00
search-application/apis	Update Search Application API docs to discuss warnings (#97188)	2023-06-29 09:16:07 -04:00
searchable-snapshots	Clarify searchable snapshot repository reliability (#93023)	2023-01-19 14:31:01 +02:00
settings	Start with data stream lifecycle documentation (#95326)	2023-06-28 16:18:05 +03:00
setup	Add transport version to main response (#96900)	2023-06-20 16:36:04 -04:00
shutdown/apis	[DOCS] Fix typo in shutdown-put.asciidoc (#94234)	2023-03-01 15:31:23 +01:00
slm/apis	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
snapshot-restore	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
sql	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
tab-widgets	Add shards capacity troubleshooting guide (#95208)	2023-04-19 09:24:07 +02:00
text-structure/apis	[ML] Unmute text-structure docs test (#92224)	2022-12-08 09:19:41 +00:00
transform	[DOCS] Fixes transform scheduled_now documentation (#96766)	2023-06-12 16:07:30 +02:00
troubleshooting	Suggest capturing a heap dump to diagnose high heap (#96526)	2023-06-02 09:43:52 -04:00
upgrade	Docs for snapshots as simple archives (#86261)	2022-05-30 13:23:53 +02:00
vectors	[DOCS] Warn about calling vector functions repeatedly (#91864)	2022-12-12 09:43:46 +01:00
aggregations.asciidoc	Convert bucket aggs docs to runtime fields (#71202)	2021-04-02 12:12:06 -04:00
alias.asciidoc	[DOCS] Explain how to change aliases in data streams documentation (#94110)	2023-03-21 15:34:00 +01:00
analysis.asciidoc	Update Lucene analysis base url (#84094)	2022-02-17 12:44:12 +01:00
api-conventions.asciidoc	Fix a typo in api-conventions example (#88056)	2022-06-27 13:58:51 -04:00
cat.asciidoc	[DOCS] Add documentation for cat component templates (#95035)	2023-04-05 16:51:11 +02:00
cluster.asciidoc	Generalise new cluster info endpoint (#96259)	2023-05-23 16:30:56 +02:00
data-management.asciidoc	Start with data stream lifecycle documentation (#95326)	2023-06-28 16:18:05 +03:00
data-rollup-transform.asciidoc	[DOCS] Remove ifdefs for rollup refactor	2021-08-05 09:08:04 -04:00
datatiers.asciidoc	[+Doc] Troubleshooting / Hot Spotting (#95429)	2023-04-26 12:29:47 -06:00
dependencies-versions.asciidoc	[DOCS] Replace dependencies list with a link. Closes #84863 (#90694)	2022-11-09 14:37:55 -08:00
docs.asciidoc	[DOCS] Update single index APIs reference (#73103)	2021-05-14 11:53:34 -04:00
geospatial-analysis.asciidoc	Add Geospatial analysis overview documentation (#94486)	2023-03-20 10:01:13 -06:00
gs-index.asciidoc
high-availability.asciidoc	[DOCS] Overhaul snapshot and restore docs (#79081)	2021-11-15 12:45:07 -05:00
how-to.asciidoc	Add guide for tuning kNN search (#89782)	2022-10-12 14:53:53 -07:00
index-custom-title-page.html	Add Geospatial analysis overview documentation (#94486)	2023-03-20 10:01:13 -06:00
index-modules.asciidoc	Trigger refresh when shard becomes search active (#96321)	2023-06-15 07:25:37 +02:00
index.asciidoc	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
index.x.asciidoc
indices.asciidoc	[DOCS] Add Downsampling docs (#88571)	2022-10-12 12:10:16 -04:00
ingest.asciidoc	[DOCS] Remove redirect pages (#88738)	2023-05-24 12:32:46 +01:00
intro.asciidoc	[DOCS] Update ES intro for stretched clusters (#77651)	2021-09-13 16:50:08 -04:00
links.asciidoc	[DOCS] Rename ES Reference to ES Guide (#71198)	2021-04-01 15:38:41 -04:00
mapping.asciidoc	Minor revision missed in merge. (#67282)	2021-01-11 13:50:06 -05:00
query-dsl.asciidoc	[DOCS] Adds reference documentation to the text expansion query (#96151)	2023-05-17 09:39:23 +02:00
redirects.asciidoc	Start with data stream lifecycle documentation (#95326)	2023-06-28 16:18:05 +03:00
release-notes.asciidoc	Bump to version 8.10.0	2023-06-22 10:35:12 +01:00
scripting.asciidoc	[DOCS] Add documentation for Painless field API (#83388)	2022-02-03 15:15:38 -05:00
search.asciidoc	Add support for Reciprocal Rank Fusion to the search API (#93396)	2023-04-24 15:07:34 -07:00
setup.asciidoc	Start with data stream lifecycle documentation (#95326)	2023-06-28 16:18:05 +03:00
troubleshooting.asciidoc	Add note on jstack frequency for troubleshooting (#95764)	2023-05-03 10:04:13 +01:00
upgrade.asciidoc	Reinstate prerelease upgrade warning (#90093)	2022-09-16 00:06:08 +09:30