elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-22 14:17:51 -04:00

Author	SHA1	Message	Date
Nik Everett	3263429a78	ESQL: Speed up VALUES for many buckets (#123073 ) (#123229 ) * ESQL: Speed up VALUES for many buckets (#123073) Speeds up the VALUES agg when collecting from many buckets. Specifically, this speeds up the algorithm used to `finish` the aggregation. Most specifically, this makes the algorithm more tolerant to large numbers of groups being collected. The old algorithm was `O(n^2)` with the number of groups. The new one is `O(n)` ``` (groups) 1 219.683 ± 1.069 -> 223.477 ± 1.990 ms/op 1000 426.323 ± 75.963 -> 463.670 ± 7.275 ms/op 100000 36690.871 ± 4656.350 -> 7800.332 ± 2775.869 ms/op 200000 89422.113 ± 2972.606 -> 21920.288 ± 3427.962 ms/op 400000 timed out at 10 minutes -> 40051.524 ± 2011.706 ms/op ``` The `1` group version was not changed at all. That's just noise in the measurement. The small bump in the `1000` case is almost certainly worth it and real. The huge drop in the `100000` case is quite real. * Fix * Compile	2025-02-27 07:35:57 +11:00
Ioana Tagirta	e40319c7a0	Remove references to doc types in percolator docs (#123508 ) (#123529 )	2025-02-27 03:26:57 +11:00
David Turner	19402e2c68	Reduce licence checks in `LicensedWriteLoadForecaster` (#123369 ) (#123408 ) Rather than checking the license (updating the usage map) on every single shard, just do it once at the start of a computation that needs to forecast write loads. Backport of #123346 to 8.x Closes #123247	2025-02-26 06:59:14 +11:00
Joe Gallo	b8f8723e6c	Register IngestGeoIpMetadata as a NamedXContent (#123079 ) (#123329 )	2025-02-25 12:53:21 +11:00
Nik Everett	e12d7775e7	ESQL: Add known issue for slow VALUES (#123222 ) Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>	2025-02-24 16:41:09 +00:00
David Turner	cc3c3870ec	Deduplicate allocation stats calls (#123267 ) (#123280 ) These things can be quite expensive and there's no need to recompute them in parallel across all management threads as done today. This commit adds a deduplicator to avoid redundant work. Backport of #123246 to `8.x`	2025-02-25 03:33:42 +11:00
Oleksandr Kolomiiets	9cc75734d0	fix stale data in synthetic source for string stored field (#123105 ) (#123277 ) Co-authored-by: jeffganmr <106223805+jeffganmr@users.noreply.github.com>	2025-02-25 03:26:32 +11:00
Johannes Fredén	33f973ba70	[8.16] Bump json-smart and oauth2-oidc-sdk (#122737 ) (#122915 ) * Bump json-smart and oauth2-oidc-sdk (#122737) * Bump json-smart and oauth2-oidc-sdk --------- Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co> (cherry picked from commit `e16664573e`) # Conflicts: # gradle/verification-metadata.xml * fixup! Add back verification data for test dep	2025-02-19 09:54:53 +01:00
Felix Barnsteiner	bfd77c9485	Add _metric_names_hash field to OTel metric mappings (#120952 ) (#122881 ) If metrics that have the same timestamp and dimensions aren't grouped into the same document, ES will consider them to be a duplicate. The _metric_names_hash field will be set by the OTel ES exporter. As it's mapped as a time_series_dimensions, it creates a different _tsid for documents with different sets of metrics. The tradeoff is that if the composition of the metrics grouping changes over time, a different _tsid will be created. That has an impact on the rate aggregation for counters.	2025-02-19 05:40:06 +11:00
Mike Pellegrini	4d408d4591	[8.16] Fix ArrayIndexOutOfBoundsException in ShardBulkInferenceActionFilter (#122538 ) (#122854 ) * Fix ArrayIndexOutOfBoundsException in ShardBulkInferenceActionFilter (#122538) (cherry picked from commit `229d392e63`) # Conflicts: # x-pack/plugin/inference/src/internalClusterTest/java/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilterIT.java * Fix compilation & test failures	2025-02-19 02:26:23 +11:00
Joe Gallo	a55e76936c	Fix redact processor arraycopy bug (#122640 ) (#122767 )	2025-02-18 03:21:45 +11:00
Johannes Fredén	4f9c33f546	Improve jwt logging on failed auth (#122247 ) (#122784 ) Update docs/changelog/122247.yaml	2025-02-18 03:18:57 +11:00
Joe Gallo	4b338f88ae	Canonicalize processor names and types in IngestStats (#122610 ) (#122633 )	2025-02-15 05:38:06 +11:00
Ignacio Vera	963f2556e9	Deduplicate IngestStats and IngestStats.Stats identity records when deserializing (#122496 ) (#122516 ) This commit makes sure we reuse the existing static instance when deserializing to avoid excessive heap usage. # Conflicts: # server/src/main/java/org/elasticsearch/ingest/IngestStats.java	2025-02-13 18:36:56 +01:00
elasticsearchmachine	c0da9daf91	Prune changelogs after 8.16.4 release	2025-02-11 20:19:19 +00:00
elasticsearchmachine	8350b129ee	Finalize release notes for v8.16.4	2025-02-12 06:00:19 +11:00
Luigi Dell'Aquila	622c3c924d	EQL: fix JOIN command validation (not supported) (#122011 ) (#122172 )	2025-02-11 01:23:37 +11:00
elasticsearchmachine	17baef4d53	Update docs for v8.16.4 release (#122106 )	2025-02-10 11:33:56 +01:00
Luigi Dell'Aquila	e1176cdfce	ES\|QL: fix ENRICH validation for use of wildcards (#121911 ) (#122020 )	2025-02-07 23:46:35 +11:00
Mark Tozzi	cf36d97a32	Aggregations cancellation after collection (#120944 ) (#121936 ) This PR addresses issues around aggregations cancellation, mentioned in https://github.com/elastic/elasticsearch/issues/108701 and other places. In brief, during aggregations collection time, we respect cancellation via the mechanisms in the searcher to poison cancelled queries. But once the aggregation finishes collection, there is no further need to interact with the searcher, so we cannot rely on that for cancellation checking. In particular, deeply nested aggregations can spend a long time constructing the results tree. Checking for cancellation is a trade-off, as the check itself is somewhat expensive (it involves a volatile read), so we want to balance checking often enough that cancelled queries aren't taking up resources for a long time, but not so frequently that it slows down most aggregation queries. Our first attempt at this is to check once when we go to build sub-aggregations, as the worst cases for this that we've seen involve needing to build deep sub-aggregation trees. Checking at sub-aggregation construction time also provides a conveniently centralized method call to add the check to. --------- Conflicts: server/src/main/java/org/elasticsearch/search/aggregations/bucket/BucketsAggregator.java test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>	2025-02-07 06:51:21 +11:00
Andrei Stefan	bb77d4979e	ESQL: use field_caps native nested fields filtering (#121918 ) * [8.x] ESQL: use field_caps native nested fields filtering (#117201) (#117375) (#121645) * Just filter the nested fields natively with field_caps support (cherry picked from commit `73381dbeb1`) * Add import	2025-02-06 19:39:53 +02:00
Oleksandr Kolomiiets	28635f09d8	[8.16] Fix synthetic source issue with deeply nested ignored source fields (#121715 ) (#121790 ) * Fix synthetic source issue with deeply nested ignored source fields (#121715) * Fix synthetic source issue with deeply nested ignored source fields * Update docs/changelog/121715.yaml * fix tests	2025-02-06 07:13:24 +11:00
Joe Gallo	24c39085ca	Update geolocation database documentation (#121472 ) (#121671 )	2025-02-05 02:22:49 +11:00
Simon Cooper	9fa215a68f	[8.16] Update transport and index version id numbers to S_PP (#121380 ) (#121523 ) Backport #121380 to 8.16	2025-02-03 13:56:48 +00:00
David Turner	12a39baef2	Cheaper snapshot-related `toString()` impls (#121283 ) (#121308 ) If the `MasterService` needs to log a create-snapshot task description then it will call `CreateSnapshotTask#toString`, which today calls `RepositoryData#toString` which is not overridden so ends up calling `RepositoryData#hashCode`. This can be extraordinarily expensive in a large repository. Worse, if there's masses of create-snapshot tasks to execute then it'll do this repeatedly, because each one only ends up yielding a short hex string so we don't reach the description length limit very easily. With this commit we provide a more efficient implementation of `CreateSnapshotTask#toString` and also override `RepositoryData#toString` to protect against some other caller running into the same issue.	2025-01-31 04:09:56 +11:00
Liam Thompson	13441bc9b1	Update recovery.asciidoc (#114889 ) (#121218 ) (cherry picked from commit `d8874b6524`) Co-authored-by: Paulo <paulletilly@gmail.com>	2025-01-30 04:45:20 +11:00
Liam Thompson	7e736e0def	[DOCS] Update getting-started.asciidoc (#116151 ) (#121172 ) Update `new_field` to `language` which is the actual new field added in dynamic mapping Co-authored-by: Ekwinder <ekwindersaini@gmail.com>	2025-01-30 00:51:21 +11:00
Valeriy Khakhutskyy	1538e0d29e	Extend documentation note. (#121146 ) (#121160 )	2025-01-29 23:30:26 +11:00
István Zoltán Szabó	a201f549d2	[8.16] [DOCS] Documents that deployment_id can be used as inference_id in certain cases. (#121055 ) (#121072 ) * [DOCS] Resolves conflict. * Apply suggestions from code review	2025-01-28 21:56:22 +01:00
István Zoltán Szabó	cfddc26697	[DOCS] Resolves conflict. (#121069 )	2025-01-28 21:07:05 +01:00
George Wallace	e7be978b3a	Adjusted alias doc for clarity (#120437 ) (#121063 ) Co-authored-by: Kofi B <kofi.bartlett@elastic.co> Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>	2025-01-29 03:51:38 +11:00
Panagiotis Bailis	c5a57fc690	[8.16] backporting fix for negative scores in text_similarity_ranker retriever (#121056 )	2025-01-28 18:30:16 +02:00
Carlos Delgado	97c4bdca28	Fix incorrect use of "updateable" flag in synonyms documentation (#120866 ) (#121044 ) Co-authored-by: Amine GANI <gani.amine@gmail.com> Co-authored-by: Amine GANI <amine.gani@adelean.com>	2025-01-29 02:07:26 +11:00
Charlotte Hoblik	5d00b7e8cc	Fix typo in tutorial (#120928 ) (#121040 )	2025-01-29 01:36:28 +11:00
Liam Thompson	80039d6d25	Update match-phrase-query.asciidoc (#118828 ) (#121035 ) (cherry picked from commit `8e9cccba6a`) Co-authored-by: Damien RENIER <153135842+damien-renier-elastic@users.noreply.github.com>	2025-01-29 01:10:09 +11:00
Liam Thompson	abad04d97a	Update README.asciidoc (#96455 ) (#121027 ) Co-authored-by: ARPIT SHARMA <93235104+ARPIT2128@users.noreply.github.com>	2025-01-28 15:01:01 +01:00
Pius Fung	e1c635b336	Add warning on scripted metric aggregation's intermediate state memory usage (#119379 ) (#121003 )	2025-01-28 21:39:26 +11:00
Sean Story	46361e4d70	Clarify need to submit for authorization (#119460 ) (#121002 )	2025-01-28 21:34:12 +11:00
Maxim Kholod	7fbe99db8a	Update index-templates.asciidoc (#113461 ) (#120893 ) Adds `security_solution--` to the list of index names to avoid pattern collisions. (cherry picked from commit `0638d3977a`) Co-authored-by: Smriti <152067238+smriti0321@users.noreply.github.com>	2025-01-27 12:30:07 +01:00
Aurélien FOUCRET	12ea3b2f64	[8.16] LTR - Fix explain failure when index has multiple shards (#120717 ) (#120794 ) * LTR - Fix explain failure when index has multiple shards (#120717) * Fix test failing in 8.x branch.	2025-01-24 23:21:43 +01:00
Aurélien FOUCRET	149fbf215f	LTR sometimes throws NullPointerException: Cannot read field "approximation" because "top" is null (#120809 ) (#120827 ) * Add check on the DisiPriorityQueue size. * Update docs/changelog/120809.yaml * Add a unit test.	2025-01-25 06:15:42 +11:00
Niels Bauman	8adafb01d7	[8.16] Improve memory aspects of enrich cache (#120256 ) (#120762 ) * Improve memory aspects of enrich cache (#120256) This commit reduces the occupied heap space of the enrich cache and corrects inaccuracies in tracking the occupied heap space (for cache size limitation purposes). --------- Co-authored-by: Joe Gallo <joegallo@gmail.com> * Fix compilation --------- Co-authored-by: Joe Gallo <joegallo@gmail.com>	2025-01-24 16:18:14 +11:00
Liam Thompson	8f58b770c3	Removes outdated admonition (#120556 ) (#120705 ) Resolves https://github.com/elastic/security-docs/issues/6430. Removes an outdated admonition. (cherry picked from commit `63074d8e70`) Co-authored-by: Benjamin Ironside Goldstein <91905639+benironside@users.noreply.github.com>	2025-01-23 23:42:35 +11:00
Marci W	ce90795b2d	[DOCS] Count API: clarify ways to specify search query (#120564 ) (#120681 ) * Clarify query methods; other sprucing * Apply suggestions from review	2025-01-23 10:31:10 +11:00
Andrei Stefan	faeeb31822	Update search-across-clusters.asciidoc to reflect the `true` default value of `skip_unavailable` setting. (#120592 ) (#120634 )	2025-01-23 01:36:51 +11:00
Felix Barnsteiner	ae7ae7b9e4	Map scope.name as a dimension (#120590 ) (#120615 )	2025-01-23 00:00:12 +11:00
elasticsearchmachine	24ff286e59	Finalize release notes for v8.16.3	2025-01-22 22:30:08 +11:00
elasticsearchmachine	943c61e335	Prune changelogs after 8.16.3 release	2025-01-21 16:32:28 +00:00
István Zoltán Szabó	2f1c2f82d4	[8.16] [DOCS] Rename inference services to inference integrations in docs (#120517 ) Co-authored-by: David Kyle <david.kyle@elastic.co>	2025-01-21 12:31:31 +01:00
Liam Thompson	ee18ffe583	[DOCS] Updated wording for clarity for new users (#120257 ) (#120506 ) Co-authored-by: Kofi B <kofi.bartlett@elastic.co>	2025-01-21 20:30:37 +11:00
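
Commit 3263429a78 in the list above reports that finishing the VALUES aggregation went from `O(n^2)` to `O(n)` in the number of groups. The commit message does not spell out the algorithm, so the following is only a rough sketch of how such a change in complexity can arise, not the ESQL implementation. It assumes the collected entries are (group id, value) pairs that are already deduplicated. The class and method names are hypothetical. The first method rescans every entry once per group; the second uses a counting sort over group ids so each entry is touched a constant number of times.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from the Elasticsearch code base.
final class GroupedValuesSketch {

    /** Rescans all entries for every group: O(groups * entries). */
    static List<List<Long>> perGroupScan(int[] groupIds, long[] values, int groupCount) {
        List<List<Long>> out = new ArrayList<>(groupCount);
        for (int g = 0; g < groupCount; g++) {
            List<Long> bucket = new ArrayList<>();
            for (int i = 0; i < groupIds.length; i++) {
                if (groupIds[i] == g) {
                    bucket.add(values[i]);
                }
            }
            out.add(bucket);
        }
        return out;
    }

    /** Counting sort by group id: O(groups + entries). */
    static List<List<Long>> countingSort(int[] groupIds, long[] values, int groupCount) {
        // start[g] becomes the offset of group g's first value in the sorted array.
        int[] start = new int[groupCount + 1];
        for (int g : groupIds) {
            start[g + 1]++;
        }
        for (int g = 0; g < groupCount; g++) {
            start[g + 1] += start[g];
        }
        // Place each value into its group's slice, preserving arrival order.
        long[] sorted = new long[values.length];
        int[] next = start.clone();
        for (int i = 0; i < groupIds.length; i++) {
            sorted[next[groupIds[i]]++] = values[i];
        }
        List<List<Long>> out = new ArrayList<>(groupCount);
        for (int g = 0; g < groupCount; g++) {
            List<Long> bucket = new ArrayList<>(start[g + 1] - start[g]);
            for (int i = start[g]; i < start[g + 1]; i++) {
                bucket.add(sorted[i]);
            }
            out.add(bucket);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] groupIds = {1, 0, 1, 2, 0};
        long[] values = {10, 20, 30, 40, 50};
        System.out.println(perGroupScan(groupIds, values, 3));
        System.out.println(countingSort(groupIds, values, 3));
    }
}
```

Both methods return the same per-group lists; only the amount of work differs as the group count grows, which is the regime where the benchmark in the commit message shows the largest gap.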

17076 commits
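
Commit 12a39baef2 in the list above notes that a class without its own `toString` ends up calling `hashCode`, which can be expensive for a large object. This follows from the documented default `Object#toString`, which builds its result from the class name and `Integer.toHexString(hashCode())`. A minimal sketch of the effect is below. `BigState` and `DescribedState` are hypothetical stand-ins, not the `RepositoryData` or `CreateSnapshotTask` classes from the commit, and the call counter exists only to make the behaviour visible.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not taken from the Elasticsearch code base.
final class ToStringCostDemo {
    /** Counts hashCode invocations so the hidden cost can be observed. */
    static final AtomicInteger HASH_CALLS = new AtomicInteger();

    /** Large object with a costly hashCode and no toString override. */
    static class BigState {
        final int[] payload;

        BigState(int size) {
            this.payload = new int[size];
        }

        @Override
        public int hashCode() {
            HASH_CALLS.incrementAndGet();
            return Arrays.hashCode(payload); // walks the whole payload
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof BigState && Arrays.equals(payload, ((BigState) o).payload);
        }
    }

    /** Same data, but with a cheap description that never touches hashCode. */
    static class DescribedState extends BigState {
        DescribedState(int size) {
            super(size);
        }

        @Override
        public String toString() {
            return "DescribedState[size=" + payload.length + "]";
        }
    }

    public static void main(String[] args) {
        // String concatenation uses the default Object#toString, which hashes the payload.
        String slow = "task[" + new BigState(1_000_000) + "]";
        System.out.println(slow + " hashCode calls: " + HASH_CALLS.get());

        // The override describes the object without hashing anything.
        String fast = "task[" + new DescribedState(1_000_000) + "]";
        System.out.println(fast + " hashCode calls: " + HASH_CALLS.get());
    }
}
```

The default string is also short (a class name plus a hex number), which matches the commit's observation that many such descriptions can be produced before any length limit is reached.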