elasticsearch

mirror of https://github.com/elastic/elasticsearch.git synced 2025-04-19 04:45:07 -04:00

Author	SHA1	Message	Date
Craig Taverner	a7d1bd8938	Refine .gitattributes to hide generated docs changes (#124742)	2025-03-13 15:32:50 +01:00
Nik Everett	dc4fa26174	Speed up COALESCE significantly (#120139) ``` before after (operation) Score Error Score Error Units coalesce_2_noop 75.949 ± 3.961 -> 0.010 ± 0.001 ns/op 99.9% coalesce_2_eager 99.299 ± 6.959 -> 4.292 ± 0.227 ns/op 95.7% coalesce_2_lazy 113.118 ± 5.747 -> 26.746 ± 0.954 ns/op 76.4% ``` We tend to advise folks that "COALESCE is faster than CASE", but, as of 8.16.0/https://github.com/elastic/elasticsearch/pull/112295 that wasn't true. I was working with someone a few days ago to port a scripted_metric aggregation to ESQL and we saw COALESCE taking ~60% of the time. That won't do. The trouble is that CASE and COALESCE have to be lazy, meaning that operations like: ``` COALESCE(a, 1 / b) ``` should never emit a warning if `a` is not `null`, even if `b` is `0`. In 8.16/https://github.com/elastic/elasticsearch/pull/112295 CASE grew an optimization where it could operate non-lazily if it was flagged as "safe". This brings a similar optimization to COALESCE, see it above as "coalesce_2_eager", a 95.7% improvement. It also brings an arguably more important optimization: entire-block execution for COALESCE. The short version is that, if the first parameter of COALESCE returns no nulls, we can return it without doing anything lazily. There are a few more cases, but the upshot is that COALESCE is pretty much free in cases where long strings of results are `null` or not `null`. That's the `coalesce_2_noop` line. Finally, when there are mixed null and non-null values we were using a single builder with some fairly inefficient paths. This specializes them per type and skips some slow null-checking where possible. That's the `coalesce_2_lazy` result, a more modest 76.4%. NOTE: These %s are improvements on COALESCE itself, or COALESCE with some low-overhead operators like `+`. If COALESCE isn't taking a ton of time in your query, don't get particularly excited about this. It's fun though. Closes #119953	2025-01-23 17:40:09 +00:00
Iván Cea Fontenla	2233349f76	ESQL: top_list aggregation (#109386) Added `top_list(<field>, <limit>, <order>)` aggregation, which collects the top N values per bucket. Works with the same types as MAX/MIN. - Added the aggregation function - Added a template to generate the aggregators - Added a template to generate the `<Type>BucketedSort` implementations per-type - This structure is based on the `BucketedSort` structure used in the original aggregations. It was modified to better fit the ESQL ecosystem (Blocks based, no docs...) Also added a guide to create aggregations. Fixes https://github.com/elastic/elasticsearch/issues/109213	2024-06-20 00:48:45 +10:00
Iván Cea Fontenla	f16f71e2a2	ESQL: Add ip_prefix function (#109070) Added ESQL function to get the prefix of an IP. It works now with both IPv4 and IPv6. For users planning to use it with mixed IPs, we may need to add a function like "is_ipv4()" first. About the skipped test: There's currently a "bug" in the evaluators/functions that return null. Evaluators can't handle them. We'll work on support for that in another PR. It affects other functions, like `substring()`. In this function, however, it only affects "wrong" cases (like an invalid prefix), so it has no impact. Fixes https://github.com/elastic/elasticsearch/issues/99064	2024-05-29 10:23:45 -04:00
Nik Everett	e4cb2c9f6d	ESQL: Add parsing for a LOOKUP command (#109040) This command will serve as a sort of "inline" enrich. This commit itself is mostly ANTLR generated code and paranoid tests that the new `LOOKUP` keyword doesn't clash with any variables named `lookup`. I've also marked our ANTLR generated files as `linguist-generated`, which causes them to be hidden by default in GitHub's UI. You can still click a button to see them if you like. See https://docs.github.com/en/repositories/working-with-files/managing-files/customizing-how-changed-files-appear-on-github	2024-05-28 13:32:30 -04:00
Rory Hunter	d6912ebd59	Assert no carriage returns in release notes test samples (#77238) The expected output files for the generated changelogs should not contain carriage returns (`\r`). Their presence was causing test failures on Windows. Fix by setting the EOL character via `.gitattributes`.	2021-09-07 20:45:23 +01:00
Paul Sanwald	3e7fccddaf	Add a CHANGELOG file for release notes. (#29450) * Add a CHANGELOG file for 7.x release notes. * update file to include 6.x * remove confusing comment and small edit to section title * moving CHANGELOG file under docs directory, as it pertains to release notes.	2018-04-18 07:42:05 -07:00

7 commits