12208 commits

Author SHA1 Message Date
Panagiotis Bailis
ad83d9b35d
Updating retriever-examples documentation to run validation tests on the provided snippets (#116643) 2024-11-29 12:50:01 +00:00
Liam Thompson
ab604ada78
[DOCS] Update tutorial example (#117538) 2024-11-28 16:34:57 +01:00
Martijn van Groningen
6a4b68d263
Add source mode stats to MappingStats (#117463) 2024-11-28 10:53:39 +01:00
kosabogi
79d70686b3
Fixes typo (#117684) 2024-11-28 09:26:16 +01:00
Liam Thompson
c3ac2bd58a
[DOCS] Add Elastic Rerank usage docs (#117625) 2024-11-28 08:23:28 +01:00
Nik Everett
9022cccba7
ESQL: CATEGORIZE as a BlockHash (#114317)
Re-implement `CATEGORIZE` in a way that works for multi-node clusters.

This requires two passes: data is first categorized on each data node, then the categorizers from each data node are merged on the coordinator node and previously categorized rows are re-categorized.

BlockHashes, used in HashAggregations, already work in a very similar way. E.g. for queries like `... | STATS ... BY field1, field2` they map values for `field1` and `field2` to unique integer ids that are then passed to the actual aggregate functions to identify which "bucket" a row belongs to. When passed from the data nodes to the coordinator, the BlockHashes are also merged to obtain unique ids for every value in `field1, field2` that is seen on the coordinator (not only on the local data nodes).

Therefore, we re-implement `CATEGORIZE` as a special BlockHash.

To choose the correct BlockHash when a query plan is mapped to physical operations, the `AggregateExec` query plan node needs to know that we will be categorizing the field `message` in a query containing `... | STATS ... BY c = CATEGORIZE(message)`. For this reason, _we do not extract the expression_ `c = CATEGORIZE(message)` into an `EVAL` node, in contrast to e.g. `STATS ... BY b = BUCKET(field, 10)`. The expression `c = CATEGORIZE(message)` simply remains inside the `AggregateExec`'s groupings.

**Important limitation:** For now, to use `CATEGORIZE` in a `STATS` command, there can be only 1 grouping (the `CATEGORIZE`) overall.
2024-11-27 17:44:55 +01:00
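The id-mapping and merge behaviour described in the commit above (#114317) can be illustrated with a small sketch. This is not the Elasticsearch implementation: `GroupIdHash`, `aggregate_counts`, and `merge_on_coordinator` are hypothetical names, and a plain `COUNT(*)` stands in for the aggregate function.

```python
class GroupIdHash:
    """Maps each distinct grouping key to a dense integer id (hypothetical name)."""

    def __init__(self):
        self._ids = {}
        self.keys = []  # keys[id] is the grouping key that owns that id

    def add(self, key):
        # Return the existing id for this key, or assign the next free one.
        if key not in self._ids:
            self._ids[key] = len(self.keys)
            self.keys.append(key)
        return self._ids[key]


def aggregate_counts(rows):
    """Data-node pass: COUNT(*) BY field1, field2, using ids as bucket indexes."""
    local_hash = GroupIdHash()
    counts = []
    for row in rows:
        gid = local_hash.add((row["field1"], row["field2"]))
        if gid == len(counts):
            counts.append(0)  # first row seen for this bucket
        counts[gid] += 1
    return local_hash, counts


def merge_on_coordinator(partials):
    """Coordinator pass: re-key every node-local id into one shared id space."""
    merged = GroupIdHash()
    totals = []
    for local_hash, local_counts in partials:
        for local_id, key in enumerate(local_hash.keys):
            gid = merged.add(key)
            if gid == len(totals):
                totals.append(0)
            totals[gid] += local_counts[local_id]
    return merged, totals
```

Note that the same key may receive different ids on different data nodes; only the ids assigned on the coordinator are unique across the whole result.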
George Wallace
9e61089414
[DOCS] : swap allocation sections (#116518)
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-11-27 11:39:07 +01:00
Oleksandr Kolomiiets
f57c43cdf5
Include a link to downsampling a TSDS using DSL document (#117510) 2024-11-26 08:09:30 -08:00
Jedr Blaszyk
5e028220c9
[Docs] Update incremental sync note (#117545) 2024-11-26 11:06:52 +00:00
Liam Thompson
a860d3ab33
[DOCS] Trivial: remove tech preview badge (#117461) 2024-11-26 10:48:35 +01:00
Benjamin Trent
374c88a832
Correct bit * byte and bit * float script comparisons (#117404)
I goofed on the bit * byte and bit * float comparisons. Naturally, these
should be big-endian and compare the dimensions with the binary ones
appropriately.

Additionally, I added a test to ensure that this is handled correctly.
2024-11-26 03:38:06 +11:00
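To make the big-endian point in the commit above (#117404) concrete, here is a minimal sketch of a dot product between a packed bit vector and a byte or float vector. It assumes that dimension 0 corresponds to the most significant bit of the first packed byte; the function name is hypothetical and this is not the code from the commit.

```python
def bit_dot_product(bits: bytes, values: list):
    """Dot product of a packed bit vector with a vector of 8 * len(bits) numbers."""
    if len(values) != 8 * len(bits):
        raise ValueError("values must have 8 entries per packed byte")
    total = 0
    for i, value in enumerate(values):
        # Big-endian within each byte: dimension i maps to bit (7 - i % 8),
        # so dimension 0 is the most significant bit of bits[0].
        bit = (bits[i // 8] >> (7 - i % 8)) & 1
        total += bit * value
    return total
```

With `bits = bytes([0b11000000])` only the first two dimensions contribute; a little-endian reading would instead select the last two.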
Craig Taverner
8c22fc479f
Make spatial search functions not preview (#117489) 2024-11-25 17:04:48 +01:00
padmaprasath21
b7d801809f
Update tsds-reindex.asciidoc (#117446) 2024-11-25 07:56:17 -08:00
florent-leborgne
fa9f2bff0e
Docs for starred esql queries in Kibana (#117468) 2024-11-25 15:13:23 +01:00
Philippus Baalman
fd6e8857bc
Mention bbq_hnsw for m and ef_construction options in docs (#117022) 2024-11-25 14:50:09 +01:00
Aurélien FOUCRET
ff58d891a1
ES|QL kql function. (#116764) 2024-11-25 14:22:11 +01:00
István Zoltán Szabó
339e431081
[DOCS] Documents that ELSER is the default service for semantic_text (#115769) 2024-11-25 08:07:30 -05:00
Luke Whiting
1d4c8d85f6
(#34659) - Add Timezone Configuration to Watcher (#117033)
* Add timezone support to Cron objects

* Add timezone support to CronnableSchedule

* XContent change to support parsing and display of TimeZone fields on schedules

* Case insensitive timezone parsing

* Doc changes

* YAML REST tests

* Equals, toString and HashCode now include timezone

* Additional random testing for DST transitions

* Migrate Cron class to use wrapped LocalDateTime

The algorithm depends on some quirks of Calendar, but LocalDateTime
correctly ignores DST during calculations, so this uses a LocalDateTime
with a wrapper to emulate some of Calendar's behaviours that the Cron
algorithm depends on.

* Additional documentation to explain discontinuity event behaviour

* Remove redundant conversions from ZoneId to TimeZone following move to LocalDateTime

* Add documentation warning that manual clock changes will cause unpredictable watch execution

* Update docs/reference/watcher/trigger/schedule.asciidoc

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>

---------

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
2024-11-25 09:51:11 +00:00
Larisa Motova
7e801e0410
[ES|QL] Add a standard deviation function (#116531)
Uses Welford's online algorithm, as well as the parallel version, to
calculate standard deviation.
2024-11-22 12:33:46 -10:00
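For readers unfamiliar with the technique named in the commit above (#116531), the following is a minimal sketch of Welford's online update together with the parallel combination step for merging partial states. The class and function names are hypothetical, and the population form of the standard deviation is an assumption made for the example, not a statement about what the ES|QL function returns.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Running count, mean, and sum of squared deviations from the mean (m2)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def add(self, x: float) -> None:
        # Welford's single-pass update: no need to keep the values themselves.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Parallel variant: combine two partial states computed independently.
        if other.count == 0:
            return RunningStats(self.count, self.mean, self.m2)
        if self.count == 0:
            return RunningStats(other.count, other.mean, other.m2)
        n = self.count + other.count
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / n
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / n
        return RunningStats(n, mean, m2)

    def std_dev(self) -> float:
        # Population standard deviation (assumption); NaN when there is no data.
        return math.sqrt(self.m2 / self.count) if self.count else float("nan")


def stats_of(values) -> RunningStats:
    """Build a state from an iterable, as one node or shard might."""
    stats = RunningStats()
    for value in values:
        stats.add(value)
    return stats
```

Merging `stats_of(part_a)` with `stats_of(part_b)` gives the same result, up to floating-point rounding, as a single pass over the concatenated data, which is what makes the approach suitable for distributed aggregation.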
Nik Everett
4ecc7518ef
ESQL: Add docs for MV_PERCENTILE (#117377)
We built this a while back. Let's document it.
2024-11-23 06:41:18 +11:00
Nik Everett
893dfd3c9a
ESQL: Make WEIGHTED_AVG not preview (#117356)
It's not PREVIEW.
2024-11-22 16:28:06 +00:00
Bogdan Pintea
1fe3ed1e85
Add docs for aggs filtering (#116681)
Add documentation for aggs filtering (the WHERE in STATS command).

Fixes: #115083
2024-11-22 13:26:30 +01:00
Luigi Dell'Aquila
a1247b3e60
ES|QL: fix validation of SORT by aggregate functions (#117316) 2024-11-22 12:12:09 +01:00
Slobodan Adamović
6ea3e01958
Upgrade Bouncy Castle FIPS dependencies (#112989)
This PR updates `bc-fips` and `bctls-fips` dependencies to the latest
minor versions.
2024-11-22 21:39:25 +11:00
Lisa Cawley
8fe8d22f7c
[DOCS] Remove broken migration guide link (#117293) 2024-11-21 14:02:18 -08:00
elasticsearchmachine
b378a1bb54
Bump 8.x to 8.18.0 2024-11-21 14:37:05 -05:00
Carlos Delgado
ea4b41fca8
ESQL - match operator included in non-snapshot builds (#116819) 2024-11-21 07:45:22 +01:00
Mark Tozzi
c3f73d0319
Esql Enable Date Nanos (#117080)
This enables date nanos support as tech preview. Basic operations, like reading values, binary comparisons, and functions that don't care about type should work, but some functions are not yet supported. Most notably, Bucket is not yet supported, although Date_Trunc is and can be used for grouping. See the docs for the full list of limitations.

relates to #109352
2024-11-20 09:31:01 -05:00
Costin Leau
bc785f5ca1
Esql/lookup join grammar (#116515)
First PR for adding LOOKUP JOIN in ESQL.
Introduces grammar and wires the main building blocks to execute a query; follow-ups are required (see #116208 for more details).

Co-authored-by: Nik Everett <nik9000@users.noreply.github.com>
2024-11-19 17:52:24 -08:00
Stef Nestor
72c44595f4
(Doc+) link videos for allocation and ilm (#116880)
* (Doc+) link videos for allocation and ilm

---------

Co-authored-by: shainaraskas <58563081+shainaraskas@users.noreply.github.com>
2024-11-19 14:43:50 -07:00
Liam Thompson
c699af2c67
[DOCS] Rename how-to subsection, move recipes to search relevance (#117044) 2024-11-19 18:27:05 +01:00
Craig Taverner
f3cd48209e
Added stricter range type checks and runtime warnings for ENRICH (#115091)
It has been noted that strange or incorrect error messages are returned if the ENRICH command uses incompatible data types, for example, a KEYWORD with value 'foo' used in an int_range match: https://github.com/elastic/elasticsearch/issues/107357

This error is thrown at runtime and contradicts the ES|QL policy of only throwing errors at planning time, while at runtime we should instead set results to null and add a warning. However, we could make the planner stricter and block potentially mismatching types earlier.

However, runtime parsing of KEYWORD fields has been a feature of ES|QL ENRICH since its inception; in particular, we even have tests asserting that KEYWORD fields containing parsable IP data can be joined to an ip_range ENRICH index.

In order to not create a backwards compatibility problem, we have compromised with the following:

* Strict range type checking at the planner time for incompatible range types, unless the incoming index field is KEYWORD
* For KEYWORD fields, allow runtime parsing of the fields, but when parsing fails, set the result to null and add a warning

Added extra tests to verify behaviour of match policies on non-keyword fields. They all behave as keywords (the enrich field is converted to keyword at policy execution time, and the input data is converted to keyword at lookup time).
2024-11-19 16:34:21 +01:00
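The compromise described in the commit above (#115091) can be sketched as two checks: a strict one at planning time that lets KEYWORD through, and a lenient one at runtime that yields null plus a warning. The compatibility table and function names below are hypothetical and cover only two range types for illustration.

```python
import warnings

# Hypothetical table: range type -> non-keyword field types accepted at planning time.
COMPATIBLE = {"int_range": {"integer"}, "ip_range": {"ip"}}


def check_types_at_planning(range_type: str, field_type: str) -> None:
    """Fail early on mismatched types, except KEYWORD, which is parsed at runtime."""
    if field_type == "keyword":
        return
    if field_type not in COMPATIBLE.get(range_type, set()):
        raise TypeError(f"{field_type} field cannot match a {range_type} policy")


def match_int_range(value, low: int, high: int):
    """Runtime match for a KEYWORD input: None (null) plus a warning if unparsable."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        warnings.warn(f"could not parse {value!r} as an integer; result set to null")
        return None
    return low <= number <= high
```

Under this sketch, a KEYWORD value of `'foo'` no longer raises at runtime; it produces a null result and a warning instead.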
Simon Cooper
b30a4b23f2
Output a consistent format when generating error json (#90529)
Now, error fields will always have 'type' and 'reason' fields, and the information in those fields is the same regardless of whether the output is detailed or not.
2024-11-19 13:35:04 +00:00
Fang Xing
d33bff6468
[ES|QL][DOCS] Add docs for date_period and time_duration (#116368)
* add docs for date_period and time_duration
2024-11-19 07:48:35 -05:00
Bogdan Pintea
b5addca40a
ESQL: Docs: COUNT: add an explanation to the use of the 3VL (#116684)
Add an explanation of why `... OR NULL` is needed with `COUNT(...)`.

Fixes: #99954
2024-11-19 10:37:47 +01:00
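The reasoning behind `... OR NULL` in the commit above (#116684) follows from three-valued logic: a count skips nulls but not false values, and `false OR NULL` evaluates to null. The sketch below models this in Python, with `None` standing in for NULL; the helper names are hypothetical.

```python
def or3(a, b):
    """Three-valued OR: True wins; otherwise any NULL (None) makes the result NULL."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def count(values):
    """COUNT-style aggregate: counts every non-null value, including False."""
    return sum(1 for v in values if v is not None)
```

For a column of booleans such as `[True, False, True, False]`, counting the raw values counts every row, while counting `or3(v, None)` for each value counts only the rows where the condition holds.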
Jason Tu
efc3ba9958
Update indexing-speed.asciidoc (#116559) 2024-11-18 13:17:17 -05:00
Peter Straßer
c804953105
Provide access to new settings for HyphenationCompoundWordTokenFilter (#115585)
Allow use of the new flags added to Lucene's HyphenationCompoundWordTokenFilter.

Adds access to the two new flags no_sub_matches and no_overlapping_matches.

Lucene issue: https://github.com/apache/lucene/issues/9231
2024-11-18 17:38:49 +01:00
Luca Cavanna
99689281e0
Remove support for deprecated force_source highlighting parameter (#116943)
force_source has been parsed as a no-op since 8.8. This commit removes support
for it at the REST layer, meaning a search request that provides it now gets an error back.
2024-11-18 17:36:39 +01:00
Cauê Marcondes
e019fc03e0
Remove apm_user role (#116712)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-11-18 15:28:30 +00:00
Liam Thompson
4e17c61d39
[DOCS] Remove 'rescore' from retriever.asciidoc (#116921) 2024-11-18 11:34:28 +01:00
Sean Story
f55e5d020b
Note a limitation in Basic Sync Rules (#116859) 2024-11-18 09:33:03 +01:00
shainaraskas
2d2ad00872
fix formatting errors (#116843) 2024-11-14 15:45:16 -05:00
Liam Thompson
a193fc34a3
[Docs] Link to ECK Azure snapshot docs (#111586) 2024-11-14 18:39:38 +01:00
Brendan Cully
b77df851b1
Add warning about restart migration (#116769)
We have gotten more than one SDH due to customers not understanding
why restarts involving fully-mounted indices can pull a lot of data
from the snapshot tier, so it may help to be more explicit about
why this happens and how it can be avoided.
2024-11-14 18:07:09 +01:00
Gal Lalouche
c45977a5fd
[ESQL] Update docs format (missing space before '=') (#116808) 2024-11-14 16:05:28 +02:00
Luke Whiting
2f26ec2351
Introduce Email Address Allow Lists For Watcher (#116672)
* New setting plus mutual exclusiveness validation

* New domain list checking

* Email service tests

* Documentation updates

* PR Changes

Fix comment
2024-11-14 12:38:14 +01:00
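The first two bullets of the commit above (#116672), mutual exclusiveness validation and domain list checking, might look roughly like the sketch below. Everything here is an assumption made for illustration: the function names, the wildcard matching, and the choice to allow all recipients when no list is configured are not taken from the commit.

```python
from fnmatch import fnmatchcase


def validate_allowlists(domain_allowlist, recipient_allowlist):
    """Reject configurations that set both lists at once."""
    if domain_allowlist and recipient_allowlist:
        raise ValueError("configure either the domain list or the recipient list, not both")


def is_recipient_allowed(address, domain_allowlist=(), recipient_allowlist=()):
    """Check one address against whichever list is configured."""
    validate_allowlists(domain_allowlist, recipient_allowlist)
    address = address.strip().lower()
    if recipient_allowlist:
        # Assumption: patterns are shell-style wildcards over the full address.
        return any(fnmatchcase(address, p.lower()) for p in recipient_allowlist)
    if domain_allowlist:
        domain = address.rpartition("@")[2]
        return any(fnmatchcase(domain, p.lower()) for p in domain_allowlist)
    return True  # assumption: no list configured means no restriction
```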
Gal Lalouche
591cd591ad
[ES|QL] Update length docs (#116734)
ESQL Update length docs (#116734)
2024-11-14 13:14:43 +02:00
Fang Xing
b37a829efa
[ES|QL] Implicit casting string literal to intervals in EsqlScalarFunction and GroupingFunction (#115814)
* implicit casting from string literals to datetime intervals
2024-11-13 18:25:06 -05:00
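As a rough illustration of the implicit casting in the commit above (#115814), the sketch below turns a string literal such as `"3 hours"` into a typed interval. The unit tables are partial and assumed for the example, the function name is hypothetical, and the real grammar may accept other spellings.

```python
import re

# Assumed, partial unit tables; not the full set of units the language accepts.
TIME_UNITS = {"millisecond", "second", "minute", "hour"}
DATE_UNITS = {"day", "week", "month", "year"}

_LITERAL = re.compile(r"\s*(\d+)\s*([A-Za-z]+)\s*")


def cast_interval_literal(text: str):
    """Turn a string such as '3 hours' into (kind, quantity, unit)."""
    match = _LITERAL.fullmatch(text)
    if match is None:
        raise ValueError(f"not an interval literal: {text!r}")
    quantity = int(match.group(1))
    unit = match.group(2).lower()
    if unit.endswith("s"):
        unit = unit[:-1]  # accept simple plurals such as "hours"
    if unit in TIME_UNITS:
        return ("time_duration", quantity, unit)
    if unit in DATE_UNITS:
        return ("date_period", quantity, unit)
    raise ValueError(f"unknown interval unit: {unit!r}")
```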
Kathleen DeRusso
1b03a96e52
Add tracking for query rule types (#116357)
* Add total rule type counts to list calls and xpack usage

* Add feature

* Update docs/changelog/116357.yaml

* Fix docs test failure & update yaml tests

* remove additional spaces

---------

Co-authored-by: Mark J. Hoy <mark.hoy@elastic.co>
2024-11-13 17:05:05 +01:00
Max Hniebergall
d1788af03f
Update service-elser.asciidoc (#116272) 2024-11-13 08:42:07 -05:00