Commit graph

3409 commits

Author SHA1 Message Date
ChrisHegarty
bdcdd2e8b9 Merge branch 'main' into lucene_snapshot 2024-05-07 12:50:34 +01:00
Moritz Mack
b71fc0c561
Migrate remaining usage of skip version in YAML specs to cluster_features (#108055) 2024-05-07 09:42:17 +02:00
elasticsearchmachine
e42f38c5c7 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-02 10:01:20 +00:00
David Kyle
53252c67b3
[ML] Remove mention of models in inference actions (#107704)
Renames the GET, PUT and DELETE inference APIs removing the model parts:
inference.delete_model -> inference.delete
inference.get_model -> inference.get
inference.put_model -> inference.put
The GET response now has an endpoints field instead of models
2024-05-01 15:46:50 +01:00
elasticsearchmachine
2f6283d181 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-01 10:02:01 +00:00
Kostas Krikellas
098f1fd960
Track synthetic source for disabled objects (#108051)
* Track synthetic source for disabled objects

* Update docs/changelog/108051.yaml

* minor refactor

* remove redundant method

* no need to skipChildren

* add mixed test

* add mixed yaml test

* remove extra line
2024-05-01 08:59:28 +03:00
Oleksandr Kolomiiets
a810f87235
Fix off by one error when handling null values in range fields (#107977) 2024-04-30 13:39:37 -07:00
ChrisHegarty
5596151e18 Merge branch 'main' into lucene_snapshot 2024-04-30 12:04:17 +01:00
Ioana Tagirta
e784706b4e
Allow typed_keys for search application Search API (#108007)
* Allow typed_keys for search application Search API

* Update docs/changelog/108007.yaml

* Use RestSearchAction.TYPED_KEYS_PARAM
2024-04-30 11:31:45 +02:00
Moritz Mack
f4fac1e545
Update README for YAML Rest API specs (#107837) 2024-04-30 10:53:53 +02:00
eyalkoren
ee262954ee
Adding aggregations support for the _ignored field (#101373)
Enables aggregations on the _ignored metadata field replacing the stored field
with doc values.
2024-04-29 16:41:34 +02:00
elasticsearchmachine
3b835a608f Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-29 10:02:03 +00:00
Kostas Krikellas
68b5336e75
[TEST] restore synthetic source yaml test (#107991)
* [TEST] simplify synthetic source yaml test

* [TEST] restore synthetic source yaml test
2024-04-29 11:16:07 +03:00
Jim Ferenczi
4380cd1bd5
Allow rescorer with field collapsing (#107779)
This change adds the support for rescoring collapsed documents.
The rescoring is applied on the top document per group on each shard.

Closes #27243
2024-04-29 08:48:12 +01:00
elasticsearchmachine
1799b7e426 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-27 10:02:13 +00:00
Nhat Nguyen
01cc967ce9
Mute synthetic source YAML tests (#107958)
Relates #107567
2024-04-26 09:34:11 -07:00
Kostas Krikellas
0c41cb7e71
[TEST] simplify synthetic source yaml test (#107949) 2024-04-26 16:24:20 +03:00
Kostas Krikellas
3183e6d6c9
Add ignored field values to synthetic source (#107567)
* Add ignored field values to synthetic source

* Update docs/changelog/107567.yaml

* initialize map

* yaml fix

* add node feature

* add comments

* small fixes

* missing cluster feature in yaml

* constants for chars, stored fields

* remove duplicate method

* throw exception on parse failure

* remove Base64 encoding

* add assert on IgnoredValuesFieldMapper::write

* changes from review

* simplify logic

* add comment

* rename classes

* rename _ignored_values to _ignored_source

* rename _ignored_values to _ignored_source
2024-04-26 15:35:31 +03:00
elasticsearchmachine
18361d78c1 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-26 10:01:26 +00:00
Oleksandr Kolomiiets
4ef8b3825e
Revert "Format default values of IP ranges to match other range bound" (#107910)
This reverts commit acb5139 (a part of #107081). This commit impacts search behaviour for IP range fields and so needs to be reverted.
2024-04-25 10:27:32 -07:00
elasticsearchmachine
218f99fc2a Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-25 10:02:28 +00:00
Oleksandr Kolomiiets
ee566e4f13
[TEST] Use GET API instead of search in range field synthetic source tests (#107874) 2024-04-24 14:08:21 -07:00
Oleksandr Kolomiiets
cde894a5ce
Implement synthetic source support for range fields (#107081)
* Implement synthetic source support for range fields

This PR adds basic synthetic source support for range fields. The
synthetic source produced has the following notable properties:
* Ranges are always normalized to be inclusive on both ends (this is how
they are stored).
* Original order of ranges is not preserved.
* Date ranges are always expressed in epoch millis, format is not
preserved.
* IP ranges are always expressed as a range of IPs, even if they were
originally provided as a CIDR.

This PR only implements retrieval of data for source reconstruction from
doc values.
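
The normalization properties listed above can be illustrated with a minimal Python sketch. This is not the Elasticsearch implementation: the helper names `to_inclusive` and `cidr_to_range` are hypothetical, and the integer case assumes that adjacent values differ by 1.

```python
import ipaddress


def to_inclusive(lower, upper, include_lower, include_upper):
    """Normalize an integer range so that both ends are inclusive.

    Assumption: values are integers, so an exclusive bound can be
    turned into an inclusive one by stepping by 1.
    """
    lo = lower if include_lower else lower + 1
    hi = upper if include_upper else upper - 1
    return lo, hi


def cidr_to_range(cidr):
    """Express a CIDR block as its first and last address."""
    net = ipaddress.ip_network(cidr, strict=False)
    return str(net[0]), str(net[-1])
```

For instance, `to_inclusive(1, 10, False, False)` yields `(2, 9)`, and `cidr_to_range("192.168.0.0/24")` yields `("192.168.0.0", "192.168.0.255")`. Once only the two bounds are kept, the original notation (exclusive bounds or CIDR) is no longer available.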
2024-04-24 11:32:20 -07:00
elasticsearchmachine
76f377b08b Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-24 10:01:46 +00:00
Kostas Krikellas
9ae0b0ce92
[TEST] Update version skip for fields with ignore_malformed (#107811)
Support for boolean field mappers was added in #94121

Fixes #107810
2024-04-24 04:54:50 -04:00
Kostas Krikellas
52e4385f1d
Test bool and numeric fields for ignore_malformed (#107769) 2024-04-23 16:32:44 +03:00
elasticsearchmachine
451d8e128f Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-23 10:01:27 +00:00
Rassyan
13dd1694f4
Add _name support for top level knn clauses (#107645)
Resolves #106254

This PR introduces `_name` support for top-level kNN queries, enabling named query functionality consistent with other query types. Changes include serialization/deserialization of the `_name` field within `KnnSearchBuilder` and handling within `KnnScoreDocQueryBuilder`.

Key Changes:
- Added `_name` field to `KnnSearchBuilder`.
- Modified serialization to include `_name`.
- Ensured `_name` is processed during query execution and included in the response.

Tests:
- Updated existing tests to cover `_name` functionality.
- Added new tests to ensure correct serialization/deserialization and response behavior.
2024-04-22 17:13:27 -04:00
Oleksandr Kolomiiets
8ed92db288
Add synthetic source support for binary fields (#107549)
Add synthetic source support for binary fields
2024-04-22 10:06:39 -07:00
Chris Hegarty
03dac8aa84
ES|QL: Update the REST API specification "stability" property from experimental to stable (#107697)
This commit updates the REST API specification "stability" property from experimental to stable, as the ES|QL endpoints move towards GA.
2024-04-22 16:22:35 +01:00
ChrisHegarty
4122d1a61a Merge branch 'main' into lucene_snapshot 2024-04-22 09:38:51 +01:00
Howard
fdbb21bba4
Support effective watermark thresholds in node stats API (#107244)
Adds to the `fs` component of the node stats API some additional values
indicating the disk watermarks that are currently in effect.

Relates #106676
2024-04-18 09:57:28 -04:00
Mary Gouseti
732c7c4c30
[DSL] Remove REST APIs for global retention (#107565) 2024-04-17 21:36:26 +03:00
Luca Cavanna
19db4903d1 Update skip for profile yaml tests following #107551 2024-04-17 20:22:53 +02:00
Luca Cavanna
223e7f829b
Avoid attempting to load the same empty field twice in fetch phase (#107551)
During the fetch phase, there are a number of stored fields that are requested explicitly or loaded by default. That information is included in the `StoredFieldsSpec` that each fetch sub phase exposes.

We attempt to provide stored fields that are already loaded to the fields lookup that scripts as well as value fetchers use to load field values (via `SearchLookup`). This is done in `PreloadedFieldLookupProvider`. The current logic makes available values for fields that have been found, so that scripts or value fetchers that request them don't load them again ad-hoc. Stored fields that don't have a value for a specific doc, however, are treated like any other field that was not requested and are loaded again, although they will not be found, which causes overhead.

This change makes available to `PreloadedFieldLookupProvider` the list of required stored fields, so that it can better distinguish between fields that we already attempted to load (although we may not have found a value for them) and those that need to be loaded ad-hoc (for instance because a script is requesting them for the first time).

This is an existing issue, that has become evident as we moved fetching of metadata fields to `FetchFieldsPhase`, that relies on value fetchers, and hence on `SearchLookup`. We end up attempting to load default metadata fields (`_ignored` and `_routing`) twice when they are not present in a document, which makes us call `LeafReader#storedFields` additional times for the same document providing a `SingleFieldVisitor` that will never find a value.

Another existing issue that this PR fixes is for the `FetchFieldsPhase` to extend the `StoredFieldsSpec` that it exposes to include the metadata fields that the phase is now responsible for loading. That results in `_ignored` being included in the output of the debug stored fields section when profiling is enabled. The fact that it was previously missing is an existing bug (it was missing in `StoredFieldLoader#fieldsToLoad`).

Yet another existing issue that this PR fixes is that `_id` has until now always been loaded on demand when requested via fetch fields or script. That is because it is not part of the preloaded stored fields that the fetch phase passes over to the `PreloadedFieldLookupProvider`. That causes overhead as the field has already been loaded, and should not be loaded once again when explicitly requested.
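
The distinction this change introduces, between fields that were already attempted and fields that still need an ad-hoc load, can be sketched roughly as follows. This is a simplified Python model, not the actual `PreloadedFieldLookupProvider` code; the class name, the `loader` callable and the method names are all hypothetical.

```python
class PreloadedFieldLookup:
    """Toy model of a lookup that remembers which fields were attempted."""

    def __init__(self, loader):
        # Hypothetical callable: field name -> list of values (ad-hoc load).
        self._loader = loader
        self._attempted = set()
        self._values = {}

    def set_preloaded(self, required_fields, found_values):
        # required_fields: every stored field the fetch phase tried to load.
        # found_values: only those fields that had a value for this doc.
        self._attempted = set(required_fields)
        self._values = dict(found_values)

    def get(self, field):
        if field in self._attempted:
            # Already attempted: return what was found, possibly nothing,
            # without going back to the stored fields a second time.
            return self._values.get(field, [])
        # Never attempted (e.g. first requested by a script): load ad-hoc.
        return self._loader(field)
```

Tracking only the found values, as the earlier logic is described to do, would send a field such as `_ignored` back to the loader whenever the document has no value for it.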
2024-04-17 19:37:04 +02:00
elasticsearchmachine
daa63d20ec Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-04-17 10:01:46 +00:00
David Kyle
f8fe610966
[ML] Add GET _inference for all inference endpoints (#107517) 2024-04-16 17:15:59 +01:00
Benjamin Trent
b7eba586d3
Fixing failing tests for lucene_snapshot (#107437)
* Adding confidence_interval to one of the tests

* Fixing mapper testKnnQuantizedHNSWVectorsFormat

* Adding deterministic confidence interval for int8 flat
2024-04-16 08:29:07 -04:00
Jedr Blaszyk
30be6c8f84
[Connector API] Add update filtering validation and activate draft endpoints (#107457) 2024-04-15 17:26:47 +02:00
Jedr Blaszyk
676c89e6ed
[Connector API] Unify API JSON rest spec (#107476) 2024-04-15 17:26:21 +02:00
Jonathan Buttner
001680b8e7
Muting fetch fields with none stored_fields (#107468)
Muting https://github.com/elastic/elasticsearch/issues/107466
2024-04-15 09:35:08 -04:00
Moritz Mack
e70e5397b7
Remove historical features for YAML REST tests in favor of synthetic version features (#107393) 2024-04-15 13:58:22 +02:00
Salvatore Campagna
4dfcb0897e
Fetch meta fields in FetchFieldsPhase using ValueFetcher (#106325)
Here we extract the logic to populate metadata fields such as _ignored, _routing, _size and the deprecated _type into FetchFieldsPhase so that we can use the ValueFetcher interface to retrieve field values. This allows us to fetch values no matter if the Mapper uses stored or doc values.
2024-04-15 11:02:18 +02:00
Benjamin Trent
2ea5e0804c
Fix flaky retriever test (#107380) 2024-04-12 09:52:16 +02:00
Chris Hegarty
6b52d7837b
Add an optimised int8 vector distance function for aarch64. (#106133)
This commit adds an optimised int8 vector distance implementation for aarch64. Additional platforms like, say, x64, will be added as a follow-up.

The vector distance implementation outperforms Lucene's Panama Vector implementation for binary comparisons by approx 5x (depending on the number of dimensions). It does so by means of compiler intrinsics built into a separate native library and linked by Panama's FFI. Comparisons are performed on off-heap mmap'ed vector data.

The implementation is currently only used during merging of scalar quantized segments, through a custom format ES814HnswScalarQuantizedVectorsFormat, but its usage will likely be expanded over time.
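
For orientation, the kind of quantity an int8 vector distance function computes can be written as a plain scalar reference in Python. This is only an illustration of the arithmetic, not the native implementation; the commit message does not say which distance functions are covered, so the choice of dot product and squared Euclidean distance here is an assumption, and both function names are hypothetical.

```python
def _check_int8(vector):
    # int8 components must fit in a signed byte.
    if any(v < -128 or v > 127 for v in vector):
        raise ValueError("components must be in [-128, 127]")


def dot_product_int8(a, b):
    """Scalar reference: sum of element-wise products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    _check_int8(a)
    _check_int8(b)
    return sum(x * y for x, y in zip(a, b))


def square_distance_int8(a, b):
    """Scalar reference: sum of squared element-wise differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    _check_int8(a)
    _check_int8(b)
    return sum((x - y) ** 2 for x, y in zip(a, b))
```

A vectorized native version processes many of these byte lanes per instruction, and the speedup reported above comes from that.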

Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co>
Co-authored-by: Mark Vieira <portugee@gmail.com>
Co-authored-by: Ryan Ernst <ryan@iernst.net>
2024-04-12 08:44:21 +01:00
Moritz Mack
1f5e04b721
Migrate YAML REST tests to synthetic cluster feature check (#107068)
To simplify the migration away from version based skip checks in YAML specs, 
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.

New test specs for 8.14 or later are expected to use respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if sufficient.
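
A rough Python sketch of how a synthetic `gte_vX.Y.Z` feature could be evaluated against a cluster version is shown below. It only models the rule stated above; the function names are hypothetical, and the assumption that the check is a plain numeric comparison of major, minor and patch is mine, not taken from the test framework.

```python
PREFIX = "gte_v"
LAST_SYNTHETIC = (8, 14, 0)  # synthetic features exist only up to 8.14.0


def parse_version(text):
    """Turn 'X.Y.Z' into a comparable tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def has_synthetic_feature(feature, cluster_version):
    """Return True if the cluster version satisfies a gte_vX.Y.Z feature."""
    if not feature.startswith(PREFIX):
        raise ValueError("not a synthetic version feature: " + feature)
    required = parse_version(feature[len(PREFIX):])
    if required > LAST_SYNTHETIC:
        # Later versions are expected to use real cluster features instead.
        raise ValueError("no synthetic feature after 8.14.0: " + feature)
    return parse_version(cluster_version) >= required
```

Under these assumptions, `has_synthetic_feature("gte_v8.12.0", "8.13.2")` is true, while a spec that needs behaviour introduced after 8.14.0 has no such feature to rely on.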
2024-04-11 18:22:38 +02:00
Jedr Blaszyk
07aa9cd998
[Connector API] Support cleaning up sync jobs when deleting a connector (#107253) 2024-04-11 11:40:19 +02:00
Parker Timmins
75228dfd45
Add granular error list to alias action response (#106514)
When an alias action list is posted with must_exist==false and succeeds only partially, a list of results for each action is now returned. The results contain information about the requested action, indices, and aliases. If must_exist==true, or all actions fail, the call will return a 400 status along with the associated exception.
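
The decision rule described above can be summarized in a small Python sketch. It models only the status logic; the function name and the response keys (`acknowledged`, `results`, `error`) are hypothetical and are not taken from the actual API response.

```python
def summarize_alias_actions(must_exist, results):
    """Model the response rule for a list of alias action results.

    results: list of dicts, each with an 'error' entry that is None on
    success or an error description on failure (hypothetical shape).
    """
    failures = [r for r in results if r["error"] is not None]
    if not failures:
        return 200, {"acknowledged": True}
    if must_exist or len(failures) == len(results):
        # Strict mode, or nothing succeeded: fail the whole call.
        return 400, {"error": failures[0]["error"]}
    # Partial success with must_exist == false: report per-action results.
    return 200, {"acknowledged": True, "results": results}
```

With one successful and one failed action, this returns a 200 with the per-action list when `must_exist` is false, and a 400 when it is true.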
2024-04-09 12:11:49 -05:00
Niels Bauman
0f3ac367ac
Rename values of FailureStoreOptions (#107062)
With these new values, there's a better match between selecting failure stores in read and write operations.
2024-04-06 08:46:38 +02:00
Benjamin Trent
5a9a9b87ac
Adding tests and fixing test failure #106964 (#107118)
closes: https://github.com/elastic/elasticsearch/issues/106964
2024-04-05 13:55:22 -04:00