Commit graph

332 commits

Author SHA1 Message Date
Armin Braun
f461f90d48
Remove redundant marker interfaces that extend Bucket (#127038)
No need to have these marker interfaces around when we're not using them anywhere; all they do is hide a lot of code duplication. Removing them sets up the possible removal of hundreds of lines of downstream code, it seems.
2025-04-18 18:26:39 +02:00
David Turner
77a3d30d26
Remove trappy timeouts from IndicesAliasesRequest (#123987)
Relates #107984
2025-03-05 02:11:50 +11:00
Luca Cavanna
8baba58529
Address TopFieldCollectorManager and TopScoreDocCollectorManager related deprecation warnings (#123615)
The supportsConcurrency flag has been deprecated in Lucene,
see https://github.com/apache/lucene/pull/13977 .
2025-03-03 10:38:52 +01:00
Ignacio Vera
9a9bc69883
Stop caching source map on SearchHit#getSourceMap (#119888)
This call has the side effect that, if you iterate over a number of hits and call this method on each,
memory usage grows by a non-trivial amount, which in most cases is unwanted. Therefore this commit removes the caching
altogether and adds an assertion that the method is called only once during the lifetime of the object.
2025-01-23 17:28:52 +01:00
Rene Groeschke
ba61f8c7f7
Update Gradle wrapper to 8.12 (#118683)
This updates the gradle wrapper to 8.12

We addressed the deprecation warnings caused by the update, including:

- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions
2024-12-30 15:34:24 +01:00
Oleksandr Kolomiiets
2b8e4e727c
Migrate mapper-related modules to internal-*-rest-test (#117298)
2024-11-23 00:35:24 +00:00
Ignacio Vera
9296fb40ff
Use LongArray instead of long[] for owning ordinals when building Internal aggregations (#116874)
This commit changes the signature of InternalAggregation#buildAggregations(long[]) to
InternalAggregation#buildAggregations(LongArray) to avoid allocations of humongous arrays.
2024-11-19 12:26:37 +01:00
Kostas Krikellas
4573ab8ec1
[TEST] Replace _source.mode with index.mapping.source.mode in integration tests - take 2 (#116072)
* Reapply "[TEST] Replace _source.mode with index.mapping.source.mode in integra…" (#116069)

This reverts commit e8bf344a28.

* [TEST] Replace _source.mode with index.mapping.source.mode in integration tests

* add reason

* add reason

* spotless

* revert unneeded
2024-11-04 09:39:34 +02:00
Kostas Krikellas
e8bf344a28
Revert "[TEST] Replace _source.mode with index.mapping.source.mode in integra…" (#116069)
This reverts commit a360757968.
2024-11-01 10:53:08 +02:00
Kostas Krikellas
a360757968
[TEST] Replace _source.mode with index.mapping.source.mode in integration tests (#115926)
* Replace _source.mode with index.mapping.source.mode in integration tests

* fix tests

* revert 40_source_mode_setting.yml
2024-11-01 09:46:06 +02:00
Luca Cavanna
8efd08b019
Upgrade to Lucene 10 (#114741)
The most relevant ES changes that upgrading to Lucene 10 requires are:

- use the appropriate IOContext
- Scorer / ScorerSupplier breaking changes
- Regex automata are no longer determinized by default
- minimize moved to test classes
- introduce Elasticsearch900Codec
- adjust slicing code according to the added support for intra-segment concurrency
- disable intra-segment concurrency in tests
- adjust accessor methods for many Lucene classes that became a record
- adapt to breaking changes in the analysis area

Co-authored-by: Christoph Büscher <christophbuescher@posteo.de>
Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
Co-authored-by: ChrisHegarty <chegar999@gmail.com>
Co-authored-by: Brian Seeders <brian.seeders@elastic.co>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: Panagiotis Bailis <pmpailis@gmail.com>
Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>
2024-10-21 13:38:23 +02:00
Mark Vieira
a59c182f9f
Add AGPLv3 as a supported license
2024-09-13 15:29:46 -07:00
Mark Vieira
4ce661cc48
Bump Elasticsearch version to 9.0.0 (#112570)
2024-09-11 09:40:11 -07:00
Kostas Krikellas
f3bc281978
Refactor build params for FieldMapper, adding SourceKeepMode (#112455)
* Refactor build params for FieldMapper

* more mappers and tests

* more mappers

* more mappers

* spotless

* spotless

* stored by default

* Revert "stored by default"

This reverts commit bbd247d64b.

* restore storeIgnored

* sync

* list valid values for SourceKeepMode

* small refactoring

* spotless
2024-09-06 14:16:17 +03:00
Christoph Büscher
5e455db10e
Revert "Remove Scorable#docID implementations"
This reverts commit 55ed03fddf.
2024-08-29 10:04:27 +02:00
Christoph Büscher
55ed03fddf
Remove Scorable#docID implementations
This method was removed in https://github.com/apache/lucene/pull/12407 so
we also need to remove it in implementations of Scorable.
2024-08-29 10:03:28 +02:00
Ignacio Vera
e3e44af4cf
Remove InternalAggregations#asMap and InternalAggregations#getAsMap methods (#110250)
These methods are not needed, as we have InternalAggregations#get to find aggregations by name.
2024-07-01 08:27:14 +02:00
Luca Cavanna
915e4a50c5
Rename Mapper#name to Mapper#fullPath (#110040)
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while the MappedFieldType name does.

We have renamed Mapper.Builder#name to leafName (#109971) and Mapper#simpleName to leafName (#110030). This commit renames Mapper#name to fullPath for clarity.
This required some adjustments in FieldAliasMapper to avoid confusion between the existing path method and fullPath. I renamed path to targetPath for clarity.
ObjectMapper already had a fullPath method that returned name, and was effectively a copy of name, so it could be removed.
2024-06-21 22:47:27 +02:00
Luca Cavanna
54e7b4d93b
Rename Mapper#simpleName to Mapper#leafName (#110030)
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while
the MappedFieldType name does. We have a method called simpleName to signal that, but leafName signals that more clearly and aligns with
the name we have recently introduced in Mapper.Builder (renamed from name to leafName).

Relates to #109971
2024-06-21 14:28:36 +02:00
Luca Cavanna
15c7abe111
Rename Mapper#name to Mapper#leafName (#109971)
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while
the MappedFieldType name does.
2024-06-21 11:48:17 +02:00
Oleksandr Kolomiiets
1080425a65
Enable fallback synthetic source by default (#109370)
2024-06-07 09:21:22 -07:00
Moritz Mack
b71fc0c561
Migrate remaining usage of skip version in YAML specs to cluster_features (#108055)
2024-05-07 09:42:17 +02:00
eyalkoren
ee262954ee
Adding aggregations support for the _ignored field (#101373)
Enables aggregations on the _ignored metadata field replacing the stored field
with doc values.
2024-04-29 16:41:34 +02:00
Luca Cavanna
223e7f829b
Avoid attempting to load the same empty field twice in fetch phase (#107551)
During the fetch phase, there are a number of stored fields that are requested explicitly or loaded by default. That information is included in the `StoredFieldsSpec` that each fetch sub phase exposes.

We attempt to provide stored fields that are already loaded to the fields lookup that scripts as well as value fetchers use to load field values (via `SearchLookup`). This is done in `PreloadedFieldLookupProvider`. The current logic makes available values for fields that have been found, so that scripts or value fetchers that request them don't load them again ad-hoc. Stored fields that don't have a value for a specific doc, however, are treated like any other field that was not requested and are loaded again, although they will not be found, which causes overhead.

This change makes available to `PreloadedFieldLookupProvider` the list of required stored fields, so that it can better distinguish between fields that we already attempted to load (although we may not have found a value for them) and those that need to be loaded ad-hoc (for instance because a script is requesting them for the first time).

This is an existing issue, that has become evident as we moved fetching of metadata fields to `FetchFieldsPhase`, that relies on value fetchers, and hence on `SearchLookup`. We end up attempting to load default metadata fields (`_ignored` and `_routing`) twice when they are not present in a document, which makes us call `LeafReader#storedFields` additional times for the same document providing a `SingleFieldVisitor` that will never find a value.

Another existing issue that this PR fixes is for the `FetchFieldsPhase` to extend the `StoredFieldsSpec` that it exposes to include the metadata fields that the phase is now responsible for loading. That results in `_ignored` being included in the output of the debug stored fields section when profiling is enabled. The fact that it was previously missing is an existing bug (it was missing in `StoredFieldLoader#fieldsToLoad`).

Yet another existing issue that this PR fixes is that `_id` has until now always been loaded on demand when requested via fetch fields or script. That is because it is not part of the preloaded stored fields that the fetch phase passes over to the `PreloadedFieldLookupProvider`. That causes overhead as the field has already been loaded, and should not be loaded once again when explicitly requested.
2024-04-17 19:37:04 +02:00
Salvatore Campagna
4dfcb0897e
Fetch meta fields in FetchFieldsPhase using ValueFetcher (#106325)
Here we extract the logic to populate metadata fields such as _ignored, _routing, _size and the deprecated _type into FetchFieldsPhase so that we can use the ValueFetcher interface to retrieve field values. This allows us to fetch values no matter if the Mapper uses stored or doc values.
2024-04-15 11:02:18 +02:00
Moritz Mack
1f5e04b721
Migrate YAML REST tests to synthetic cluster feature check (#107068)
To simplify the migration away from version based skip checks in YAML specs, 
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.

New test specs for 8.14 or later are expected to use respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if sufficient.
2024-04-11 18:22:38 +02:00
Kostas Krikellas
e58f4b4ef9
Introduce TimeSeriesRoutingIdFieldMapper and use it to create TSDB ids (#106080)
Supporting non-keyword fields requires updating non-keyword fields in
the routing path to be included in routing calculations. Routing is
performed in coordinating nodes that lack mappings (or mappings haven't
been created yet, for dynamically-defined dimensions), so the routing
hash they calculate is passed to data nodes and stored in a new field,
namely _ts_routing_hash. This is included in the _id field, in turn, so
that it can consistently reach the right shard for get-by-id and
delete-by-id operations.

A few interesting points:

- The hash is passed from the coordinating to data nodes using the `routing` field in `IndexRequest`; adding another field to the latter requires updating dozens of classes.
- We explicitly skip (double-) storing the hash to the routing field, as the latter is not optimized for storage using the TSDB codec.
- The routing hash may not be available in Translog operations; it can then be retrieved from the `id` prefix.

Related to https://github.com/elastic/elasticsearch/issues/103567
2024-03-13 11:37:09 -04:00
Benjamin Trent
8a7dfdfe24
Deprecate allowing fields in scenarios where it's ignored (#106031)
closes: https://github.com/elastic/elasticsearch/issues/106026
2024-03-08 08:02:50 -05:00
Armin Braun
1f1636e1f7
Fix error 500 on invalid ParentIdQuery (#105693)
We need to enforce non-null values here, otherwise we'll error out and return
a 500 when a user fails to set either id or type.

closes #105366
2024-02-21 14:09:54 +01:00
Felix Barnsteiner
5920c917aa
Encapsulate Mapper.Builder#name and make it private (#105648)
This is in preparation to make the field mutable,
which is needed in the context of https://github.com/elastic/elasticsearch/pull/103542
2024-02-20 15:53:14 +01:00
John Verwolf
98a37c7b6b
Enhancement: Metrics for Search Took Times using Action Listeners (#104996)
* Instrument search took times

* Update assertion helper method to use client param

* Update docs/changelog/104996.yaml

* spotless

* Fix test
2024-02-01 12:51:12 -08:00
Ignacio Vera
79f801b754
Remove unused ParsedAggregation (#104848)
This abstraction was introduced to support the high level rest client and is not needed any more.
2024-01-29 14:59:11 +01:00
Ignacio Vera
72fe8b30d5
Remove ParsedAggregation from tests (#104790)
This commit removes any reference to ParsedAggregation from the test framework
2024-01-29 07:46:06 +01:00
John Verwolf
80b222c395
Update elasticsearch.modules.parent-join.internalClusterTest (#102189)
Part of the broader work covered in https://github.com/elastic/elasticsearch/issues/102030

Updates tests in:

- ChildQuerySearchIT
- TokenCountFieldMapperIntegrationIT
2023-12-10 00:09:02 +01:00
Armin Braun
143f4208d1
Fix remaining leaked SearchResponse issues in :server:integTests (#102896)
This should be the last round for this module, found these using a prototype
that has `SearchResponse` ref-counted already.
2023-12-04 19:09:48 +01:00
Luca Cavanna
2e0600dc4a
Enable inter-segment concurrency for terms aggs (#101390)
This commit enables inter-segment search concurrency for terms aggs when the cardinality of the field being aggregated on is lower than the shard size. This is to avoid the precision errors that would be caused by parallelizing the execution across slices for fields with high cardinality.

For terms agg that are ordered by key, we still take cardinality into account to parallelize only low cardinality fields and avoid performance overhead caused by parallelizing high cardinality fields.
2023-11-27 14:01:06 +01:00
Armin Braun
cdc83ad29b
Add shorthand for prepareIndex to test infrastructure (#101187)
Same as #101175, shorten `client().prepareIndex(index)` and
`client().prepareIndex().setIndex(index)` via a test utility.
Saves lots of code now and sets up some follow-up simplifications.
2023-11-23 15:47:36 +01:00
Armin Braun
a9c286b25c
Collapse verbose .execute().actionGet() calls in tests (#102502)
Cleaning this up a little even though it's still quite horrible.
`.get()` in this API actually means `actionGet()` so to speak.
I think a good first step to cleaning this up is to at least reduce
the duplication though and save 1k lines.
2023-11-23 10:10:10 +01:00
Armin Braun
433517ad01
Misc cleanup in o.e.search.fetch (#101939)
Just some random findings from researching other things.
Removing all kinds of dead code and fixing obvious duplication in 2 spots.
2023-11-09 13:46:00 +01:00
Armin Braun
ae6d180379
Clean up some more dead code in o.e.s.aggregations (#101820)
Another iteration of mostly automatic cleanup on top of #101806.
2023-11-07 20:07:17 +01:00
David Turner
4a37aef80b
Deprecate ExternalTestCluster (#101844)
`ExternalTestCluster` doesn't really make sense now that the transport
client is removed. We only use it in the ML integ test suite and it'd be
good to avoid expanding its usage further, so this commit deprecates it
and removes the functionality in `ESIntegTestCase` that might quietly
switch to using it in a new test suite if running with certain system
properties.

Relates #49582
2023-11-06 14:30:07 -05:00
Ignacio Vera
8a9f4fed55
Remove explicit SearchResponse references from LegacyGeo, Aggregations and parent-join modules (#101250)
2023-10-24 17:46:25 +02:00
David Turner
9794c6e205
Use ESIntegTestCase#prepareSearch more (#101179)
The refactoring in #101175 only covered all the one-arg call sites. This
PR does the rest.
2023-10-20 18:33:00 +01:00
David Turner
1eda6ac74b
Extract ESIntegTestCase#prepareSearch (#101175)
Relates #101172
2023-10-20 06:18:58 -04:00
Armin Braun
ca6295e582
Remove more explicit references to SearchResponse in tests (#101092)
Remove `assertSearchResponse` which was just an alias for
`assertNoFailures` and then cleanup many spots in the result
by combining the hit count and no failure assertion into a single
method.

follow-up to #100966
2023-10-19 17:53:13 +02:00
Armin Braun
03ea4bbe6e
Remove more explicit references to SearchResponse in tests (#101052)
Follow up to #100966 introducing new combined assertion `assertSearchHitsWithoutFailures`
to combine no-failure, count, and id assertions into one block.
2023-10-18 20:27:52 +02:00
Armin Braun
dcaba064dd
Remove more explicit SearchResponse references from test code (#100985)
Follow-up to #100966 adding more overrides to assertions that
consume a request builder.
2023-10-18 07:20:01 +02:00
Armin Braun
bae6991fb3
Remove ~600 references to SearchResponse in tests (#100966)
We'd like to make `SearchResponse` reference counted and pooled but there are around 6k
instances of tests that create a `SearchResponse` local variable that would need to be
released manually to avoid leaks in the tests.
This does away with about 10% of these spots by adding an override for `assertHitCount`
that handles the actual execution of the search request and its release automatically
and making use of it in all spots where the `.get()` on the request builder could be inlined
semi-automatically and in a straight-forward fashion without other code changes.
2023-10-17 15:43:36 +02:00
Armin Braun
b7eafce32c
Make some practically static methods static (#97565)
Another round of automated fixes to this, marking things that can be
made static as static. Saves some JIT cycles but also turns some lambdas
from capturing to non-capturing and makes the "utilityness" of some
classes visible.
2023-10-06 23:37:07 +02:00
Mark Tozzi
6660503592
Aggs error codes part 1 (#99963)
As part of our effort to increase the supportability of Elasticsearch,
this PR changes many aggregations errors from being 500 class (which is
the default for `AggregationExecutionException`) to 400 class (which is
the default for `IllegalArgumentException`).  All of these cases are
errors which should not be retried, as the failures are directly related
to the content of the request and/or state of the index.

There are definitely more cases where we are returning an incorrect
error code, but for this PR I focused on just changing the low hanging
fruit.
2023-10-04 16:12:34 -04:00