Commit graph

493 commits

Author SHA1 Message Date
Rene Groeschke
6b7cd0339e
Update Gradle wrapper to 8.12 (#118683) (#119363)
This updates the gradle wrapper to 8.12

We addressed deprecation warnings due to the update that includes:

- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions

(cherry picked from commit ba61f8c7f7)
2024-12-31 08:36:31 +01:00
Carlos Delgado
d40813d78e
[8.x] kNN vector rescoring for quantized vectors (#116663) (#118418)
* kNN vector rescoring for quantized vectors (#116663)

(cherry picked from commit 59967727cf)

# Conflicts:
#	server/src/main/java/org/elasticsearch/search/vectors/KnnSearchBuilder.java
#	x-pack/plugin/rank-rrf/src/main/java/org/elasticsearch/xpack/rank/rrf/RRFRankBuilder.java

* FloatVectorValues have a different interface in this Lucene version
2024-12-11 21:17:07 +11:00
Oleksandr Kolomiiets
0b99670aeb
Migrate mapper-related modules to internal-*-rest-test (#117298) (#117406)
(cherry picked from commit 2b8e4e727c)

# Conflicts:
#	modules/mapper-extras/build.gradle
#	plugins/mapper-annotated-text/build.gradle
#	plugins/mapper-murmur3/build.gradle
#	x-pack/plugin/mapper-unsigned-long/build.gradle
#	x-pack/plugin/mapper-version/build.gradle
#	x-pack/plugin/wildcard/build.gradle
2024-11-25 08:01:29 -08:00
Kostas Krikellas
2439869034
[8.x] [TEST] Replace _source.mode with index.mapping.source.mode in integration tests - take 2 (#116072) (#116161)
* [TEST] Replace _source.mode with index.mapping.source.mode in integration tests - take 2 (#116072)

* Reapply "[TEST] Replace _source.mode with index.mapping.source.mode in integra…" (#116069)

This reverts commit e8bf344a28.

* [TEST] Replace _source.mode with index.mapping.source.mode in integration tests

* add reason

* add reason

* spotless

* revert unneeded

(cherry picked from commit 4573ab8ec1)

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/mapper/MapperFeatures.java

* Update MapperFeatures.java
2024-11-04 19:45:47 +11:00
Mark Vieira
0279c0a909
Add AGPLv3 as a supported license
2024-09-13 14:30:33 -07:00
Kostas Krikellas
f3bc281978
Refactor build params for FieldMapper, adding SourceKeepMode (#112455)
* Refactor build params for FieldMapper

* more mappers and tests

* more mappers

* more mappers

* spotless

* spotless

* stored by default

* Revert "stored by default"

This reverts commit bbd247d64b.

* restore storeIgnored

* sync

* list valid values for SourceKeepMode

* small refactoring

* spotless
2024-09-06 14:16:17 +03:00
Nhat Nguyen
1964be565c
Allow querying index_mode (#110676)
This change allows querying the `index.mode` setting via a new 
`_index_mode` metadata field, enabling APIs such as `field_caps` or
`resolve_indices` to target indices that are either time_series or logs
only. This approach avoids adding and handling a new parameter for
`index_mode` in these APIs. Both ES|QL and the `_search` API should also
work with this new field.
2024-07-10 16:45:11 -07:00
Mayya Sharipova
405e39660b
Support k parameter for knn query (#110233)
Introduce an optional k param for knn query

If k is not set, knn query has the previous behaviour:
- `num_candidates` docs are collected from each shard. These `num_candidates` docs
are used for combining results with other queries and aggregations on each shard.
- docs from all shards are merged to produce the top global `size` results

If k is set, the behaviour is instead as follows:
- `k` docs are collected from each shard. These `k` docs are used for
combining results with other queries and aggregations on each shard.
- similarly, docs from all shards are merged to produce the top global `size`
results.

Having a `k` param makes it more intuitive for users to address their needs.
They also no longer need to set the `num_candidates` param for this query,
as it is more of an internal detail for tuning how knn search operates.

Closes #108473
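
The two-step flow described above (per-shard collection, then a global merge) can be sketched roughly as follows. This is a simplified illustration, not Elasticsearch code: `ScoredDoc`, `KnnMergeSketch`, and both method names are hypothetical, and real shard-level collection involves graph search rather than a plain sort.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of a scored hit; not an Elasticsearch type.
record ScoredDoc(String id, double score) {}

final class KnnMergeSketch {
    // Per-shard step: keep the best `perShard` hits, where perShard is k when k is set
    // and num_candidates otherwise, as described in the message above.
    static List<ScoredDoc> collectFromShard(List<ScoredDoc> shardHits, Integer k, int numCandidates) {
        int perShard = (k != null) ? k : numCandidates;
        return shardHits.stream()
            .sorted(Comparator.comparingDouble(ScoredDoc::score).reversed())
            .limit(perShard)
            .toList();
    }

    // Coordinator step: merge the per-shard results and keep the global top `size`.
    static List<ScoredDoc> mergeShards(List<List<ScoredDoc>> perShardResults, int size) {
        List<ScoredDoc> all = new ArrayList<>();
        perShardResults.forEach(all::addAll);
        all.sort(Comparator.comparingDouble(ScoredDoc::score).reversed());
        return List.copyOf(all.subList(0, Math.min(size, all.size())));
    }
}
```

In this sketch, passing `null` for `k` reproduces the previous behaviour, where `num_candidates` bounds what each shard contributes.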
2024-06-28 09:59:28 -04:00
Luca Cavanna
915e4a50c5
Rename Mapper#name to Mapper#fullPath (#110040)
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while the MappedFieldType name does.

We have renamed Mapper.Builder#name to leafName (#109971) and Mapper#simpleName to leafName (#110030). This commit renames Mapper#name to fullPath for clarity.
This required some adjustments in FieldAliasMapper to avoid confusion between the existing path method and fullPath. I renamed path to targetPath for clarity.
ObjectMapper already had a fullPath method that returned name, and was effectively a copy of name, so it could be removed.
2024-06-21 22:47:27 +02:00
Luca Cavanna
54e7b4d93b
Rename Mapper#simpleName to Mapper#leafName (#110030)
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while
the MappedFieldType name does. We have a method called simpleName to signal that, but leafName signals that more clearly and aligns with
the name we have recently introduced in Mapper.Builder (renamed from name to leafName).

Relates to #109971
2024-06-21 14:28:36 +02:00
Luca Cavanna
15c7abe111
Rename Mapper#name to Mapper#leafName (#109971)
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while
the MappedFieldType name does.
2024-06-21 11:48:17 +02:00
Oleksandr Kolomiiets
1080425a65
Enable fallback synthetic source by default (#109370)
2024-06-07 09:21:22 -07:00
Panagiotis Bailis
1c3b3d8f11
Adding support for explain in rrf (#108682)
2024-06-07 11:09:06 +03:00
Oleksandr Kolomiiets
75b5efede4
Binary field enables doc values by default for index mode with synthetic source (#107739)
Binary field enables doc values by default for index mode with synthetic source
2024-04-23 08:24:47 -07:00
Mayya Sharipova
965ebab631
Percolator named queries: rewrite for matched info (#107432)
PR #103084 introduced the ability to return matched_queries during the percolate
process for all percolator queries containing a `_name` field.

But there was a bug with complex queries, as they were not rewritten before
obtaining their Weight function. This fixes the bug by ensuring all
queries are first rewritten.

Closes #107176
2024-04-12 13:44:50 -04:00
Moritz Mack
1f5e04b721
Migrate YAML REST tests to synthetic cluster feature check (#107068)
To simplify the migration away from version based skip checks in YAML specs, 
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.

New test specs for 8.14 or later are expected to use respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if sufficient.
2024-04-11 18:22:38 +02:00
Felix Barnsteiner
ab52ef1f06
Fix merging component templates with a mix of dotted and nested object mapper definitions (#106077)
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
2024-04-08 17:55:41 +02:00
Felix Barnsteiner
dee0be589c
Flatten object mappings when subobjects is false (#103542)
2024-02-22 11:43:12 +01:00
Felix Barnsteiner
5920c917aa
Encapsulate Mapper.Builder#name and make it private (#105648)
This is in preparation to make the field mutable,
which is needed in the context of https://github.com/elastic/elasticsearch/pull/103542
2024-02-20 15:53:14 +01:00
Armin Braun
73a68409c2
Ref count search response bytes (#103763)
Final step in #102030 ... actually makes `SearchHit` read a releasable bytes reference.
Does still fall back to copying to unrolled buffers here and there, which can be removed in follow-ups where it's worth the effort (aggs being the most important one probably).

Hard to create very reliable benchmarks for this because all our macro-benchmarks are quite noisy. Running http logs and PMC though, there's a statistically significant reduction in GC and reduced tail latencies in most benchmarks.

The overhead for ref-counting these bytes isn't visible in profiling as far as I can tell and for large source values, no corresponding large `byte[]` are created any longer outside of the few remaining spots where we copy to pooled buffers.

closes #102657
closes #102030
2024-01-17 16:16:39 +01:00
Armin Braun
80a95087db
Fix more search response leaks (#103956)
Some more mechanical fixing of leaked SearchResponse instances.
2024-01-05 10:40:59 +01:00
Mayya Sharipova
b014843078
Return matched_queries in Percolator (#103084)
Return matched_queries for named queries in Percolator.

In a response, each hit together with
a `_percolator_document_slot` field will contain
`_percolator_document_slot_<slotNumber>_matched_queries` fields that will show
which sub-queries matched each percolated document.

Closes #10163
2023-12-11 09:07:26 -05:00
Ignacio Vera
32a8c683f9
Use ElasticsearchAssertions#asserResponse for MultiSearchResponse (#102694)
2023-11-28 13:52:26 +01:00
Armin Braun
2bd0a709fe
Fix ref counts for MultiSearchResponse not released in tests (#102604)
Part of the effort to fix search response leaks is to fix these. Fixed
all that I could easily find in tests. Production changes incoming once
the dependencies for those are fixed.

part of #102030 but no fancy utility here like for search responses
since we don't have so many use cases and none of them are tricky.
2023-11-24 12:51:40 -05:00
Armin Braun
cdc83ad29b
Add shorthand for prepareIndex to test infrastructure (#101187)
Same as #101175, shorten `client().prepareIndex(index)` and
`client().prepareIndex().setIndex(index)` via a test utility.
Saves lots of code now and sets up some follow-up simplifications.
2023-11-23 15:47:36 +01:00
Armin Braun
a9c286b25c
Collapse verbose .execute().actionGet() calls in tests (#102502)
Cleaning this up a little even though it's still quite horrible.
`.get()` in this API actually means `actionGet()` so to speak.
I think a good first step to cleaning this up is to at least reduce
the duplication though and save 1k lines.
2023-11-23 10:10:10 +01:00
Armin Braun
e03b0a5329
Add Leak Tracking to the SearchContext implementations (#102274)
Another step towards ref counting search hits. This adds leak tracking to the search context. Required 2 fixes in the production code to not fail tests: sub aggregations need to be closed eventually, found it easiest to just tie this to the parent context. If we throw in the constructor of the context (we have tests for this case), we should release/close it still (it's just impossible to fix the leak tracking otherwise, also it seems to me that this is more correct anyway since we initialise resources in that constructor).
Other than that, just trivial test changes to make sure the contexts get closed everywhere.
2023-11-16 15:34:12 +01:00
Panagiotis Bailis
8f108ec9e9
Removing explicit SearchResponse usages in tests - v3 (#102019)
Tests covered in this PR:

* `org.elasticsearch.percolator.PercolatorQuerySearchIT`
2023-11-13 07:33:30 -05:00
Mayya Sharipova
61c7483fc9
Make knn search a query (#98916)
This introduced a new knn query:
- knn query is executed during the Query phase similar to all other queries.
- No k parameter, k defaults to size
- num_candidates is the size of the queue of candidates to consider while
  searching the graph on each shard
- For aggregations: "size" results are collected with total = size * shards.
   Aggregations will see size * shards results.
- All filters from DSL are applied as post-filters, except: 1) an alias filter
 is applied as a pre-filter or 2) a filter provided as a parameter
 inside knn query.
2023-11-01 14:21:40 -04:00
Luca Cavanna
b07feb507d
Percolator to support parsing script score query with params (#101051)
While dot expansion is disabled when parsing percolator queries at index
time, as that would interfere with query parsing, we still use a wrapper parser
that is conservative about what methods it supports, assuming that
document parsing needs nextToken and not much more. Turns out that when
parsing queries instead, we need to support all the XContentParser
methods including map, list etc.

This commit adds a test for script score query parsing through document
parsing via percolator field mapper, and removes the limitations in the
wrapper parser when dots expansion is disabled.
2023-10-24 11:03:28 +02:00
David Turner
9794c6e205
Use ESIntegTestCase#prepareSearch more (#101179)
The refactoring in #101175 only covered all the one-arg call sites. This
PR does the rest.
2023-10-20 18:33:00 +01:00
David Turner
1eda6ac74b
Extract ESIntegTestCase#prepareSearch (#101175)
Relates #101172
2023-10-20 06:18:58 -04:00
Ryan Ernst
8a1db8c6c3
Move index version constants to IndexVersions (#101094)
Similar to the TransportVersions holder class, IndexVersions is the new
place to contain all constants for IndexVersion. This commit moves all
existing constants to the new class. It is purely mechanical.
2023-10-19 20:44:51 -04:00
Armin Braun
03ea4bbe6e
Remove more explicit references to SearchResponse in tests (#101052)
Follow up to #100966 introducing new combined assertion `assertSearchHitsWithoutFailures`
to combine no-failure, count, and id assertions into one block.
2023-10-18 20:27:52 +02:00
Armin Braun
bae6991fb3
Remove ~600 references to SearchResponse in tests (#100966)
We'd like to make `SearchResponse` reference counted and pooled but there are around 6k
instances of tests that create a `SearchResponse` local variable that would need to be
released manually to avoid leaks in the tests.
This does away with about 10% of these spots by adding an override for `assertHitCount`
that handles the actual execution of the search request and its release automatically
and making use of it in all spots where the `.get()` on the request builder could be inlined
semi-automatically and in a straight-forward fashion without other code changes.
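
The idea of an assertion helper that runs the request and releases the response itself can be sketched like this. The names `FakeSearchResponse` and `HitCountAssertions` are hypothetical stand-ins invented for this sketch, and the real test utility may differ in signature and behaviour.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for a ref-counted search response; not the real SearchResponse.
final class FakeSearchResponse {
    private final long totalHits;
    private int refCount = 1;

    FakeSearchResponse(long totalHits) { this.totalHits = totalHits; }

    long totalHits() { return totalHits; }
    void decRef() { refCount--; }
    boolean isReleased() { return refCount == 0; }
}

final class HitCountAssertions {
    // Executes the request, checks the hit count, and always releases the response,
    // so the calling test never holds a response variable that could leak.
    static void assertHitCount(Supplier<FakeSearchResponse> request, long expected) {
        FakeSearchResponse response = request.get();
        try {
            if (response.totalHits() != expected) {
                throw new AssertionError("expected " + expected + " hits but got " + response.totalHits());
            }
        } finally {
            response.decRef();
        }
    }
}
```

Because the release happens in a `finally` block, the response is released whether the assertion passes or fails.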
2023-10-17 15:43:36 +02:00
Armin Braun
b7eafce32c
Make some practically static methods static (#97565)
Another round of automated fixes to this, marking things that can be
made static as static. Saves some JIT cycles but also turns some lambdas
from capturing to non-capturing and makes the "utilityness" of some
classes visible.
2023-10-06 23:37:07 +02:00
Alan Woodward
4e1fb3fca5
Automatically disable ignore_malformed on datastream @timestamp fields (#99346)
Data-stream mappings require a @timestamp field to be present and configured
as a date with a specific set of parameters. The index-wide setting of
ignore_malformed can cause problems here if it is set to true, because it needs
to be false for the @timestamp field.

This commit detects if a set of mappings is configured for a datastream by checking
for the presence of a DataStreamTimestampFieldMapper metadata field, and passes
that information on during Mapper construction as part of the MapperBuilderContext.
DateFieldMapper.Builder now checks to see if it is specifically for a data stream timestamp
field, and if it is, sets ignore_malformed to false.
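
A rough sketch of that check, under simplifying assumptions: `SketchMapperBuilderContext` and `SketchDateFieldBuilder` are hypothetical reductions of the classes named above, and the field-name comparison stands in for whatever the real builder uses to recognise the timestamp field.

```java
// Hypothetical, simplified version of the context that carries the data stream flag.
record SketchMapperBuilderContext(boolean isDataStream) {}

final class SketchDateFieldBuilder {
    private final String name;
    private final boolean ignoreMalformed; // value inherited from the index-wide setting

    SketchDateFieldBuilder(String name, boolean ignoreMalformed) {
        this.name = name;
        this.ignoreMalformed = ignoreMalformed;
    }

    // Returns the ignore_malformed value the built mapper would use:
    // forced to false for the @timestamp field of a data stream, unchanged otherwise.
    boolean effectiveIgnoreMalformed(SketchMapperBuilderContext context) {
        boolean isDataStreamTimestamp = context.isDataStream() && "@timestamp".equals(name);
        return isDataStreamTimestamp ? false : ignoreMalformed;
    }
}
```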

Relates to #96051
2023-09-13 15:02:22 +01:00
Armin Braun
f1a376c317
Remove CopyTo.Builder (#99368)
The copyTo builder is really hard to reason about when it comes to
mapper merging, because the `reset` method would actually mutate an
existing mapper. That seems dangerous and the whole thing is quite
inefficient as well. -> this PR just removes it and uses a copy
constructor for copy on write, avoiding instance creation on mapper
merges here and there and leaving no doubt about these things being
immutable.
2023-09-08 13:24:31 -04:00
Ryan Ernst
19257125b1
Move transport version constants to TransportVersions (#97990)
Constants for TransportVersion currently live alongside the class
definition. This has been fine since there was only one set of
constants. However, to support serverless, some constants will need to
be defined elsewhere.

This commit moves the existing constants to a new holder class,
TransportVersions. It is almost entirely mechanical, using IntelliJ move
members. The only non mechanical part was slightly shifting how CURRENT
is found, defining a LATEST in TransportVersions that is automatically
calculated (since we already have it, no need to manually define it).
2023-09-06 15:14:41 -04:00
David Turner
1e9c7f1d95
Align collection de/serialization API naming (#99150)
The `StreamOutput` and `StreamInput` APIs are designed so that code
which serializes objects to the transport protocol aligns closely with
the corresponding deserialization code. However today
`StreamOutput#writeCollection` pairs up with a variety of methods on
`StreamInput`, including `readList`, `readSet`, and so on. These methods
are not obviously compatible with `writeCollection` unless you look at
the implementation, and that makes verifying transport protocol code
harder than it needs to be.

This commit renames these methods to `readCollectionAsList`,
`readCollectionAsSet`, and so on, to clarify that they are compatible
with `writeCollection`.
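
To illustrate why the aligned names help, here is a minimal sketch in which one write method pairs with several read methods that share its wire layout. `SketchStreamOutput` and `SketchStreamInput` are hypothetical, string-only stand-ins built on `java.io`; they are not the real `StreamOutput` and `StreamInput`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, string-only stand-in for the writing side.
final class SketchStreamOutput {
    private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    private final DataOutputStream out = new DataOutputStream(bytes);

    // Writes the size followed by each element; both read methods below expect this layout.
    void writeCollection(Collection<String> values) throws IOException {
        out.writeInt(values.size());
        for (String value : values) {
            out.writeUTF(value);
        }
    }

    byte[] toBytes() {
        return bytes.toByteArray();
    }
}

// Hypothetical, string-only stand-in for the reading side.
final class SketchStreamInput {
    private final DataInputStream in;

    SketchStreamInput(byte[] bytes) {
        this.in = new DataInputStream(new ByteArrayInputStream(bytes));
    }

    // The name states which write method this pairs with and which collection type comes back.
    List<String> readCollectionAsList() throws IOException {
        int size = in.readInt();
        List<String> result = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            result.add(in.readUTF());
        }
        return result;
    }

    Set<String> readCollectionAsSet() throws IOException {
        return new HashSet<>(readCollectionAsList());
    }
}
```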

Relates
https://github.com/elastic/elasticsearch/pull/98971#issuecomment-1697289815
2023-09-04 06:46:54 -04:00
Benjamin Trent
d09cb767a9
Fix percolator query for stored queries that expand on wildcard field names (#98878)
An optimization introduced in:
https://github.com/elastic/elasticsearch/pull/81985 changed percolator
query behavior.

Users can specify a percolator query which expands fields based on a
wildcard pattern. Just one example is `simple_query_string`, which
allows field names like `"text_*"`. The user expects that this field
name will expand to relevant mapped fields (e.g. "text_foo"). However,
if there are no documents indexed in those fields at the time when the
percolator query is indexed, it doesn't expand to the relevant fields.

Additionally at query time, we may skip expanding fields and not match
the relevant mapped fields if they are considered "empty" (e.g. has no
values in the shard). We should instead allow expansion by indicating
that the field may exist in the shard.

closes: https://github.com/elastic/elasticsearch/issues/98819
2023-08-28 09:19:28 -04:00
Matteo Piergiovanni
e719057209
Explicit parsing object capabilities of FieldMappers (#98684)
When the subobjects property is set to false and we encounter an object
while parsing, we need a way to understand if its FieldMapper is able to
parse an object. If that's the case, we can provide the entire object to
the FieldMapper; otherwise its name becomes part of the dotted field
name of each internal value.

This has been achieved by adding the `supportsParsingObject()` method
to the `FieldMapper` class. This method defaults to `false` since the
majority of FieldMappers do not support parsing objects and is
overridden to return `true` by the ones that do support objects.
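
The decision described above can be sketched as follows. Only the method name `supportsParsingObject()` comes from this commit; the mapper classes and `SubobjectsFalseParser` are hypothetical and far simpler than the real document parser.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified mapper base class.
abstract class SketchFieldMapper {
    // Most mappers cannot consume a whole object, so the default is false.
    boolean supportsParsingObject() {
        return false;
    }
}

// Hypothetical mapper that only handles leaf values.
final class SketchLeafOnlyMapper extends SketchFieldMapper {}

// Hypothetical mapper that can consume a whole object value.
final class SketchObjectAwareMapper extends SketchFieldMapper {
    @Override
    boolean supportsParsingObject() {
        return true;
    }
}

final class SubobjectsFalseParser {
    // With subobjects set to false: hand the whole object to the mapper if it supports that,
    // otherwise flatten it so the object's name prefixes each inner value's dotted field name.
    static Map<String, Object> parse(String name, Map<String, Object> object, SketchFieldMapper mapper) {
        Map<String, Object> fields = new LinkedHashMap<>();
        if (mapper != null && mapper.supportsParsingObject()) {
            fields.put(name, object);
        } else {
            object.forEach((key, value) -> fields.put(name + "." + key, value));
        }
        return fields;
    }
}
```

For an object `location` with inner keys `lat` and `lon`, the leaf-only mapper in this sketch yields the fields `location.lat` and `location.lon`, while the object-aware mapper receives `location` as a single value.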
2023-08-22 10:16:59 +02:00
Christoph Büscher
207a995fce
Use newSearcher instead of new IndexSearcher in tests where possible (#98110)
This change swaps test code that directly creates IndexSearcher instances with LuceneTestCase#newSearcher calls
that have the advantage of randomly using concurrency and also randomly use assertion wrappers internally.
While this doesn't guarantee testing the concurrent code path, it should generally increase the likelihood of doing so.
2023-08-22 10:49:21 +07:00
Armin Braun
63e64ae61b
Cleanup Stream usage in various spots (#97306)
Lots of spots where we did weird things around streams, like redundant stream creation, redundant collecting
before adding all the collected elements to another collection, redundant streams for joining strings,
and using the less efficient `Collectors.toList`; in a few cases we also incorrectly relied on the result being mutable.
2023-07-03 14:24:57 +02:00
Simon Cooper
a873e26cf7
Convert IndexVersion.CURRENT to a method with a pluggable interface (#97132)
2023-06-27 14:47:32 +01:00
Armin Braun
dd7d381922
Dry up getting cluster admin client in tests (#96952)
Drying this up further and adding the same short-cut for single node
tests. Dealing with most of the spots that I could grab via automatic
refactorings.
2023-06-22 14:27:23 +02:00
Armin Braun
3f8ee82ef8
Use indices admin client shortcut in most integration tests (#96946)
Replacing the remaining usages that I could automatically replace
and a couple that I did by hand in this PR.
Also, added the same shortcut to the single node tests to save some
duplication there.
2023-06-20 13:32:59 +02:00
Simon Cooper
71c12262fb
Migrate index created version to IndexVersion (#96066)
2023-06-14 09:43:31 +01:00
Ryan Ernst
164e97e2ca
Encapsulate TransportVersion.CURRENT (#96681)
This commit changes access to the latest TransportVersion constant to
use a static method instead of a public static field. By encapsulating
the field we will be able to (in a followup) lazily determine what the
latest is, outside of clinit.
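
One way a static accessor can defer the lookup is the initialization-on-demand holder idiom, sketched below. This is an assumption about how the follow-up might work, not what the commit implements; `SketchTransportVersion`, `CurrentHolder`, and the id value are all made up for illustration.

```java
// Hypothetical sketch; the real TransportVersion class is not shown here.
final class SketchTransportVersion {
    private final int id;

    private SketchTransportVersion(int id) {
        this.id = id;
    }

    int id() {
        return id;
    }

    // The nested class is initialized on the first call to current(),
    // not during SketchTransportVersion's own static initialization (clinit).
    private static final class CurrentHolder {
        static final SketchTransportVersion CURRENT = new SketchTransportVersion(findLatestId());
    }

    // Callers use this method instead of reading a public static field.
    static SketchTransportVersion current() {
        return CurrentHolder.CURRENT;
    }

    private static int findLatestId() {
        return 8_500_000; // placeholder id chosen for illustration only
    }
}
```

Hiding the field behind a method means the way the latest version is found can change later without touching call sites.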
2023-06-13 18:44:15 -04:00
Armin Braun
414eda7b80
Cheaper ActionListener.wrap when error handler is the listener (#96575)
Motivated by looking into allocations of listeners in detail for shared cache benchmarking.
Wrapping a listener and using `listener::onFailure` as the failure callback means that we
have a reference to the listener from both the failure and the response handler.
If we use the approach used by the `.delegate*` methods, we can often save allocating
a response handler lambda or at least make the response handler cheaper.
We also save allocating the failure handler lambda.
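
The difference between the two styles can be sketched with a minimal listener interface. `SketchListener` and its methods are hypothetical and only loosely modelled on the idea described above; the real `ActionListener` API and its allocation behaviour are not reproduced here.

```java
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical minimal listener interface for illustration.
interface SketchListener<T> {
    void onResponse(T response);

    void onFailure(Exception e);

    // General-purpose wrap: the caller supplies both handlers, typically as two lambdas
    // (for example a response lambda plus listener::onFailure).
    static <T> SketchListener<T> wrap(Consumer<T> responseHandler, Consumer<Exception> failureHandler) {
        return new SketchListener<T>() {
            @Override
            public void onResponse(T response) {
                responseHandler.accept(response);
            }

            @Override
            public void onFailure(Exception e) {
                failureHandler.accept(e);
            }
        };
    }

    // Delegating variant: failures go straight to this listener, so the caller supplies
    // no failure lambda, and the response handler receives the delegate as an argument
    // instead of capturing it.
    default <R> SketchListener<R> delegateFailure(BiConsumer<SketchListener<T>, R> responseHandler) {
        SketchListener<T> delegate = this;
        return new SketchListener<R>() {
            @Override
            public void onResponse(R response) {
                responseHandler.accept(delegate, response);
            }

            @Override
            public void onFailure(Exception e) {
                delegate.onFailure(e);
            }
        };
    }
}
```

With the delegating variant, a handler such as `(l, n) -> l.onResponse("n=" + n)` captures nothing from its surroundings, which is the kind of cheaper response handler the message refers to.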
2023-06-06 11:42:39 +02:00