Features added before 8.18 can be removed, starting with 9.0. But first they need to be marked as assumed, so existing code knows they could be removed in later builds.
This updates the gradle wrapper to 8.12
We addressed deprecation warnings due to the update that includes:
- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions
Static fields dont do well in Gradle with configuration cache enabled.
- Use buildParams extension in build scripts
- Keep BuildParams.ci for now for easy serverless migration
- Tweak testing doc
The most relevant ES changes that upgrading to Lucene 10 requires are:
- use the appropriate IOContext
- Scorer / ScorerSupplier breaking changes
- Regex automaton are no longer determinized by default
- minimize moved to test classes
- introduce Elasticsearch900Codec
- adjust slicing code according to the added support for intra-segment concurrency
- disable intra-segment concurrency in tests
- adjust accessor methods for many Lucene classes that became a record
- adapt to breaking changes in the analysis area
Co-authored-by: Christoph Büscher <christophbuescher@posteo.de>
Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>
Co-authored-by: ChrisHegarty <chegar999@gmail.com>
Co-authored-by: Brian Seeders <brian.seeders@elastic.co>
Co-authored-by: Armin Braun <me@obrown.io>
Co-authored-by: Panagiotis Bailis <pmpailis@gmail.com>
Co-authored-by: Benjamin Trent <4357155+benwtrent@users.noreply.github.com>
* Refactor build params for FieldMapper
* more mappers and tests
* more mappers
* more mappers
* spotless
* spotless
* stored by default
* Revert "stored by default"
This reverts commit bbd247d64b.
* restore storeIgnored
* sync
* list valid values for SourceKeepMode
* small refactoring
* spotless
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while the MappedFieldType name does.
We have renamed Mapper.Builder#name to leafName (#109971) and Mapper#simpleName to leafName (#110030). This commit renames Mapper#name to fullPath for clarity
This required some adjustments in FieldAliasMapper to avoid confusion between the existing path method and fullPath. I renamed path to targetPath for clarity.
ObjectMapper already had a fullPath method that returned name, and was effectively a copy of name, so it could be removed.
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while
the MappedFieldType name does. We have method called simpleName to signal that, but leafName signals that more clearly and aligns with
the name we have recently introduced in Mapper.Builder (renamed from name to leafName).
Relates to #109971
This addresses a long standing TODO that caused quite a few bugs over time, in that the mapper name does not include its full path, while
the MappedFieldType name does.
This PR uses infrastructure from #107567 to implement a fallback implementation of synthetic source for field mappers that don't support it natively. In that case we will store source of such field as is in a separate stored field.
This PR adds synthetic source support for annotated_text fields. Existing implementation for text is reused including test infrastructure so the majority of the change is moving and making things accessible.
Contributes to #106460, #78744.
To simplify the migration away from version based skip checks in YAML specs,
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.
New test specs for 8.14 or later are expected to use respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if sufficient.
Data-stream mappings require a @timestamp field to be present and configured
as a date with a specific set of parameters. The index-wide setting of
ignore_malformed can cause problems here if it is set to true, because it needs
to be false for the @timestamp field.
This commit detects if a set of mappings is configured for a datastream by checking
for the presence of a DataStreamTimestampFieldMapper metadata field, and passes
that information on during Mapper construction as part of the MapperBuilderContext.
DateFieldMapper.Builder now checks to see if it is specifically for a data stream timestamp
field, and if it is, sets ignore_malformed to false.
Relates to #96051
We mostly have a handful of `FieldType` values here across all mappers and none of them contain
attributes. There's only so many combinations here, lets deduplicate these to save some heap and set up
subsequent mapper heap savings.
The copyTo builder is really hard to reason about when it comes to
mapper merging, because the `reset` method would actually mutate an
existing mapper. That seems dangerous and the whole thing is quite
inefficient as well. -> this PR just removes it and uses a copy
constructor for copy on write, avoiding instance creation on mapper
merges here and there and leaving no doubt about these things being
immutable.
This PR adapts the unified highlighter to use the Weight#matches mode by default when possible.
This is the default mode in Lucene for some time now. For cases where the matches mode won't work (nested and parent-child queries),
the matches mode is disabled automatically.
I didn't expose an option to explicitly disable this mode because that should be seen as an internal implementation detail.
With this change, matches that span multiple terms are highlighted together (something that users asked for years) and the clauses that don't match the document are ignored.
Document parsing methods currently throw MapperParsingException. This
isn't very helpful, as it doesn't contain any information about where the parse
error happened - it is designed for parsing mappings, which are realised into
java maps before being examined. This commit introduces a new exception
specifically for document parsing that extends XContentException, so that
it reports the current position of the parser as part of its error message.
Fixes#85083
IndexAnalyzers is currently always a concrete class wrapping several
Maps of NamedAnalyzers. This means that whenever it is used it needs
to instantiate all of its component analyzers, making testing much heavier
than it needs to be. It also means that things like overriding analysis for
legacy indexes is pushed into mapper parameters, rather than being
handled in a single place.
This commit makes IndexAnalyzers into an interface, with an anonymous
concrete implementation that handles reloading and closing for index
shards.
This was only needed because the percolator uses a MemoryIndex which did
not support stored fields, and so when it ran a highlighting phase it needed to
force it to read from source. MemoryIndex added stored fields support in
lucene 9.5, so we can remove this internal parameter.
The parameter remains available, but deprecated, via the rest layer, and no
longer has any effect.
The annotation highlighter can miss annotations if they overlap with another search
term. This commit re-sorts incoming passages to ensure that all terms are seen
by the highlighter.
Fixes#91944
This commit adds a new test framework for configuring and orchestrating
test clusters for both Java and YAML REST testing. This will eventually
replace the existing "test-clusters" Gradle plugin and the build-time
cluster orchestration.
This adds support for the `fields` API from painless in to synthetic
_source for `text` fields that have a keyword sub field. This is kind of
esoteric sounding, but it's the default mapping for strings you send in
json so synthetic `_source` supports it. So the `fields` API should too.
This adds support for the `field` scripting API in many but not all
cases. Before this change numbers, dates, and IPs supported the `field`
API when running with _source in synthetic mode because they always have
doc values. This change adds support for `match_only_text`, `store`d
`keyword` fields, and `store`d `text` fields. Two remaining field
configurations work with synthetic _source and do not work with `field`:
* A `text` field with a sub-`keyword` field that has `doc_values` * A
`text` field with a sub-`keyword` field that is `store`d

We currently work out whether or not a mapper should be storing additional
values for synthetic source by looking at the DocumentParserContext. However,
this value does not change for the lifetime of the mapper - it is defined by
metadata on the root mapper and is immutable - and DocumentParserContext
feels like the wrong place for this information as it holds context specific
to the document being parsed.
This commit moves synthetic source status information from DocumentParserContext
to MapperBuilderContext instead. Mappers which need this information retrieve
it at build time and hold it on final fields.
This adds support for `ignore_malformed` to numeric fields other than
`scaled_float` in synthetic `_source`. Their values are saved to a
stored field and loaded to render the `_source`.
This makes sure that all field types that support `ignore_malfored` do
so in the same way.
Production changes:
* All mapper has an `ignoreMalformed` method that must return `true` if
the field accepts the `ignore_malformed` mapping parameter was
configured. It defaults to `false` because many fields either don't
have a concept of "malformed" value or don't have the ability to
ignore malformed values.
* Fix the `scaled_float` field to store it's field name in `_ignored` if
it ignores any malfored values. This is how all other field mappers
work.
Test changes:
* `MapperTestCase` forces subclasses to declare if their
`supportIgnoreMalformed` or not.
* If `MapperTestCase` subclasses `supportIgnoreMalfored` they must
define some `exampleMalformedValues`.
* `MapperTestCase` always grows three new tests:
* One that creates a field without setting `ignore_malformed` and
verifies that all `exampleMalformedValues` throw expected errors
* On that explicitly configured `ignore_malformed` to false and, if
`supportIgnoreMalformed` it verifies the errors again. If not
`supportIgnoreMalformed` it verifies that the parameter is unknown.
* On that explicitly configured `ignore_malformed` to true and, if
`supportIgnoreMalformed` it verifies that parsing doesn't produce
errors and correctly produces `_ignored`. If not
`supportIgnoreMalformed` it verifies that the parameter is unknown.
* Moved some subclasesses of `MapperTestCase` from
`internalClusterTests` to `tests`. This isn't strictly required but
that's the right place for them.
This PR reworks the testing conventions precommit plugin. This plugin now:
- is compatible with yaml, java rest tests and internalClusterTest (aka different sourceSets per test type)
- enforces test base class and simple naming conventions (as it did before)
- adds one check task per test sourceSet
- uses the worker api to improve task execution parallelism and encapsulation
- is gradle configuration cache compatible
This also ports the TestingConventions integration testing to Spock and removes the build-tools-internal/test kit folder that is not required anymore. We also add some common logic for testing java related gradle plugins.
We will apply further cleanup on other tests within our test suite in a dedicated follow up cleanup