This change adds access to mapped match_only_text fields via the Painless scripting fields API. The
values returned from a match_only_text field via the scripting fields API always use source, as described
in #81246. match_only_text fields are not available via doc values, so there are no bwc issues.
This change adds source fallback support for date and date_nanos by using the existing
SourceValueFetcherSortedNumericIndexFieldData to emulate doc values.
Closes #89414.
Remove the workaround from #89135 that addressed #89107,
and instead upgrade the OpenTelemetry API, which contains a fix for the
underlying issue.
This change adds access to mapped text fields via the Painless scripting fields API. The values returned
from a text field via the scripting fields API always use source, as described in #81246. Access via the
old-style `doc` variable will still depend on field data, so nothing changes there and bwc issues are avoided.
Currently, source fallback numeric types do not match doc values numeric types: source fallback
de-duplicates numeric values in multi-valued fields. This change removes the de-duplication for source
fallback values for numeric types using value fetchers. It also adds test cases for all the supported
source fallback types to ensure they continue to match their doc values counterparts exactly.
Adds support for loading `text` and `keyword` fields that have
`store: true`. We could likely load *any* stored fields, but I
wanted to blaze the trail using something fairly useful.
This commit adds support for a floating point node.processors setting.
This is useful when the nodes run in an environment where the CPU
time assigned to the ES node process is limited (e.g. using cgroups).
With this change, the system is able to size the thread pools
accordingly; in this case it rounds the provided setting up
to the nearest integer.
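A rough sketch of the rounding behaviour (hypothetical code, not the actual sizing logic; `allocatedProcessors` is an illustrative name):
```java
public final class ProcessorsExample {

    // Hypothetical helper: derive the processor count used for thread pool sizing
    // from a fractional node.processors value by rounding up.
    static int allocatedProcessors(double nodeProcessors) {
        return (int) Math.ceil(nodeProcessors);
    }

    public static void main(String[] args) {
        // e.g. a container limited to 1.5 CPUs via cgroups sizes pools as if it had 2
        System.out.println(allocatedProcessors(1.5)); // 2
        System.out.println(allocatedProcessors(0.5)); // 1
    }
}
```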
When calling RuntimeField.parseRuntimeFields() for fields defined in the
search request, we need to wrap the Map containing field definitions in another
Map that supports value removal, so that we don't inadvertently remove the
definitions from the root request. CompositeRuntimeField was not doing this
extra wrapping, which meant that requests that went to multiple shards and
that therefore parsed the definitions multiple times would throw an error
complaining that the fields parameter was missing, because the root request
had been modified.
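A minimal sketch of the problem and the fix, assuming a parser that consumes (removes) entries from the map it is handed; the class and method names below are illustrative, not the real ones:
```java
import java.util.HashMap;
import java.util.Map;

public final class RuntimeFieldWrappingExample {

    // Stand-in for a parser that removes definitions from the map as it consumes them.
    static void parseRuntimeFields(Map<String, Object> definitions) {
        definitions.keySet().removeIf(ignored -> true);
    }

    public static void main(String[] args) {
        Map<String, Object> rootRequest = new HashMap<>();
        rootRequest.put("day_of_week", Map.of("type", "keyword"));

        // Each shard-level parse gets its own removal-friendly copy, so consuming
        // the definitions does not mutate the root request.
        parseRuntimeFields(new HashMap<>(rootRequest)); // shard 1
        parseRuntimeFields(new HashMap<>(rootRequest)); // shard 2

        // The root request still has its field definitions.
        System.out.println(rootRequest.containsKey("day_of_week")); // true
    }
}
```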
It's possible for a cluster state update task to emit deprecation
warnings, but if the task is executed in a batch then these warnings
will be exposed to the listener for every item in the batch. With this
commit we introduce a mechanism for tasks to capture just the warnings
relevant to them, along with assertions that warnings are not
inadvertently leaked back to the master service.
Closes #85506
Today if the GCS credentials file setting is invalid we report some kind
of JSON parsing error but it's not clear what JSON is being parsed so
the error is hard to track down. This commit adds the problematic
setting name to the exception message.
Replaces the two arguments to `ClusterStateTaskExecutor#execute` with a
parameter object called `BatchExecutionContext` so that #85525 can add a
new and rarely-used parameter without generating tons of noise.
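A hedged sketch of the parameter-object shape (the real `BatchExecutionContext` has different members and lives alongside the cluster state task infrastructure):
```java
import java.util.List;

public final class BatchExecutionContextSketch {

    record ClusterState(long version) {}
    record TaskContext<T>(T task) {}

    // Before: execute(ClusterState currentState, List<TaskContext<T>> taskContexts).
    // After: a single carrier object, so new (rarely-used) parameters can be added
    // without touching every executor implementation.
    record BatchExecutionContext<T>(ClusterState initialState, List<TaskContext<T>> taskContexts) {}

    interface ClusterStateTaskExecutor<T> {
        ClusterState execute(BatchExecutionContext<T> batchExecutionContext) throws Exception;
    }
}
```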
Adds `WriteScript` as the common base class for the write scripts: `IngestScript`, `UpdateScript`, `UpdateByQueryScript` and `ReindexScript`.
This pulls the common `getCtx()` and `metadata()` methods into the base class and prepares for the implementation of the ingest fields api (https://github.com/elastic/elasticsearch/issues/79155).
As part of the refactor, `IngestScript` now takes a `CtxMap` directly rather than taking "sourceAndMetadata" (`CtxMap`) and `Metadata` (from `CtxMap`). There is a new `getCtxMap()` getter to get the typed `CtxMap`. `getSourceAndMetadata` could have been refactored to do this, but most of its callers don't need to know about `CtxMap` and are happy with a `Map<String, Object>`.
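A simplified sketch of the resulting class shape, with placeholder `CtxMap`/`Metadata` types (the real ones are considerably richer):
```java
import java.util.HashMap;
import java.util.Map;

class Metadata {
    private final Map<String, Object> map = new HashMap<>();
    String getIndex() { return (String) map.get("_index"); }
}

class CtxMap extends HashMap<String, Object> {
    private final Metadata metadata = new Metadata();
    Metadata getMetadata() { return metadata; }
}

abstract class WriteScript {
    private final CtxMap ctxMap;

    WriteScript(CtxMap ctxMap) {
        this.ctxMap = ctxMap;
    }

    /** Exposed to scripts as `ctx`; callers only need a Map<String, Object>. */
    public Map<String, Object> getCtx() {
        return ctxMap;
    }

    public Metadata metadata() {
        return ctxMap.getMetadata();
    }

    /** Typed access for subclasses that need the CtxMap itself. */
    protected CtxMap getCtxMap() {
        return ctxMap;
    }
}

class IngestScript extends WriteScript {
    IngestScript(CtxMap ctxMap) {
        super(ctxMap);
    }
}
```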
Removing the custom dependency checksum functionality in favor of Gradle's built-in dependency verification support.
- Use sha256 instead of sha1, as sha1 is no longer considered safe.
Closes https://github.com/elastic/elasticsearch/issues/69736
Part of #84369. Implement the `Tracer` interface by providing a
module that uses OpenTelemetry, along with Elastic's APM
agent for Java.
See the file `TRACING.md` for background on the changes and the
reasoning for some of the implementation decisions.
The configuration mechanism is the most fiddly part of this PR. The
Security Manager permissions required by the APM Java agent make
it prohibitive to start an agent from within Elasticsearch
programmatically, so it must be configured when the ES JVM starts.
That means that the startup CLI needs to assemble the required JVM
options.
To complicate matters further, the APM agent needs a secret token
in order to ship traces to the APM server. We can't use Java system
properties to configure this, since otherwise the secret will be
readable to all code in Elasticsearch. It therefore has to be
configured in a dedicated config file. This in itself is awkward,
since we don't want to leave secrets in config files. Therefore,
we pull the APM secret token from the keystore, write it to a config
file, then delete the config file after ES starts.
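As a sketch of that flow (illustrative only; the real startup code differs, and the `secret_token` property name and file handling here are assumptions):
```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermissions;

public final class ApmAgentConfigSketch {

    // Write the keystore-sourced token into a short-lived, owner-only properties
    // file (POSIX permissions assumed) that the agent is pointed at via a JVM
    // option assembled by the startup CLI.
    static Path writeTemporaryAgentConfig(String secretToken) throws Exception {
        Path config = Files.createTempFile(
            "elasticapm", ".properties",
            PosixFilePermissions.asFileAttribute(PosixFilePermissions.fromString("rw-------"))
        );
        Files.writeString(config, "secret_token=" + secretToken + System.lineSeparator());
        return config;
    }

    // Once Elasticsearch has started (and the agent has read its config), delete
    // the file so the secret does not linger on disk.
    static void deleteAfterStartup(Path config) throws Exception {
        Files.deleteIfExists(config);
    }
}
```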
There's a further issue with the config file. Any options we set
in the APM agent config file cannot later be reconfigured via system
properties, so we need to make sure that only "static" configuration
goes into the config file.
I generated most of the files under `qa/apm` using an APM test
utility (I can't remember which one now, unfortunately). The goal
is to set up a complete system so that traces can be captured in
APM server, and the results inspected in Elasticsearch.
This change converts the range query from an array to an object.
```
"range": {
  "number": [
    {
      "gte": 4
    }
  ]
}
```
to
```
"range": {
  "number": {
    "gte": 4
  }
}
```
This change adds a SourceValueFetcherSortedDoubleIndexFieldData to support double doc values types for source fallback. This also adds support for double, float and half_float field types.
This change adds source fallback support for byte, short, and long fields. These use the already
existing class SourceValueFetcherSortedNumericIndexFieldData.
This removes many calls to the last remaining `createParser` method that
I deprecated in #79814, migrating callers to one of the new methods that
it created.
There were some cases where synthetic source wasn't properly rounding in
round trips. `0.15527719259262085` with a scaling factor of
`2.4206374697469164E16` was round tripping to `0.15527719259262088`,
which then round trips up to `0.1552771925926209`, rounding the wrong
direction! This fixes the round tripping in this case through ever more
paranoid double checking and nudging.
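As a rough illustration of the kind of round trip involved (a simplified sketch, not the actual scaled_float code or the fix):
```java
public final class ScaledFloatRoundTripSketch {
    public static void main(String[] args) {
        double scalingFactor = 2.4206374697469164E16;
        double value = 0.15527719259262085;

        // scaled_float stores roughly round(value * scalingFactor) as a long ...
        long encoded = Math.round(value * scalingFactor);
        // ... and synthetic source reads it back as encoded / scalingFactor, which
        // for the values above was reported to drift to 0.15527719259262088.
        double decoded = encoded / scalingFactor;

        // Re-encoding the drifted double can produce a different long, which is
        // why the fix adds extra double checking and nudging.
        long reEncoded = Math.round(decoded * scalingFactor);
        System.out.println(encoded + " -> " + decoded + " -> " + reEncoded);
    }
}
```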
Closes #88854
Adds metadata classes for Reindex and UpdateByQuery contexts.
For Reindex metadata:
* _index can't be null
* _id, _routing and _version are writable and nullable
* _now is read-only
* op is read-write and must be 'noop', 'index' or 'delete'
Reindex metadata keeps the original value for _index, _id, _routing and _version
so that `Reindexer` can see if they've changed.
If _version is null in the ctx map, or, equivalently, the augmentation
`setVersionToInternal()` was called by the script, `Reindexer` sets document
versioning to internal. If `_version` is `null` in the ctx map, `getVersion`
returns `Long.MIN_VALUE`.
For UpdateByQuery metadata:
* _index, _id, _version, _routing are all read-only
* _routing is also nullable
* _now is read-only
* op is read-write and one of 'index', 'noop', 'delete'
Closes: #86472
This change adds an operation parameter to FieldDataContext that allows us to specialize the field data returned from fielddataBuilder in MappedFieldType. Keyword, integer, and geo point field types now support source fallback, where we build a doc values wrapper using source if doc values do not exist for the field under the SCRIPT operation. This allows us to have source fallback in scripting for the scripting fields API.
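A hedged sketch of the idea, using illustrative types rather than the real `FieldDataContext`/`MappedFieldType` signatures:
```java
public final class FielddataOperationSketch {

    enum FielddataOperation { SEARCH, SCRIPT }

    record FieldDataContext(String fullyQualifiedIndexName, FielddataOperation operation) {}

    interface IndexFieldData {}
    static final class DocValuesFieldData implements IndexFieldData {}
    static final class SourceValueFetcherFieldData implements IndexFieldData {}

    // When the field has doc values, use them; otherwise fall back to a
    // doc-values-like wrapper built from _source, but only for SCRIPT access.
    static IndexFieldData fielddataBuilder(FieldDataContext context, boolean hasDocValues) {
        if (hasDocValues) {
            return new DocValuesFieldData();
        }
        if (context.operation() == FielddataOperation.SCRIPT) {
            return new SourceValueFetcherFieldData();
        }
        throw new IllegalArgumentException("Fielddata is not supported on this field without doc values");
    }
}
```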
The value is `_now`, and there was a previous metadata
value `_timestamp` (see test removal in #88733), so the
name was confusing.
Also renames the method `getTimestamp()` to `getNow()`
to reflect the change.
This formats the result of the `fields` section of the `_search` API for
runtime `geo_point` fields using the `format` parameter like we do for
non-runtime `geo_point` fields. This changes the default format for
those fields from `lat, lon` to `geojson` with the option to get `wkt`
or any other format we support.
The fix does so by preserving the `double, double` nature of the
`geo_point` rather than encoding it immediately in the script, so callers can
use the results directly. The field fetchers use the `double, double` values natively,
preserving as much precision as possible. The queries quantize the points
exactly as Lucene indexing does, and as the script did before this PR.
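For illustration, here is a minimal sketch of formatting a raw `(lat, lon)` pair as GeoJSON or WKT (illustrative helpers, not the real formatters):
```java
import java.util.Locale;

public final class GeoPointFormatSketch {

    // GeoJSON uses [lon, lat] coordinate order.
    static String geojson(double lat, double lon) {
        return String.format(Locale.ROOT, "{\"type\":\"Point\",\"coordinates\":[%s,%s]}", lon, lat);
    }

    // WKT also puts longitude first: POINT (lon lat).
    static String wkt(double lat, double lon) {
        return String.format(Locale.ROOT, "POINT (%s %s)", lon, lat);
    }

    public static void main(String[] args) {
        double lat = 41.12, lon = -71.34;
        System.out.println(geojson(lat, lon)); // the new default format
        System.out.println(wkt(lat, lon));     // available via the `format` parameter
    }
}
```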
Closes #85245
Allow UpdateByQuery to read the doc version if set in the request via
`version=true`.
If `version=true` is unset or false, `ctx._version` is `-1`,
indicating internal versioning via sequence numbers.
Fixes: #55745
MappedFieldType#fieldDataBuilder() currently takes two parameters, a fully qualified
index name and a supplier for a SearchLookup. We expect to add more parameters here
as we add support for loading fielddata from source. Rather than telescoping the
parameter list, this commit instead introduces a new FieldDataContext carrier object
which will allow us to add to these context parameters more easily.
In #88015 we made it so that downloads from S3 would sometimes retry
more than the configured limit, if each attempt seemed to be making
meaningful progress. This causes the failure of some assertions that the
number of retries was exactly as expected. This commit weakens those
assertions for S3 repositories.
Closes #88784
Closes #88666
Part of #84369. Split out from #87696. Introduce tracing interfaces in
advance of adding APM support to Elasticsearch. The only implementation
at this point is a no-op class.
This PR adds a new `knn` option to the `_search` API to support ANN search.
It's powered by the same Lucene ANN capabilities as the old `_knn_search`
endpoint. The `knn` option can be combined with other search features like
queries and aggregations.
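For example, a request body combining `knn` with a query might look roughly like this (sketched as a plain map rather than a specific client API; field names such as `image_vector` are made up):
```java
import java.util.List;
import java.util.Map;

public final class KnnSearchRequestSketch {

    // Roughly the shape of a _search body that mixes approximate kNN with a query.
    static Map<String, Object> requestBody() {
        return Map.of(
            "knn", Map.of(
                "field", "image_vector",
                "query_vector", List.of(0.12, -0.33, 0.84),
                "k", 10,
                "num_candidates", 100
            ),
            "query", Map.of("match", Map.of("title", "mountain lake")),
            "size", 10
        );
    }
}
```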
Addresses #87625
If we run into a seed that causes many fake exceptions and thus retries,
a 100ms retry interval will add up to minutes of test time for tests like
`testLargeBlobCountDeletion` that trigger thousands of requests.
As far as I can see there's no reason not to speed this up by 10x via
more aggressive retry timings, so I reduced the timings to avoid randomly
blocked tests.
Adds the `metadata()` API call and a Metadata class for the Update context.
There are different metadata available in the update context depending
on whether it is an update or an insert (via upsert).
For update, scripts can read `index`, `id`, `routing`, `version` and `timestamp`.
For insert, scripts can read `index`, `id` and `timestamp`.
Scripts can always read and write the `op` but the available ops are different.
Updates allow 'noop', 'index' and 'delete'.
Inserts allow 'noop' and 'create'.
Refs: #86472
Currently we have two parameters that control how the source of a document
is stored, `enabled` and `synthetic`, both booleans. However, there are only
three possible combinations of these, with `enabled:false` and `synthetic:true`
being disallowed. To make this easier to reason about, this commit replaces
the `enabled` parameter with a new `mode` parameter, which can take the values
`stored`, `synthetic` and `disabled`. The `mode` parameter cannot be set
in combination with `enabled`, and we will subsequently move towards
deprecating `enabled` entirely.
Create a `Metadata` superclass for ingest and update contexts.
Create a `CtxMap` superclass for `ctx` backwards compatibility in ingest and update contexts. `script.CtxMap` was moved from `ingest.IngestSourceAndMetadata`.
`CtxMap` takes a `Metadata` subclass and validates updates via the `FieldProperty`s passed in.
`Metadata` provides typed getters and setters and implements a `Map`-like interface, making it easy for a class containing `CtxMap` to implement the full `Map` interface.
The `FieldProperty` record configures how to validate fields: fields have a `type`, are `writeable` or read-only, `nullable` or not, and may have an additional validation, useful for Set/Enum validation.
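A condensed sketch of how such per-field validation might fit together (illustrative only; the real `FieldProperty`, `Metadata` and `CtxMap` differ in detail):
```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public final class MetadataValidationSketch {

    // One property per metadata field: expected type, writability, nullability,
    // plus an optional allowed-value set (handy for fields like `op`).
    record FieldProperty(Class<?> type, boolean writable, boolean nullable, Set<Object> allowedValues) {
        void check(String key, Object newValue) {
            if (!writable) {
                throw new IllegalArgumentException(key + " is read-only");
            }
            if (newValue == null) {
                if (!nullable) {
                    throw new IllegalArgumentException(key + " cannot be null");
                }
                return;
            }
            if (!type.isInstance(newValue)) {
                throw new IllegalArgumentException(key + " must be a " + type.getSimpleName());
            }
            if (allowedValues != null && !allowedValues.contains(newValue)) {
                throw new IllegalArgumentException(key + " cannot be set to [" + newValue + "]");
            }
        }
    }

    // A minimal map-backed Metadata with typed getters; writes are validated
    // against the properties above.
    static final class Metadata {
        static final Map<String, FieldProperty> PROPERTIES = Map.of(
            "_index", new FieldProperty(String.class, true, false, null),
            "op", new FieldProperty(String.class, true, false, Set.of("noop", "index", "delete"))
        );

        private final Map<String, Object> map = new HashMap<>();

        Object put(String key, Object value) {
            FieldProperty property = PROPERTIES.get(key);
            if (property == null) {
                throw new IllegalArgumentException("unknown metadata field [" + key + "]");
            }
            property.check(key, value);
            return map.put(key, value);
        }

        String getIndex() { return (String) map.get("_index"); }
        String getOp() { return (String) map.get("op"); }
    }
}
```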
Pull out the implementation of `Metadata` from `IngestSourceAndMetadata`.
`Metadata` will become a base class extended by the update contexts: ingest, update, update by query and reindex.
`Metadata` implements a map-like interface, making it easy for a class containing `Metadata` to implement the full `Map` interface.