This updates the gradle wrapper to 8.12
We addressed deprecation warnings arising from the update, including:
- Fix change in TestOutputEvent API
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions
(cherry picked from commit ba61f8c7f7)
* Resolve pipelines from template if lazy rollover write (#116031)
If the data stream rollover-on-write flag is set in cluster state, resolve pipelines from templates rather than from metadata. This fixes the following bug: when a pipeline reroutes every document to another index, and rollover is called with lazy=true (setting the rollover-on-write flag), changes to the pipeline do not take effect, because the lack of writes means the data stream never rolls over and the pipelines in metadata are never updated. The fix is to resolve pipelines from templates whenever the lazy rollover flag is set. To improve efficiency we resolve pipelines only once per index in the bulk request, caching the value and reusing it for other requests to the same index.
Fixes: #112781
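The once-per-index caching described above can be sketched roughly as follows. This is a minimal illustration, not the actual Elasticsearch code; the class and method names are made up, and the `resolver` function stands in for the expensive template-based resolution.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: resolve pipelines once per concrete index in a
// bulk request and cache the result, so later requests in the same bulk
// that target the same index reuse the cached value.
class PipelineCache {
    private final Map<String, String> resolved = new HashMap<>();

    // 'resolver' stands in for resolving pipelines from index templates.
    String pipelineFor(String index, Function<String, String> resolver) {
        return resolved.computeIfAbsent(index, resolver);
    }
}
```

With this shape, a bulk request of thousands of documents aimed at one index pays the template-resolution cost only once.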
* Remute tests block merge
* Use directory name as project name for libs (#115720)
The libs projects are configured to all begin with `elasticsearch-`.
While this is desirable for the artifacts to contain this consistent
prefix, it means the project names don't match up with their
directories. Additionally, it creates complexities for subproject naming
that must be manually adjusted.
This commit adjusts the project names for those under libs to be their
directory names. The resulting artifacts for these libs are kept the
same, all beginning with `elasticsearch-`.
* fixes
It is English in the docs, so this fixes the code to match the docs. Note that this really impacts Elasticsearch when run on JDK 23 with the CLDR locale database, as in the COMPAT database pre-23, root and en are essentially the same.
The addition of the logger requires several updates to tests to deal with the possible warning, or muting them if there is no way to specify an allowed (but not mandatory) warning.
This PR starts tracking raw ingest and storage size separately for updates by document.
This is done capturing the ingest size when initially parsing the update, and storage size when
parsing the final, merged document.
Additionally this renames DocumentSizeObserver to XContentParserDecorator / XContentMeteringParserDecorator
to make the code easier to reason about. More renaming will follow.
---------
Co-authored-by: Przemyslaw Gomulka <przemyslaw.gomulka@elastic.co>
Deprecates for removal the following methods from `ClusterAdminClient`:
- `prepareSearchShards`
- `preparePutStoredScript`
- `prepareDeleteStoredScript`
- `prepareGetStoredScript`
Also replaces all usages of these methods with more suitable test
utilities. This will permit their removal, and the removal of the
corresponding `RequestBuilder` objects, in a followup.
Relates #107984
Reconstruct the indices set in the BulkRequest constructor so that the correct thread pool can be used for forwarded bulk requests. Before this fix, forwarded bulk requests always used the system_write thread pool because the indices set was empty.
Fixes issue https://github.com/elastic/elasticsearch/issues/102792
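The effect of the empty set can be sketched as below. This is an illustrative model only, with invented names, not the actual Elasticsearch executor-selection code.

```java
import java.util.Set;

// Hypothetical sketch of the bug described above: the executor is chosen
// from the set of indices in the bulk request. If that set is lost when a
// request is forwarded (left empty), there is no evidence of user indices,
// so everything falls through to the system_write pool.
class ExecutorChoice {
    static String executorFor(Set<String> indices, Set<String> systemIndices) {
        boolean onlyUserIndices = !indices.isEmpty()
            && indices.stream().noneMatch(systemIndices::contains);
        return onlyUserIndices ? "write" : "system_write";
    }
}
```

Rebuilding the indices set in the constructor means the forwarded request carries the same information as the original one, so the same branch is taken on both nodes.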
Addresses a performance issue in the date ingest processor where Mustache template evaluation was unnecessarily applied inside a loop. The timezone and locale templates are now evaluated once, before the loop, improving efficiency.
Closes #110191
---------
Co-authored-by: Joe Gallo <joegallo@gmail.com>
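The loop-invariant hoisting described above can be sketched as follows. The names and the fake template renderer are illustrative only, not the real date processor code.

```java
import java.util.List;

// Illustrative sketch: evaluate the timezone and locale templates once per
// document, before the per-format loop, instead of on every iteration.
class TemplateHoisting {
    static int evaluations = 0;

    // Stands in for a Mustache template render (the expensive step).
    static String renderTemplate(String template) {
        evaluations++;
        return template.toUpperCase();
    }

    static String parse(List<String> formats) {
        // Hoisted out of the loop: rendered once, reused for every format.
        String timezone = renderTemplate("utc");
        String locale = renderTemplate("en");
        StringBuilder out = new StringBuilder();
        for (String format : formats) {
            out.append(format).append('|').append(timezone).append('|').append(locale).append('\n');
        }
        return out.toString();
    }
}
```

For a document tried against N candidate formats, this drops template evaluations from 2N to 2.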
This commit refactors the metering for the billing API so that we can hide the implementation details of DocumentSizeObserver creation, and adds an additional field `originatesFromScript` on IndexRequest.
There will no longer be a need for code that checks whether the request was already parsed in IngestService or UpdateHelper. This logic will be hidden in the implementation.
In order to decide what logic to apply when reporting a document size, we need to know whether an index is in time_series mode. This information is in indexSettings.mode.
Change the ingest byte stats to always be returned
whether or not they have a value of 0. Add human readable
form of byte stats. Update docs to reflect changes.
Add test confirming that pipelines are run after a reroute.
Fix test of two-stage reroute. Delete pipelines during teardown
so as not to break other tests using the same pipeline name.
Co-authored-by: Joe Gallo <joegallo@gmail.com>
The current ingest byte stat fields could easily be confused.
Add more descriptive names to make it clear that they do not
count all docs processed by the pipeline.
Add ingested_in_bytes and produced_in_bytes stats to pipeline ingest stats.
These track how many bytes are ingested and produced by a given pipeline.
For efficiency, these stats are recorded for the first pipeline to process a
document. Thus, if a pipeline is called as a final pipeline after a default pipeline,
as a pipeline processor, or after a reroute request, a document will not
contribute to the stats for that pipeline. If a given pipeline has 0 bytes recorded
for both of these stats, due to not being the first pipeline to run any doc, these
stats will not appear in the pipeline's entry in ingest stats.
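The first-pipeline-only accounting described above can be modeled as below. This is a hypothetical sketch with invented names, not the actual IngestStats implementation.

```java
// Hypothetical sketch: ingested/produced byte stats are recorded only for
// the first pipeline to process a document, so later pipelines in the chain
// (final pipeline, pipeline processor, post-reroute) do not double-count
// the same bytes.
class PipelineByteStats {
    long ingestedBytes;
    long producedBytes;

    // 'firstPipeline' is true only for the first pipeline to see the doc.
    void record(boolean firstPipeline, long in, long out) {
        if (firstPipeline) {
            ingestedBytes += in;
            producedBytes += out;
        }
    }

    // Zero-valued stats are omitted from the pipeline's ingest-stats entry.
    boolean appearsInStats() {
        return ingestedBytes > 0 || producedBytes > 0;
    }
}
```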
The RA metric can be implemented so that sizes are reported before documents are indexed (e.g. via a new field being added),
or accumulated and reported upon shard commit as additional metadata.
This commit adds a new method, DocumentSizeReporter#onParsingCompleted, and a DocumentSizeAccumulator that is used to accumulate the size in between commits. DocumentSizeReporter can be parametrised with a DocumentSizeAccumulator.
based on #108449
Previously, DocumentSizeReporter reported upon indexing completion in TransportShardBulkAction#onComplete.
This commit renames the method to onIndexingCompleted and moves the reporting to IndexEngine in the serverless plugin.
This will be followed up in a separate PR that will report in an Engine#index subclass (serverless).
PipelineProcessors with non-existent pipelines should succeed (as a noop)
if ignore_missing_pipeline=true. Currently, this does not work when pipelines are
simulated with verbose=true. In this case, an error is returned and no results
are shown for subsequent processors. This change allows the following processors
to run, and changes the status from error to error_ignored.
The size of the document used to trigger a StackOverflowError in
GsubProcessorTests.testStackOverflow() was just large enough to cause it
on a mac. On the linux CI boxes, occasionally it does not cause a
StackOverflowError, and as a result the test fails. This change makes
the document more than 3x larger, making a StackOverflowError
guaranteed. Closes #107416
To simplify the migration away from version based skip checks in YAML specs,
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.
New test specs for 8.14 or later are expected to use respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if sufficient.
For system indices we don't want to emit metrics. DocumentSizeReporter will be created given an index; it will internally contain a SystemIndices instance that verifies the index name with isSystemName.
Just a suggestion. I think this would save us a bit of memory here and
there. We have loads of places where always-true lambdas are used
with `Predicate.or/and`. I found this initially when looking into field
caps performance, where we used to heavily compose these, but many spots
in security and index name resolution also gain from these predicates.
The better toString also helps in some cases at least when debugging.
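A minimal sketch of the idea follows. The actual Elasticsearch utility may differ; the class name and exact shape here are illustrative.

```java
import java.util.function.Predicate;

// Sketch: a shared always-true predicate whose or/and overrides fold the
// composition instead of allocating wrapper lambdas, plus a readable
// toString for debugging.
class Predicates {
    private static final Predicate<Object> ALWAYS = new Predicate<Object>() {
        @Override public boolean test(Object t) { return true; }
        // always().or(p) is still always true: return the singleton, no wrapper.
        @Override public Predicate<Object> or(Predicate<? super Object> other) { return this; }
        // always().and(p) behaves exactly like p: return p, no wrapper.
        @Override public Predicate<Object> and(Predicate<? super Object> other) {
            @SuppressWarnings("unchecked") Predicate<Object> p = (Predicate<Object>) other;
            return p;
        }
        @Override public String toString() { return "always()"; }
    };

    @SuppressWarnings("unchecked")
    static <T> Predicate<T> always() { return (Predicate<T>) ALWAYS; }
}
```

Because composition returns an existing object rather than a new lambda, repeated composition in hot paths (like field caps or index name resolution) allocates nothing.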
The `uri_parts` processor was behaving incorrectly for
URIs that included a dot in the path but did not have an extension.
Also includes YAML REST tests for the same.
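The corrected extension handling can be sketched like this. It is an illustrative model, not the actual `uri_parts` processor code: the point is that a dot in a directory segment must not be mistaken for a file extension.

```java
// Illustrative sketch: only a dot appearing after the last slash, i.e. in
// the final path segment, can mark a file extension.
class UriExtension {
    static String extension(String path) {
        int lastDot = path.lastIndexOf('.');
        int lastSlash = path.lastIndexOf('/');
        return lastDot > lastSlash ? path.substring(lastDot + 1) : "";
    }
}
```

So `/api/v1.2/users` yields no extension, while `/files/report.pdf` yields `pdf`.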
Update requests that send a document (or part of one) should allow for metering the size of that doc.
Update requests that use a script should not be metered; the reported size is 0.
This commit follows up on #104859.
The parsing of the update's document is done in UpdateHelper, following the same pattern we use to meter parsing in IngestService. If a script is used, the observed size will be 0.
The observed value is then reported in TransportShardBulkAction; because the value is 0 or positive, the modified document will not be metered again.
This commit also renames getDocumentParsingSupplier to getDocumentParsingProvider (this was accidentally omitted in #104859).
* Use single-char variant of String.indexOf() where possible
indexOf(char) is more efficient than searching for the same one-character String.
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
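A micro-illustration of the change: the two overloads below return the same result, but the `char` variant avoids the general string-search path for a one-character needle. The method names are made up for the example.

```java
// Both overloads find the same position; indexOf(char) is the cheaper call.
class IndexOfExample {
    static int findColonViaString(String s) { return s.indexOf(":"); } // String overload
    static int findColonViaChar(String s)   { return s.indexOf(':'); } // char overload
}
```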