Commit graph

153 commits

Author SHA1 Message Date
Rene Groeschke
4d17b2193a
Update Gradle wrapper to 8.12 (#118683) (#119357)
This updates the Gradle wrapper to 8.12.

We addressed the deprecation warnings caused by the update, including:

- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions

(cherry picked from commit ba61f8c7f7)

# Conflicts:
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/distribution/DockerCloudElasticsearchDistributionType.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/distribution/DockerUbiElasticsearchDistributionType.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/Fixture.java
#	plugins/repository-hdfs/hadoop-client-api/build.gradle
#	server/src/main/java/org/elasticsearch/inference/ChunkingOptions.java
#	x-pack/plugin/kql/build.gradle
#	x-pack/plugin/migrate/build.gradle
#	x-pack/plugin/security/qa/security-basic/build.gradle
2024-12-31 08:37:28 +01:00
Rene Groeschke
581b9ab7c0
[8.16] [Gradle] Remove static use of BuildParams (#115122) (#117434)
* [Gradle] Remove static use of BuildParams (#115122)

Static fields don't work well in Gradle with the configuration cache enabled.

- Use buildParams extension in build scripts
- Keep BuildParams.ci for now for easy serverless migration
- Tweak testing doc

(cherry picked from commit 13c8aaeffa)

# Conflicts:
#	TESTING.asciidoc
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/InternalDistributionBwcSetupPlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/RestTestBasePlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/compat/compat/AbstractYamlRestCompatTestPlugin.java
#	build.gradle
#	modules/ingest-geoip/qa/full-cluster-restart/build.gradle
#	qa/mixed-cluster/build.gradle
#	x-pack/plugin/ent-search/qa/full-cluster-restart/build.gradle
#	x-pack/plugin/eql/qa/rest/build.gradle
#	x-pack/plugin/fleet/qa/rest/build.gradle
#	x-pack/plugin/kql/build.gradle
#	x-pack/plugin/mapper-unsigned-long/build.gradle
#	x-pack/plugin/ml/qa/multi-cluster-tests-with-security/build.gradle
#	x-pack/plugin/security/qa/multi-cluster/build.gradle
#	x-pack/plugin/sql/qa/jdbc/build.gradle
#	x-pack/plugin/transform/qa/multi-cluster-tests-with-security/build.gradle

* Fix merge

* [Build] Fix fips testing after buildparams rework (#116934)

* More Cleanup

* [Build] Fix checkstyle exclusions on windows (#115185)

* More merge fixes

* Delete x-pack/plugin/kql/build.gradle
2024-11-27 12:34:32 +01:00
Jan Kuipers
6f3d15296a
Propagate scoring function through random sampler (#116957) (#117165)
* Propagate scoring function through random sampler.

* Update docs/changelog/116957.yaml

* Correct score mode in random sampler weight

* Fix random sampling with scores and p=1.0

* Unit test with scores

* YAML test

* Add capability
2024-11-21 03:03:55 +11:00
Oleksandr Kolomiiets
17022fdefc
[8.x] Allow stored source in logsdb and tsdb (#114454) (#114648)
* Allow stored source in logsdb and tsdb (#114454)

(cherry picked from commit a62228a744)

# Conflicts:
#	modules/aggregations/build.gradle
#	modules/data-streams/src/javaRestTest/java/org/elasticsearch/datastreams/logsdb/LogsIndexModeCustomSettingsIT.java
#	rest-api-spec/build.gradle

* Fix tests

* Fix tests

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-10-15 12:07:27 -07:00
David Turner
98209e44de
Simplify XContent output of epoch times (#114491) (#114736)
Today the overloads of `XContentBuilder#timeField` do two rather
different things: one formats an object as a `String` representation of
a time (where the object is either an unambiguous time object or else a
`long`) and the other formats only a `long` as one or two fields
depending on the `?human` flag.

This is trappy in a number of ways:

- `long` means an absolute (epoch) time, but sometimes folks will
  mistakenly use this for time intervals too.

- `long` means only milliseconds, there is no facility to specify a
  different unit.

- the dependence on the `?human` flag in exactly one of the overloads is
  kinda weird.

This commit removes the confusion by dropping support for considering a
`Long` as a valid representation of a time at all, and instead requiring
callers to either convert it into a proper time object or else call a
method that is explicitly expecting an epoch time in milliseconds.
2024-10-15 03:46:31 +11:00
Ignacio Vera
500b9c3926
Run fail formatting yaml test with 1 shard (#114214) (#114356)
# Conflicts:
#	muted-tests.yml

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-10-09 08:18:58 +11:00
Simon Cooper
1b01548425
[8.16] Change default locale of date mappers to ENGLISH (#112799) (#114210)
Backport #112799 to 8.16, for CLDR locale compatibility
2024-10-07 15:51:36 +01:00
Mark Vieira
0279c0a909
Add AGPLv3 as a supported license 2024-09-13 14:30:33 -07:00
Mark Vieira
24f33e95e8
Ensure rest compatibility tests are run when appropriate (#112526) 2024-09-05 08:22:48 -07:00
Armin Braun
bf7be8e23a
Save 400 LoC in tests by using indexSettings shortcut (#111573)
It's in the title, randomly saw a bunch of spots where we're
not using the shortcut, figured I'd clean this up quickly to save ~400 lines.
2024-08-05 10:21:13 +02:00
Simon Cooper
32c21beb3f
Collapse transport versions for 8.14.0 (#111199) 2024-07-24 10:40:35 +01:00
Armin Braun
10e2cc3c11
Dry up Bucket types in o.e.search.aggregations.bucket.histogram (#110303)
It's in the title, lots of duplication here that deserves cleanup
in isolation. Also, bucket instances are a perpetual source of
memory consumption in aggs. There are lots of possible improvements
we can make to reduce their footprint; drying up this code enables
cleaner PRs for these improvements.
2024-07-01 15:10:02 +02:00
Alexander Spies
2876e059f3
Aggs: Improve scripted metric agg allow list tests (#110153)
* Add an override to the aggs tests to override the allow list default setting. This makes it possible to run the scripted metric aggs tests on Serverless, even when we disallow these aggs by default on Serverless.
* Move the allow list tests next to the scripted metric tests since these belong together.
2024-06-28 11:47:30 +02:00
Armin Braun
c856314ea2
Make empty aggregation instances a little cheaper (#110190)
For sub-aggregations we were not properly sizing the array. Also we can
use singleton lists in a couple spots to save a little more memory.
2024-06-27 00:51:29 +10:00
Martijn van Groningen
ac6c0eecc1
Ensure synthetic source and dv codec are enabled with logs index mode (attempt 2). (#109382)
This was initially muted via #109365, because of a failing newly introduced assert.

Original PR #109269
2024-06-05 17:32:14 +02:00
Oleksandr Kolomiiets
f1153b1f8d
Revert "Ensure synthetic source and dv codec are enabled with logs index mode. (#109269)" (#109365)
This reverts commit 4161e4d2e2.
2024-06-04 12:45:08 -07:00
Martijn van Groningen
4161e4d2e2
Ensure synthetic source and dv codec are enabled with logs index mode. (#109269)
After running the elastic/logs track with logs index mode enabled, I noticed that _source was still getting stored.
The issue was that index modes other than time_series weren't propagated to the IndexMetadata and IndexSettings classes. Additionally, the synthetic source defaults in SourceFieldMapper were geared towards the time series index mode only. This change addresses both problems.
2024-06-04 16:06:19 +02:00
Moritz Mack
b71fc0c561
Migrate remaining usage of skip version in YAML specs to cluster_features (#108055) 2024-05-07 09:42:17 +02:00
Ignacio Vera
eea94ae66f
Optimise histogram aggregations for single value fields (#107893)
This commit optimises histogram aggregations for single-value fields.
2024-05-02 07:30:55 +02:00
eyalkoren
ee262954ee
Adding aggregations support for the _ignored field (#101373)
Enables aggregations on the _ignored metadata field replacing the stored field
with doc values.
2024-04-29 16:41:34 +02:00
Simon Cooper
f53f06ea88
Define transport version constant for 8.13 (#107951) 2024-04-29 09:53:51 +01:00
Ignacio Vera
f2fe71b938
Optimise time_series aggregation for single value fields (#107990)
Time series dimensions are by definition single-value fields. Therefore let's take advantage of that property in the time-series
aggregation and stop trying to iterate over dimension doc values. This change might bring better performance.
2024-04-29 08:54:04 +02:00
Ignacio Vera
04bf642d9f
Validate stats formatting in standard InternalStats constructor (#107678)
We want to validate stats formatting before we serialize to XContent, as chunked x-content serialization
assumes that we don't throw exceptions at that point. It is not necessary to do it in the StreamInput constructor,
as the input was serialised from an already checked object.

This commit adds stats formatting validation to the standard InternalStats constructor.
2024-04-24 16:30:21 +02:00
Ignacio Vera
78cba460a4
Increase size of big arrays only when there is an actual value in the aggregators (#107764)
During aggregation collection, we use BigArrays to hold the values in a compact way for metrics aggregations.
We are currently resizing those arrays whenever the collect method is called, regardless of whether there is an actual value
in the provided doc.

This can be wasteful for sparse fields, as we might never have a value but still resize those arrays.

Therefore this commit moves the resize after the check that there is a value in the provided document.
2024-04-24 10:24:35 +02:00
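The resize-after-check idea from #107764 above can be illustrated with a minimal sketch. It uses a plain `double[]` rather than Elasticsearch's BigArrays, and `SparseSumCollector`, `collect`, `capacity`, and `sum` are hypothetical names chosen for illustration, not the names used in the commit.

```java
import java.util.Arrays;

class SparseSumCollector {
    // Plain array standing in for a BigArrays-backed structure (assumption).
    private double[] sums = new double[1];

    // value is null when the document has no value for the field.
    void collect(int bucket, Double value) {
        if (value == null) {
            return; // nothing to record, so the array is not grown
        }
        // Grow only after confirming there is a value to store.
        if (bucket >= sums.length) {
            sums = Arrays.copyOf(sums, Math.max(bucket + 1, sums.length * 2));
        }
        sums[bucket] += value;
    }

    int capacity() {
        return sums.length;
    }

    double sum(int bucket) {
        return bucket < sums.length ? sums[bucket] : 0.0;
    }
}
```

With this ordering, a sparse field that never produces a value for a high bucket ordinal never triggers an allocation for it.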
Armin Braun
417c4d1505
Minor cleanup Aggregations Module (#107694)
Just some random finds of unused code from researching memory things.
2024-04-22 16:28:05 +02:00
Moritz Mack
1f5e04b721
Migrate YAML REST tests to synthetic cluster feature check (#107068)
To simplify the migration away from version based skip checks in YAML specs, 
this PR adds a synthetic version feature `gte_vX.Y.Z` for any version at or before 8.14.0.

New test specs for 8.14 or later are expected to use respective new cluster features,
or a test-only feature supplied via ESRestTestCase#createAdditionalFeatureSpecifications
if sufficient.
2024-04-11 18:22:38 +02:00
Ignacio Vera
de171b8f88
Use merge sort instead of hashing to avoid performance issues with many buckets (#107218) 2024-04-10 08:35:09 +02:00
Ignacio Vera
47dbd611b7
Refactor MultiBucketAggregatorsReducer and DelayedMultiBucketAggregatorsReducer (#106725)
They are renamed to BucketReducer and DelayedBucketReducer, and they have a new property containing the prototype bucket.
2024-03-26 08:02:11 +01:00
Yang Wang
12e04e12da
Explicitly set number_of_shards to 1 in tests (#106707)
Some tests rely on the default number_of_shards to be 1. This may not
hold if the default number_of_shards changes. This PR removes that
assumption in the tests by explicitly configuring the number_of_shards
to 1 at index creation time.

Relates: #100171
Relates: ES-7911
2024-03-25 19:13:35 +11:00
Martijn van Groningen
ac4e2f43b7
Small time series agg improvement (#106288)
After tsid hashing was introduced (#98023), the time series aggregator generates the tsid (from all dimension fields) instead of using the value from the _tsid field directly. This generation of the tsid happens for every combination of time series, parent bucket and segment.

This change alters that by only generating the tsid once per time series and segment. This is done by locally recording the current tsid.
2024-03-13 17:03:13 +01:00
Ignacio Vera
1e445d9a4f
throw IllegalArgumentException instead of AggregationInitializationException when adding a sub-aggregation to a metric aggregation (#106074)
This avoids returning a 500 status code in such cases.
2024-03-07 17:17:11 +01:00
Ignacio Vera
2ba37ffc38
Reduce InternalAdjacencyMatrix in a streaming fashion (#105751) 2024-02-23 07:54:41 +01:00
Ignacio Vera
f396321d08
Reduce InternalAutoDateHistogram in a streaming fashion (#105740) 2024-02-22 16:10:03 +01:00
Ignacio Vera
fd17e0c8aa
Reduce InternalMatrixStats in a streaming fashion (#105389) 2024-02-15 08:49:39 +01:00
Dmitry Cherniachenko
e21a4874ab
Use String.replace() instead of replaceAll() for non-regexp replacements (#105127)
* Use String.replace() instead of replaceAll() for non-regexp replacements

When the arguments do not use regexp features, replace() is the more efficient option, especially the char variant.
2024-02-12 13:11:15 -05:00
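The difference behind #105127 above can be shown with standard `java.lang.String` methods. This is a minimal sketch; the class and method names (`ReplaceVsReplaceAll`, `dotsToUnderscores`, `dotsToUnderscoresRegex`) are hypothetical and not taken from the commit.

```java
import java.util.regex.Pattern;

class ReplaceVsReplaceAll {
    // Literal replacement: the char overload never involves the regex engine.
    static String dotsToUnderscores(String s) {
        return s.replace('.', '_');
    }

    // replaceAll treats its first argument as a regex, so a literal dot must be quoted.
    static String dotsToUnderscoresRegex(String s) {
        return s.replaceAll(Pattern.quote("."), "_");
    }

    public static void main(String[] args) {
        // prints index_number_of_shards
        System.out.println(dotsToUnderscores("index.number_of_shards"));
        // An unquoted "." matches every character; prints ___
        System.out.println("a.b".replaceAll(".", "_"));
    }
}
```

Besides the cost of regex handling, the second call in `main` shows why `replaceAll` is easy to misuse when the pattern is meant to be literal.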
Ignacio Vera
8f37ef977f
Remove abstract method InternalMultiBucketAggregation#reduceBucket (#105275) 2024-02-08 11:24:02 +01:00
Ignacio Vera
609e8059eb
Introduce an AggregatorReducer to reduce the footprint of aggregations in the coordinating node (#105207)
This commit adds an abstraction that performs reduction of InternalAggregations in a streaming fashion.
2024-02-08 09:30:54 +01:00
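As a rough illustration of what "reduction in a streaming fashion" means in #105207 above, the sketch below accepts partial results one at a time and keeps only a running state. The `Reducer` interface and `SumReducer` class are hypothetical simplifications and do not reproduce the actual AggregatorReducer API.

```java
class StreamingReduceSketch {
    // Hypothetical interface: accept partial results one by one, then produce the final value.
    interface Reducer<T> {
        void accept(T partial);

        T get();
    }

    // Keeps only a running total, so each partial result can be discarded after accept().
    static final class SumReducer implements Reducer<Long> {
        private long total;

        @Override
        public void accept(Long partial) {
            total += partial;
        }

        @Override
        public Long get() {
            return total;
        }
    }

    // Feeds per-shard results into the reducer without first collecting them into a list.
    static long reduce(Iterable<Long> shardResults) {
        Reducer<Long> reducer = new SumReducer();
        for (Long partial : shardResults) {
            reducer.accept(partial);
        }
        return reducer.get();
    }
}
```

The point of the shape is that memory use on the coordinating side depends on the reducer's state, not on the number of partial results held at once.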
Ignacio Vera
4d5416912b
Use an AbstractList to build the AggregationList for reduction (#105200)
We are building a list of InternalAggregations from a list of Buckets, therefore we can use an AbstractList to create the actual list and save some allocations.
2024-02-06 17:53:41 +01:00
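The AbstractList technique mentioned in #105200 above can be sketched as a read-only view over the buckets. `Bucket` here is a hypothetical record with a `String` standing in for the aggregations object; only `java.util.AbstractList` itself is a real API.

```java
import java.util.AbstractList;
import java.util.List;

class BucketListView {
    // Hypothetical stand-in for a bucket that carries its sub-aggregations.
    record Bucket(String key, String aggregations) {}

    // Returns a read-only view that maps each bucket to its aggregations on access,
    // instead of copying them into a new ArrayList.
    static List<String> aggregationsView(List<Bucket> buckets) {
        return new AbstractList<>() {
            @Override
            public String get(int index) {
                return buckets.get(index).aggregations();
            }

            @Override
            public int size() {
                return buckets.size();
            }
        };
    }
}
```

Because the view delegates to the backing list, no per-element copy is made; the trade-off is that the view reflects later changes to `buckets`.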
Martijn van Groningen
39eefb3197
Unmute TimeSeriesTsidHashCardinalityIT (#105121)
and reduce the number of time series in order to fix test related OOME.

Relates to #105104
2024-02-05 17:20:30 +01:00
Nhat Nguyen
40a61abb95
Awaits fix #105104 2024-02-03 18:34:03 -08:00
Salvatore Campagna
bdd3a4ffbe
Hash the tsid to overcome dimensions limits (#98023)
A Lucene limitation on doc values for UTF-8 fields does not allow us to
write keyword fields whose size is larger than 32K. This limits our
ability to map more than a certain number of dimension fields for time
series indices. Before this change, the tsid was created as a
concatenation of dimension field names and values in a keyword field.

To overcome this limitation we hash the tsid. This PR is intended to be
used as a draft to test different options.

Note that, as a side effect, this reduces the size of the tsid field,
since far less data is stored when the tsid is hashed. However, we
expect tsid hashing to affect compression of doc values and to result in
a larger storage footprint. The effect on query latency needs to be
evaluated too.

Resolves #93564
2024-02-01 08:25:31 -05:00
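To make the size argument in #98023 above concrete, here is a minimal sketch of hashing dimension names and values into a fixed-size identifier. SHA-256 and Base64 are assumptions chosen for illustration; the commit message does not say which hash or encoding Elasticsearch uses, and `TsidHashSketch` and `hashDimensions` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;

class TsidHashSketch {
    // Hashes sorted dimension name/value pairs into a fixed-size identifier.
    static String hashDimensions(Map<String, String> dimensions) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            // TreeMap gives a stable order, so the same dimensions always hash the same way.
            for (Map.Entry<String, String> e : new TreeMap<>(dimensions).entrySet()) {
                digest.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0); // separator so "ab"+"c" differs from "a"+"bc"
                digest.update(e.getValue().getBytes(StandardCharsets.UTF_8));
                digest.update((byte) 0);
            }
            return Base64.getUrlEncoder().withoutPadding().encodeToString(digest.digest());
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", ex);
        }
    }
}
```

However many dimensions are supplied, the result has the same length, which is what lifts the 32K limit on the concatenated form.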
Martijn van Groningen
81a49f1567
Restrict usage of certain aggregations when in sort order execution is required (#104665)
A number of aggregations that rely on deferred collection don't work
with the time series index searcher and will produce incorrect results.
These aggregation usages should fail. The documentation has been updated
to describe these limitations.

In the case of the multi terms aggregation, depth-first collection is
forcefully used when the time series aggregation is used. This behaviour
is in line with the terms aggregation.
2024-02-01 07:09:17 -05:00
Ignacio Vera
7c8bb145f1
Merge Aggregations into InternalAggregations (#104896)
This commit merges Aggregations into InternalAggregations in order to remove the unnecessary hierarchy.
2024-01-30 14:31:21 +01:00
Ignacio Vera
b07595c777
HasAggregations#getAggregations return InternalAggregators instead of Aggregators (#104864) 2024-01-29 17:00:11 +01:00
Ignacio Vera
79f801b754
Remove unused ParsedAggregation (#104848)
This abstraction was introduced to support the high level rest client and is not needed any more.
2024-01-29 14:59:11 +01:00
Ignacio Vera
72fe8b30d5
Remove ParsedAggregation from tests (#104790)
This commit removes any reference to ParsedAggregation from the test framework
2024-01-29 07:46:06 +01:00
Armin Braun
fc2bdc2fe3
Build sub aggregation buckets more lazily (#104762)
Build these more lazily, avoiding putting them in an array, and don't keep
an accidental reference to the aggregator itself.
2024-01-25 10:45:01 -05:00
Armin Braun
a72524a996
Remove some more unused XContent parsers (#98070)
Remove some more of these parsers that have become obsolete with the
HLRC going away.
2024-01-08 15:38:18 +01:00
David Turner
af916a0549
Rename StreamInput#readGenericMap (#104045)
`StreamInput#readMap()` is quite different from the other `readMap`
overloads, and pairs up with `StreamOutput#writeGenericMap`. This commit
renames it to avoid accidental misuse and so that the names line up
better between writer and reader.
2024-01-08 08:05:53 -05:00
Martijn van Groningen
850e88c589
Minor cleanup of org.elasticsearch.aggregations package (#102622)
* Convert Collections.sort() to List.sort()
* Use Map.computeIfAbsent()
* Use primitive double over Double
* Replace some lambdas with method references.
* Replace indexed for loops with foreach-style loops.
2023-11-28 08:08:01 +01:00
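The idioms listed in #102622 above are all standard JDK features. The following sketch shows each one in a small, made-up example; `CleanupIdioms`, `groupByLength`, and `totalLength` are hypothetical names and not code from the commit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CleanupIdioms {
    // Groups words by length, then sorts each group.
    static Map<Integer, List<String>> groupByLength(List<String> words) {
        Map<Integer, List<String>> groups = new HashMap<>();
        // foreach loop instead of an indexed for loop
        for (String word : words) {
            // Map.computeIfAbsent() instead of a get / null check / put sequence
            groups.computeIfAbsent(word.length(), k -> new ArrayList<>()).add(word);
        }
        for (List<String> group : groups.values()) {
            // List.sort() instead of Collections.sort()
            group.sort(Comparator.naturalOrder());
        }
        return groups;
    }

    // Primitive double instead of boxed Double; method reference instead of a lambda.
    static double totalLength(List<String> words) {
        double total = words.stream().mapToInt(String::length).sum();
        return total;
    }
}
```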