
75574 commits

Author SHA1 Message Date
Przemysław Witek
b1fb4100dd
[Transform] Do not log deduction-related warnings for transforms with disabled mapping deduction (#105138) 2024-02-06 10:07:35 +01:00
Larisa Motova
24a89e37c4
Fix write index resolution with aliases to TSDS's (#104440)
Currently when you try to index a document to a TSDS via an alias the
alias resolves to the latest backing index of the TSDS. This commit
delegates finding the write index to the original data stream the alias
points to.

Fixes #104189
2024-02-06 03:18:05 -05:00
Daniel Mitterdorfer
59946ed8f2
[Profiling] Always allow for CO2 and cost defaults (#105173)
There are two possibilities to retrieve flamegraph data:

* Via the native UI
* Via the APM integration

Depending on the scenario, different request parameters are set. While
we have improved the CO2 and cost calculation for the native UI, the
host id, which is required for an improved CO2 and cost calculation, is
not yet available for the APM integration.

So far we've not performed this calculation at all because there were
no associated host data for stacktraces. Consequently, we've returned
zero values in all cases. With this commit we associate "dummy" host
data so the CO2 and cost calculation falls back to default values. Once
a host id is available for that case as well, we will instead use the
improved calculations.
2024-02-06 08:45:06 +01:00
Ryan Ernst
2a298a7acc
Add replay diagnostic dir to system jvm options (#103535)
When hotspot encounters an error, it will emit a log file which can be
used for reproducing the error. This file is dumped to /tmp by default.
This commit configures the replay file to be alongside the hs_err file.
2024-02-05 20:58:53 -08:00
Ryan Ernst
b1e2cb8040
Deprecate client.type (#104574)
client.type existed from the days of the node client existing alongside
the java client. Since the node client no longer exists, it no longer
serves a purpose, and is already ignored. Yet the setting still exists.
This commit deprecates the client.type node setting.
2024-02-05 23:15:46 -05:00
Ryan Ernst
e8c2f4ffed
Provision min runtime version jdk for compilation (#105152)
This commit adjusts compile tasks to explicitly provision a Java
toolchain for the Java minimum runtime version. By doing so the Java
used by Gradle may be upgraded without the possibility of causing
spurious warnings from javac which could fail the build, such as when
new warnings are added in later JDK versions.
2024-02-05 19:19:17 -08:00
Costin Leau
ac09d75078
ESQL: Extend STATS command to support aggregate expressions (#104958)
Previously only aggregate functions (max/sum/etc..) were allowed inside
 the stats command. This PR allows expressions involving one or multiple
 aggregates to be used, such as:
 stats x = avg(salary % 3) + max(emp_no),
       y = min(emp_no / 3) + 10 - median(salary)
       by z = languages % 2

Improve verifier to not allow scalar functions over grouping for now
2024-02-05 19:08:10 -08:00
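To make the STATS example from #104958 concrete, here is a minimal Python sketch of what the two aggregate expressions compute per group. It is an illustration only, not how ES|QL evaluates the query: the function name `stats_by_parity` and the sample rows are hypothetical, and `emp_no / 3` is assumed to be integer division.

```python
from collections import defaultdict
from statistics import mean, median


def stats_by_parity(rows):
    """Emulate: stats x = avg(salary % 3) + max(emp_no),
                      y = min(emp_no / 3) + 10 - median(salary)
                by z = languages % 2
    Returns {z: (x, y)}. Integer division for emp_no / 3 is an assumption."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["languages"] % 2].append(row)

    result = {}
    for z, members in groups.items():
        salaries = [r["salary"] for r in members]
        emp_nos = [r["emp_no"] for r in members]
        # Each expression combines several aggregates over the same group.
        x = mean([s % 3 for s in salaries]) + max(emp_nos)
        y = min([e // 3 for e in emp_nos]) + 10 - median(salaries)
        result[z] = (x, y)
    return result


# Hypothetical sample data, not taken from the Elasticsearch test suite.
sample_rows = [
    {"emp_no": 3, "salary": 10, "languages": 1},
    {"emp_no": 6, "salary": 11, "languages": 2},
    {"emp_no": 9, "salary": 12, "languages": 3},
]
grouped = stats_by_parity(sample_rows)
```

The point of the sketch is that each output column is an arithmetic expression over several aggregates, which is what the commit says was previously not allowed.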
Nhat Nguyen
af163b2e04
Fix LuceneSourceOperatorStatusTests (#105169)
Closes #103774
2024-02-05 19:49:56 -05:00
Ryan Ernst
313c63681f
Adjust adoptium download url (#105161)
The url for downloading JDKs from adoptium appears to use the semver
version, not the openjdk version. I encountered this with a windows
build. The current url was
https://api.adoptium.net/v3/binary/version/jdk-17.0.9+9/windows/x64/jdk/hotspot/normal/eclipse?project=jdk
which returns a version not found error, while
https://api.adoptium.net/v3/binary/version/jdk-17.0.9+9.1/windows/x64/jdk/hotspot/normal/eclipse?project=jdk
correctly downloads the jdk.
2024-02-05 19:18:38 -05:00
Nhat Nguyen
40e0f1f817
Field-caps should read fields from up-to-dated shards (#105153)
I have seen scenarios in which field-caps return information from 
outdated shards. While this is probably acceptable for most cases, ESQL
query planning relies on field-caps, potentially leading to missing
data. The reason for this is that we don't check readAllowed when not
acquiring a searcher for cases without a filter. I don't expect too much
penalty in terms of performance with this change, but it helps avoid a
subtle issue for ESQL.

Closes #104809
2024-02-05 15:34:55 -08:00
Jonathan Buttner
e631f76017
Mute failing tests (#105156)
This PR mutes a couple tests that are flaky from a recent PR merge:
https://github.com/elastic/elasticsearch/pull/105037

For this issue: https://github.com/elastic/elasticsearch/issues/105155
2024-02-05 15:39:51 -05:00
Brian Seeders
37b57a411b
[ci] Allow CI to be triggered by old elasticmachine-style comment (#105154) 2024-02-05 15:38:19 -05:00
Ryan Ernst
b250f06b09
Add a gradle plugin for embedded providers (#105094)
x-content embeds its jackson implementation inside its jar. This commit
formalizes the setup for this embedding with a gradle plugin so that it
can be reused by other libs.
2024-02-05 15:21:52 -05:00
Keith Massey
67e9233e82
Cleaning up the new ingest builders (#105149)
Changing ingest builders to only hold a single representation for any given request field.
2024-02-05 13:50:56 -06:00
Jonathan Buttner
957419c164
[ML] Setting the request service queue capacity and allow it to be adjusted (#105037)
* Bounding queue capacity and allowing it to be adjusted

* Adding some deadlock tests

* Adding some more tests for the request executor and queue logic

* Adding debug message

* Retaining overflow items after capacity change

* Addressing feedback
2024-02-05 14:46:34 -05:00
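The bullets in #105037 describe a bounded queue whose capacity can change at runtime without losing items already queued. A rough Python sketch of that behavior follows. The class `AdjustableCapacityQueue` and its methods are hypothetical and say nothing about the actual Java implementation; in particular, keeping overflow items in place after a shrink is an assumption about what "retaining overflow items" means.

```python
from collections import deque


class AdjustableCapacityQueue:
    """Bounded FIFO queue whose capacity can be changed after creation."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._items = deque()

    def __len__(self):
        return len(self._items)

    def offer(self, item):
        # Reject new work when the queue is at or above its current capacity.
        if len(self._items) >= self._capacity:
            return False
        self._items.append(item)
        return True

    def poll(self):
        # Return the oldest item, or None if the queue is empty.
        return self._items.popleft() if self._items else None

    def set_capacity(self, capacity):
        # Assumption: shrinking never drops queued items. Anything beyond the
        # new capacity stays queued, and offers are rejected until it drains.
        self._capacity = capacity
```

With this design a capacity decrease only affects future `offer` calls, so no accepted request is silently discarded.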
James Baiera
9d3a645d59
Redirect failed ingest node operations to a failure store when available (#103481)
This PR updates the ingest service to detect if a failed ingest document was bound for a data stream configured 
with a failure store, and in that event, restores the document to its original state, transforms it with its failure 
information, and redirects it to the failure store for the data stream it was originally targeting.
2024-02-05 14:37:30 -05:00
Armin Braun
f879508834
Avoid building large CompositeByteBuf when sending transport messages (#105137)
We can avoid building composite byte buf instances on the transport
layer (they have quite a bit of overhead and make heap dumps more
complicated to read). There's no need to add another round of references
to the BytesReference components here. Just write these out as they come
in. This would allow for some efficiency improving follow-ups where we
can essentially release the pages that have passed the write pipeline.
To avoid having this explode the size of the queue for writes per
channel, I moved that to a linked list. The slowdown from a linked list
is irrelevant I believe. Mostly the queue is empty so it doesn't matter
or if it isn't empty, operations other than dequeuing are much more
important to performance in this logic anyway (+ Netty internally uses a
LL down the line anyway).

I would regard this as step-1 in making the serialisation here more lazy
like on the REST layer to avoid copying bytes to the outbound buffer
that we already have as `byte[]`.
2024-02-05 14:35:15 -05:00
Jonathan Buttner
fabcf70883
Switching evictor tests to use a deterministic queue (#105151) 2024-02-05 13:54:43 -05:00
Benjamin Trent
43362d5de5
Add new int8_flat and flat vector index types (#104872)
This adds two new vector index types:

- flat
- int8_flat

Both store the vectors in a flat space and search is brute-force over
the vectors in the index.   For the regular `flat` index, this can be
considered syntactic sugar that allows `knn` queries without having to
put indices within HNSW. 

For `int8_flat`, this allows float vectors to be stored in a flat
manner, but also automatically quantized.
2024-02-05 12:56:13 -05:00
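As a rough illustration of the two ideas in #104872, the Python sketch below shows a brute-force search that scores every stored vector, plus a simple linear scalar quantization of floats to int8. Both functions are hypothetical; the quantization formula in particular is a generic one chosen for the example and is not claimed to match what Elasticsearch or Lucene does for `int8_flat`.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def brute_force_knn(query, vectors, k):
    """Score every stored vector against the query (no graph, no index
    structure) and return the k best as (position, score) pairs."""
    scored = [(i, dot(query, v)) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def quantize_int8(vector, lo, hi):
    """Linearly map floats from [lo, hi] onto integers in [-128, 127].
    Values outside the range are clamped. Generic scheme, an assumption."""
    scale = 255.0 / (hi - lo)
    out = []
    for x in vector:
        clamped = min(max(x, lo), hi)
        out.append(int(round((clamped - lo) * scale)) - 128)
    return out
```

Brute force costs one score computation per stored vector, which is the trade-off a flat index accepts in exchange for not building an HNSW graph.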
Martijn van Groningen
4376bdb2f1
Adjust skip version for tsdb bwc tests that rely on _id / _tsid (#105144)
YAML tests executed in mixed clusters need to skip clusters that run 8.12.x or earlier versions. The YAML tests assume hashing-based time series ids, but if a node in the test cluster is on 8.12.x or earlier, pre-hashing time series ids may be used instead (depending on the version of the elected master node).

TSDB YAML tests that assert the _id or _tsid should be skipped if there are 8.12.x nodes in the mixed test cluster.
Rolling upgrade or full upgrade tests are better suited to asserting the _id or _tsid in this case, because the tests are set up prior to the upgrade and pre-8.12.x logic can be asserted in a more controlled way.

Closes #105129
2024-02-05 18:08:08 +01:00
Pat Whelan
2932500ce2
[Transform] return results in order (#105089)
* Transform: return results in order

Currently, when Transform searches over aggregations, it stores the
results in an unordered HashMap. This potentially rearranges search
results.

For example, if a user specifies an order in a search request, the
search response is in that order. But if the search request is
embedded in a Transform request, then Transform response will not
preserve the order and the result will look different.

With this change, Transform will always preserve the order of the
search response. A search embedded in a Transform should behave as
an unembedded search.

Closes #104847
2024-02-05 11:57:20 -05:00
Martijn van Groningen
39eefb3197
Unmute TimeSeriesTsidHashCardinalityIT (#105121)
and reduce the number of time series in order to fix test related OOME.

Relates to #105104
2024-02-05 17:20:30 +01:00
Keith Massey
617dad5d36
Reducing the memory usage of the new IndexRequestBuilder (#105091) 2024-02-05 09:32:52 -06:00
Nhat Nguyen
86c1fa2a6c
Avoid convert to string when parse resp in heap attack (#105109)
We've seen cases of OOM errors in the test runner process, which occur 
when we convert a response to a JSON string and then parse it. We can
directly parse from its input stream to avoid these OOM errors.
2024-02-05 07:16:25 -08:00
David Kilfoyle
6ae521bf12
[DOCS] Small fixes for the 'Installing Elasticsearch' page (#105034)
* [DOCS] Add link to on-prem install tutorial

* Move link to bottom of packages section

* Rearrange things according to suggestions

* Add another link on the 'Install Elasticsearch with RPM' page
2024-02-05 10:06:41 -05:00
Jedr Blaszyk
6054ca36cf
[Connector API] Support filtering by name, index name in list action (#105131) 2024-02-05 15:47:22 +01:00
Michael Peterson
a34174c224
Query timeouts should not be return 500 INTERNAL_SERVER_ERROR status code (#104868)
Created new Exception QueryPhaseTimeoutException, which returns RestStatus 504.

We considered the 408 status code, but decided that the official spec for that status doesn't
match this scenario, so 504 was considered the closest fit.
2024-02-05 08:44:43 -05:00
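The status mapping described in #104868 can be sketched in a few lines of Python. Only the exception name `QueryPhaseTimeoutException` and the 504 status come from the commit message; the `status` attribute and the `status_for` helper are hypothetical and do not mirror the Java `RestStatus` machinery.

```python
from http import HTTPStatus


class QueryPhaseTimeoutException(Exception):
    """Raised when the query phase times out; carries its own HTTP status."""
    status = HTTPStatus.GATEWAY_TIMEOUT  # 504


def status_for(exc):
    # Exceptions that do not declare a status fall back to 500.
    return getattr(exc, "status", HTTPStatus.INTERNAL_SERVER_ERROR)
```

Giving the timeout its own exception type is what lets it stop falling through to the generic 500 response.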
Jedr Blaszyk
62dc143ab5
[Connectors API] Fix bug with crawler configuration parsing and sync_now flag (#105024) 2024-02-05 14:37:36 +01:00
Ievgen Degtiarenko
e75ca48ece
Fix testListTasksWaitForCompletion (#104391)
This change attempts to fix testListTasksWaitForCompletion by setting barriers to
verify that the task started on all nodes and had a chance to list all running tasks
before canceling the TEST_TASK
2024-02-05 14:11:46 +01:00
Armin Braun
9f2d38856d
Simplify EQL logic that references SearchHit (#105060)
I tried to move this logic to use pooled SearchHit instances but it
turned out to be too complicated in one go, so simplifying obvious spots
here:

* ReversePayload is pointless, it just reverses the original payload.
* a number of listeners were unnecessary and could be expressed inline
  much more clearly
* moved some "unpooling" to later in the logic so that objects live for
  a shorter time and have fewer references to them
2024-02-05 11:56:36 +01:00
Dmitry Cherniachenko
9bc2a7045e
Minor grammar fixes (StreamInput.java) (#99857) 2024-02-05 10:48:41 +00:00
Martijn van Groningen
d6bbfc53bb
Unmute DownsampleActionIT#testRollupNonTSIndex(...) (#105116)
and add more logging for when test fails next time.

Relates to #103981
2024-02-05 11:41:49 +01:00
David Kyle
2a39c32fb0
Mute IndexRecoveryIT testDoNotInfinitelyWaitForMapping (#105125)
For #105122
2024-02-05 05:38:39 -05:00
Yang Wang
931f2c48c9
[Docs] Fix a doc bug for Flush API's force parameter (#105112)
The force parameter defaults to false instead of true.
2024-02-05 21:06:09 +11:00
David Turner
6a40c04cc1
More guidance in balance settings docs (#105119)
Today the docs on balancing settings describe what the settings all do
but offer little guidance about how to configure them. This commit adds
some extra detail to avoid some common misunderstandings and reorders
the docs a little so that more commonly-adjusted settings are mentioned
earlier.
2024-02-05 05:04:24 -05:00
Armin Braun
71c3f34ce5
Speedup slicing from ReleasableBytesReference some more (#105108)
We can speed up the slice operation quite a bit by speeding up skip for
the common case and passing the delegate as the basis for the stream (this neatly avoids
a multi-morphic call to `length` on the bytes reference).
Also, while we're at it, we can speed up the common-case read operation
the same way.
2024-02-05 10:19:53 +01:00
Jan Kuipers
b328982e37
Refactor DataExtractor summary (#105011)
* Empty getDataSummary methods

* Move chunking logic from DataSummary to Chunker.

* Move DataSummary to ScrollDataExtractor.

* Move DataSummary to AbstractAggregationDataExtractor.

* Remove unused code

* Make ChunkedDataExtractorContext a record

* Remove more unused code

* Add tests for getSummary()

* Implement CompositeAggregationDataExtractor::getSummary().

* More unused code

* Move shared code to DataExtractorQueryContext.

* Move shared code to DataExtractorUtils.

* Lint fixes

* Move checkForSkippedClusters to DataExtractorUtils

* Replace monkey patching by ArgumentCaptor.

* Add checkForSkippedClusters to AbstractAggregationDataExtractor::executeSearchRequest

* Fix DataSummary for rollups

* Add documentation
2024-02-05 10:08:12 +01:00
Kostas Krikellas
e85bb5afc3
Nest pass-through objects within objects (#105062)
* Fix test failure

https://gradle-enterprise.elastic.co/s/icg66i6mwnjoi

* Nest pass-through objects within objects

* Update docs/changelog/105062.yaml

* improve test
2024-02-05 09:31:13 +02:00
Yang Wang
552d2f563b
Expose OperationPurpose via CustomQueryParameter to s3 logs (#105044)
This PR adds the OperationPurpose as a custom query parameter for each
S3 request so that they are available in s3 access logs.

Resolves: ES-7750
2024-02-04 03:21:50 -05:00
Ryan Ernst
e1488a0fc7
Fix compilation of example rest handler (#105101) 2024-02-03 20:48:11 -08:00
Nhat Nguyen
40a61abb95
Awaits fix #105104 2024-02-03 18:34:03 -08:00
Keith Massey
64a790f8fe
Modifying ingest request builders (#104636)
This changes all of our ingest-related builders other than BulkRequestBuilder
(which was already changed in #104927) to inherit from ActionRequestLazyBuilder
(added in #104927) rather than ActionRequestBuilder. This means that the
requests will not be created until the builder's request() method is called, making
upcoming ref counting work much more feasible.
2024-02-02 15:53:44 -06:00
Armin Braun
7d9253ed35
Avoid creating map allocator when writing empty maps to StreamOutput (#105071)
It's in the title. We already have the size here and allocating the
iterator isn't free. In fact it's 10G of allocations during http_logs
indexing that we can avoid with a simple condition.
2024-02-02 20:50:26 +01:00
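The optimization in #105071 amounts to an early return once the size is known to be zero. A minimal Python sketch of that shape follows; the function `write_map` and the list used as an output buffer are hypothetical stand-ins for the Java `StreamOutput` code.

```python
def write_map(out, mapping):
    """Append a size prefix followed by each key and value to `out`."""
    out.append(len(mapping))
    if not mapping:
        # The size is already written and there are no entries,
        # so return before creating any iterator over the map.
        return
    for key, value in mapping.items():
        out.append(key)
        out.append(value)
```

The serialized form is unchanged: an empty map is still written as a size of zero.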
Moritz Mack
54088839b4
Do not enable APM agent 'instrument', it's not required for manual tracing. (#105055) 2024-02-02 18:13:00 +01:00
Slobodan Adamović
8b7c777b54
Validate settings before reloading JWT shared secret (#105070)
This PR adds missing validation before reloading JWT shared secret settings. 
The shared secret setting must always be configured when the client
authentication type is `shared_secret` and omitted when it's `none`.
2024-02-02 17:37:14 +01:00
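The rule stated in #105070 can be written as a small check. This Python sketch is illustrative only: the function name `validate_shared_secret`, its parameters, and the error messages are hypothetical, and only the two conditions come from the commit message.

```python
def validate_shared_secret(client_auth_type, shared_secret):
    """Raise ValueError if the shared secret does not match the auth type."""
    has_secret = bool(shared_secret)
    if client_auth_type == "shared_secret" and not has_secret:
        raise ValueError(
            "shared secret must be configured when client authentication "
            "type is shared_secret"
        )
    if client_auth_type == "none" and has_secret:
        raise ValueError(
            "shared secret must be omitted when client authentication "
            "type is none"
        )
```

Running a check like this before applying reloaded settings means an inconsistent combination is rejected up front instead of taking effect.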
Przemysław Witek
da7bed3584
[ML] Fix handling of ml.config_version node attribute (#105066) 2024-02-02 16:16:42 +01:00
Daniel Mitterdorfer
6e15229f6e
Make counted terms agg visible to profiling (#105049)
The counted-terms aggregation is defined in its own plugin. When other
plugins (such as the profiling plugin) want to use this aggregation,
this leads to class loader issues, such as that the aggregation class is
not recognized. By moving just the aggregation code itself to the server
module but keeping everything else (including registration) in the
`mapper-counted-keyword` module, we can use the counted-terms
aggregation also from other plugins.
2024-02-02 15:56:07 +01:00
Andrei Dan
3f7db333c0
Mute testDataStreamLifecycleDownsampleRollingRestart (#105069) 2024-02-02 09:40:30 -05:00
Przemysław Witek
f452cc793e
[Transform] Add code comments for MIN_FREQUENCY and MAX_FREQUENCY constants (#105006) 2024-02-02 15:31:20 +01:00
Mary Gouseti
5879746ce2
Fix teardown in profiling test (#104852) (#105059) 2024-02-02 15:28:22 +02:00