Commit graph

75574 commits

Author SHA1 Message Date
Benjamin Trent
a5838512fb
Minor refactors to knn search & mapper (#104944) 2024-01-31 10:36:23 -05:00
Nhat Nguyen
cdb49076ea
Adjust enrich policy not found error message (#104945)
Thanks Andrei!
2024-01-31 07:35:56 -08:00
Chenhui Wang
fa97f08df1
[Connector API] Add job type filtering support for List connector sync jobs API (#104855) 2024-01-31 23:19:45 +08:00
Nhat Nguyen
c8c16a57b2
Avoid implicit casting in ESQL SearchStats (#104947)
./gradlew precommit fails with JDK21 and we should use longs instead of ints.
2024-01-31 07:14:03 -08:00
Alexander Spies
1ef8beca60
ESQL: Pre-allocate rows in TopNOperator (#104796)
Starting with empty rows and growing them causes lots of allocations and
thus bad performance in case of many large fields being contained in the
rows.
Instead, use the previously encountered row to estimate the size of the next row.
2024-01-31 15:55:04 +01:00
Armin Braun
6458cba23f
Fix unnecessary allocations in ChildMemoryCircuitBreaker (#104972)
This line allocates GB/min for the capturing lambda during hot indexing
according to async-profiler. No need for that.
2024-01-31 09:04:25 -05:00
Yang Wang
1fd2756f8c
MockLogAppender takes string logger names for capturing (#104971)
For classes that are not publicly accessible, we can use the canonical
names for capturing.
2024-01-31 08:45:15 -05:00
Armin Braun
18bd6c4238
Fix Releasables.close performance issues (#104970)
It's less code and it actually inlines (avoiding virtual calls in most
cases) to just do the null check here instead of delegating to IOUtils
and then catching the impossible IOException. Also, no need to use
`Releasables` in 2 spots where try-with-resources works as well and
needs less code.

Noticed this when I saw that we had a lot of strange CPU overhead in
this call in some hot loops like translog writes.
2024-01-31 08:21:59 -05:00
Ignacio Vera
3821745880
Move functions that generate lucene geometries under a utility class (#104928)
We have functions that generate lucene geometries scattered in different places of the code. This commit moves 
everything under a utility class.
2024-01-31 13:59:33 +01:00
Lorenzo Dematté
5013ea4b18
Fix test assertions (#104963) 2024-01-31 13:04:24 +01:00
David Kyle
2cbe23a189
[DOCS] Dense vector element type should be float for OpenAI (#104966) 2024-01-31 11:13:03 +00:00
Alexander Spies
1da5f99f45
ESQL: Correct out-of-range filter pushdowns (#99961)
Fix pushed down filters for binary comparisons that compare a
byte/short/int/long with an out of range value, like
WHERE some_int_field < 1E300.
2024-01-31 12:10:15 +01:00
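
The out-of-range case described in #99961 can be read as constant folding: when an int field is compared with a literal outside the int range, the result is the same for every non-null value. The following is a minimal Python sketch of that idea only, not the ES|QL implementation. `fold_less_than`, `INT_MIN`, and `INT_MAX` are illustrative names, a 32-bit int field is assumed, and null handling is ignored.

```python
from typing import Optional

# 32-bit signed integer bounds (assumed field type for this sketch).
INT_MIN = -(2**31)
INT_MAX = 2**31 - 1


def fold_less_than(literal: float) -> Optional[bool]:
    """Fold `int_field < literal` to a constant when the literal is out of range.

    Returns True or False when the comparison has the same result for every
    non-null int value, and None when it must be evaluated per value.
    """
    if literal > INT_MAX:
        return True   # every int is smaller than the literal
    if literal <= INT_MIN:
        return False  # no int is strictly smaller than the literal
    return None       # in range: keep the comparison as a real filter


print(fold_less_than(1e300))  # True
```

With this folding, `WHERE some_int_field < 1E300` would not need to push a range filter with an unrepresentable bound down to the index.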
Armin Braun
50bafd306c
Save allocating enum values array in two hot spots (#104952)
Our readEnum code instantiates/clones enum value arrays on read.
Normally, this doesn't matter much but the two spots adjusted here are
visibly hot during bulk indexing, causing GBs of allocations during e.g.
the http_logs indexing run.
2024-01-31 11:26:36 +01:00
Jedr Blaszyk
9589496669
[Connector API] Make update configuration action non-additive (#104615) 2024-01-31 11:25:48 +01:00
Luigi Dell'Aquila
1daa324b0a
ES|QL: Improve type validation in aggs for UNSIGNED_LONG and better support for VERSION (#104911) 2024-01-31 10:02:38 +01:00
Lorenzo Dematté
7764fdb3ea
Exclude tests that do not work in a mixed cluster scenario (#104935) 2024-01-31 09:32:21 +01:00
Moritz Mack
dbf59c5414
Update/Cleanup references to old tracing.apm.* legacy settings in favor of the telemetry.* settings (#104917) 2024-01-31 09:20:05 +01:00
Joe Gallo
4c44633056
Update versions to skip after backport to 8.12 (#104953) 2024-01-30 18:51:13 -05:00
Keith Massey
66a930ba46
Adding ActionRequestLazyBuilder implementation of RequestBuilder (#104927)
This introduces a second implementation of RequestBuilder (#104778). As opposed
to ActionRequestBuilder, ActionRequestLazyBuilder does not create its request
until the request() method is called, and does not hold onto that request (so each
call to request() gets a new request instance).
This PR also updates BulkRequestBuilder to inherit from ActionRequestLazyBuilder
as an example of its use.
2024-01-30 16:14:27 -06:00
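
The lazy-builder behavior described in #104927 (no request is created until request() is called, and each call returns a new instance) can be sketched in a few lines. This is a hedged Python illustration of the pattern, not the Elasticsearch Java code; `BulkRequest` and `LazyBulkRequestBuilder` here are hypothetical stand-ins.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BulkRequest:
    """Hypothetical request type used only for this sketch."""
    items: List[str] = field(default_factory=list)


class LazyBulkRequestBuilder:
    """Collects parameters and builds a fresh request on every request() call."""

    def __init__(self) -> None:
        self._items: List[str] = []

    def add(self, item: str) -> "LazyBulkRequestBuilder":
        self._items.append(item)
        return self  # allow chaining

    def request(self) -> BulkRequest:
        # Nothing is cached: each call creates a new request instance
        # from the parameters collected so far.
        return BulkRequest(items=list(self._items))


builder = LazyBulkRequestBuilder().add("doc-1").add("doc-2")
first, second = builder.request(), builder.request()
```

Here `first` and `second` are equal in content but are distinct objects, which is why tests can no longer assume that request() always returns the same object (see #104881).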
Przemysław Witek
a4ddd32ec8
[Transform] Unmute 2 remaining continuous tests: HistogramGroupByIT and TermsGroupByIT (#104898) 2024-01-30 20:14:06 +01:00
Jonathan Buttner
92dd213dd7
[ML] Passing input type through to cohere request (#104781)
* Pushing input type through to cohere request

* switching logic to allow request to always override

* Fixing failure

* Removing getModelId calls

* Addressing feedback

* Switching to enumset
2024-01-30 13:46:49 -05:00
Abdon Pijpelink
980bc500b0
[DOCS] Support for nested functions in ES|QL STATS...BY (#104788)
* Document nested expressions for stats

* More docs

* Apply suggestions from review

- count-distinct.asciidoc
  - Content restructured, moving the section about approximate counts to end of doc.

- count.asciidoc
  - Clarified that omitting the `expression` parameter in `COUNT` is equivalent to `COUNT(*)`, which counts the number of rows.

- percentile.asciidoc
  - Moved the note about `PERCENTILE` being approximate and non-deterministic to end of doc.

- stats.asciidoc
  - Clarified the `STATS` command
  - Added a note indicating that individual `null` values are skipped during aggregation

* Comment out mentioning a buggy behavior

* Update sum with inline function example, update test file

* Fix typo

* Delete line

* Simplify wording

* Fix conflict fix typo

---------

Co-authored-by: Liam Thompson <leemthompo@gmail.com>
Co-authored-by: Liam Thompson <32779855+leemthompo@users.noreply.github.com>
2024-01-30 19:29:12 +01:00
Nhat Nguyen
aea4684b52
Limit concurrent shards per node for ESQL (#104832)
Today, we allow ESQL to execute against an unlimited number of shards 
concurrently on each node. This can lead to cases where we open and hold
too many shards, equivalent to opening too many file descriptors or
using too much memory for FieldInfos in ValuesSourceReaderOperator.

This change limits the number of concurrent shards to 10 per node. This 
number was chosen based on the _search API, which limits it to 5.
Besides the primary reason stated above, this change has other
implications:

We might execute fewer shards for queries with LIMIT only, leading to 
scenarios where we execute only some high-priority shards then stop. 
For now, we don't have a partial reduce at the node level, but if we
introduce one in the future, it might not be as efficient as executing
all shards at the same time.  There are pauses between batches because
batches are executed sequentially one by one.  However, I believe the
performance of queries executing against many shards (after can_match)
is less important than resiliency.

Closes #103666
2024-01-30 09:52:04 -08:00
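
The batching behavior described in #104832 can be sketched as follows: shards on a node are split into batches of at most 10, batches run sequentially, and a LIMIT-only query may stop before later batches run. This is a simplified Python sketch under those assumptions, not the ESQL implementation. `shard_batches` and `run_on_node` are hypothetical names, and shard execution is replaced by a stand-in that yields one row per shard.

```python
from typing import Iterator, List, Optional, Sequence

MAX_CONCURRENT_SHARDS_PER_NODE = 10  # value stated in the commit message


def shard_batches(
    shards: Sequence[str], size: int = MAX_CONCURRENT_SHARDS_PER_NODE
) -> Iterator[List[str]]:
    """Split the shard list into consecutive batches of at most `size` shards."""
    for start in range(0, len(shards), size):
        yield list(shards[start:start + size])


def run_on_node(shards: Sequence[str], limit: Optional[int] = None) -> List[str]:
    """Run batches one by one, stopping once `limit` rows have been collected."""
    rows: List[str] = []
    for batch in shard_batches(shards):
        for shard in batch:
            rows.append(f"row-from-{shard}")  # stand-in for real shard execution
        if limit is not None and len(rows) >= limit:
            return rows[:limit]  # later batches are never opened
    return rows


shards = [f"s{i}" for i in range(25)]
batch_sizes = [len(batch) for batch in shard_batches(shards)]
```

For 25 shards this yields batches of 10, 10, and 5, and `run_on_node(shards, limit=3)` returns after the first batch.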
Jedr Blaszyk
d2d28ecc4f
[Connectors API] Relax strict response parsing for get/list operations (#104909) 2024-01-30 18:35:16 +01:00
Albert Zaharovits
111a69d15f
Support match for the Query API Key API (#104594)
This adds support for the `match` query type to the Query API key Information API.
Note that since string values associated to API Keys are mapped as `keywords`,
a `match` query with no analyzer parameter is effectively equivalent to a `term` query
for such fields (e.g. `name`, `username`, `realm_name`).

Relates: #101691
2024-01-30 19:09:08 +02:00
Mark Vieira
0623eb08a8
Apply publish plugin to es-opensaml-security-api project (#104933) 2024-01-30 09:04:50 -08:00
Albert Zaharovits
2a5dfde853
Add type parameter support, for sorting, to the Query API Key API (#104625)
This adds support for the `type` parameter, for sorting, to the Query API key API.
The type for an API Key can currently be either `rest` or `cross_cluster`.
This was overlooked in #103695 when support for the `type` parameter
was first introduced only for querying.
2024-01-30 19:01:36 +02:00
Moritz Mack
9ea187dd76
Fix enabling / disabling of APM agent "recording" in APMAgentSettings (#104324) 2024-01-30 17:29:21 +01:00
Keith Massey
e74a79fa54
Fixing a broken javadoc comment in ReindexDocumentationIT (#104930)
This fixes a javadoc comment that was broken by #104881
2024-01-30 11:18:25 -05:00
Jonathan Buttner
422e6f6b98
Adding request source for cohere (#104926) 2024-01-30 10:59:21 -05:00
Costin Leau
202a81f212
ESQL: Fix SearchStats#count(String) to count values not rows (#104891)
SearchStats#count incorrectly counts the number of documents (or rows)
in which a field appears instead of the actual number of values.
This PR fixes this by looking at the term frequency instead of the doc
count.

Fix #104795
2024-01-30 07:34:15 -08:00
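
The difference between counting rows and counting values in #104891 only shows up for multi-valued fields. The toy Python sketch below illustrates the two counts on in-memory documents. It does not use Lucene statistics, and `doc_count`, `value_count`, and the sample data are assumptions made for illustration.

```python
from typing import Dict, List

Doc = Dict[str, List[str]]

docs: List[Doc] = [
    {"tags": ["a", "b"]},  # one document, two values
    {"tags": ["c"]},       # one document, one value
    {},                    # field missing
]


def doc_count(documents: List[Doc], field: str) -> int:
    """Number of documents (rows) in which the field has at least one value."""
    return sum(1 for doc in documents if doc.get(field))


def value_count(documents: List[Doc], field: str) -> int:
    """Total number of values of the field across all documents."""
    return sum(len(doc.get(field, [])) for doc in documents)


print(doc_count(docs, "tags"), value_count(docs, "tags"))  # 2 3
```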
Simon Cooper
4567a84c8d
Mute more tests that tend to leak searchhits (#104922) 2024-01-30 15:08:47 +00:00
Moritz Mack
a3b1d86c45
Reuse APMMeterService of APMTelemetryProvider (#104906) 2024-01-30 15:49:56 +01:00
István Zoltán Szabó
79d6c3e70d
[DOCS] Adds get setting and update settings asciidoc files to security API index (#104916)
* [DOCS] Adds get setting and update settings asciidoc files to security API index.

* [DOCS] Fixes references in docs.
2024-01-30 15:39:34 +01:00
Keith Massey
6c9551ae48
Removing the assumption from some tests that the request builder's request() method always returns the same object (#104881) 2024-01-30 08:21:27 -06:00
Benjamin Trent
332dd8c751
indicating fix for 8.12.1 for int8_hnsw (#104912) 2024-01-30 09:06:22 -05:00
Pooya Salehi
dbefb32bd7
Retry get_from_translog during relocations (#104579)
During a promotable relocation, a `get_from_translog` sent by the
unpromotable shard to handle a real-time get might encounter
`ShardNotFoundException` or `IndexNotFoundException`. In these cases,
we should retry.

This is just for `GET`. I'll open a second PR for `mGET`. The relevant
IT is in the Stateless PR.

Relates ES-5727
2024-01-30 08:50:03 -05:00
Tim Rühsen
4014d52696
[Profiling] Simplify cost calculation (#104816)
* [Profiling] Add the number of cores to HostMetadata

* Update AWS pricelist (remove cost_factor, add usd_per_hour)

* Switch cost calculations from 'cost_factor' to 'usd_per_hour'

* Remove superfluous CostEntry.toXContent()

* Check for Number type in CostEntry.fromSource()

* Add comment
2024-01-30 14:35:20 +01:00
Ignacio Vera
7c8bb145f1
Merge Aggregations into InternalAggregations (#104896)
This commit merges Aggregations into InternalAggregations in order to remove the unnecessary hierarchy.
2024-01-30 14:31:21 +01:00
Navarone Feekery
50afaad369
[Connector Secrets] Add delete API endpoint (#104815)
* Add DELETE endpoint for /_connector/_secret/{id}
* Add endpoint to write_connector_secrets cluster privilege
2024-01-30 13:39:15 +01:00
Jim Ferenczi
65498f5487
Get from translog fails with large dense_vector (#104700)
This change fixes the engine to apply the current codec when retrieving documents from the translog.
We need to use the same codec as the main index in order to ensure that all the source data is indexable.
The internal codec treats some fields differently than the default one, for instance dense_vectors are limited to 1024 dimensions.
This PR ensures that these customizations are applied when indexing documents for translog retrieval.

Closes #104639

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-01-30 12:13:09 +00:00
David Turner
73df6d3043
Improve CANNOT_REBALANCE_CAN_ALLOCATE explanation (#104904)
Clarify that in this situation there is a rebalancing move that would
improve the cluster balance, but there's some reason why rebalancing is
not happening. Also points at the `can_rebalance_cluster_decisions` as
well as the node-by-node decisions since the action needed could be
described in either place.
2024-01-30 07:12:30 -05:00
Chris Hegarty
920beee009
Upgrade to Lucene 9.9.2 (#104753)
This commit upgrades to Lucene 9.9.2.
2024-01-30 12:01:26 +00:00
Simon Cooper
1395edf805
Change release version lookup to an instance method (#104902) 2024-01-30 11:02:16 +00:00
Johannes Fredén
666774a865
Add documentation for Query User API (#104255)
* Add documentation for Query User API

Co-authored-by: Nikolaj Volgushev <n1v0lg@users.noreply.github.com>
2024-01-30 11:27:24 +01:00
Yang Wang
5af58ed033
Improve logging for S3RetryingInputStream (#104892)
* Improve logging for S3RetryingInputStream

This PR adds a logging message when opening or read succeeds after
retries. See also
https://github.com/elastic/elasticsearch/pull/103300#issuecomment-1882130933

It also makes the logging messages for retries exponentially
less frequent so that the log size is more scalable when there are many threads
retrying. See also
https://github.com/elastic/elasticsearch/pull/103300#discussion_r1452149913

Relates: #103300
2024-01-30 21:18:11 +11:00
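
One common way to make retry logging exponentially less frequent, as #104892 describes, is to log only when the attempt number is a power of two. The commit message does not state the exact schedule used by S3RetryingInputStream, so the Python sketch below is an assumption, and `should_log_retry` is a hypothetical helper.

```python
def should_log_retry(attempt: int) -> bool:
    """Return True when `attempt` (1-based) is a power of two: 1, 2, 4, 8, ..."""
    return attempt > 0 and (attempt & (attempt - 1)) == 0


# Attempts out of the first 20 that would produce a log line.
logged = [attempt for attempt in range(1, 21) if should_log_retry(attempt)]
print(logged)  # [1, 2, 4, 8, 16]
```

With this schedule, n retries produce roughly log2(n) log lines per thread instead of n.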
Pooya Salehi
5138cfc055
Minor improvements to ClusterStateObserver docs (#104854) 2024-01-30 05:12:58 -05:00
Ignacio Vera
464928b596
Release resources in BestBucketsDeferringCollector earlier (#104893)
BestBucketsDeferringCollector holds the documents and buckets in memory to be replayed to the children 
aggregations. These objects can get large and they are not backed by BigArrays so let's release them as soon as they
are consumed.
2024-01-30 11:11:58 +01:00
Henning Andersen
f2d96442f6
Fix blob cache race, decay, time dependency (#104784)
This commit addresses 3 problems in the blob cache:

* Fix a race during initChunk where the result would be a fallback to direct read.
* Fix a bug in computeDecay that led to only decaying the first item per frequency.
* Remove the time dependency of the cache by moving to a logical clock (epochs)

Trigger decay whenever freq0 is empty, ensuring we decay slowly/rapidly as needed.

Divide time into epochs, switch to new one whenever we need to decay. A region
now promotes 2 freqs per access but only once per epoch

Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
2024-01-30 10:52:00 +01:00
Jan Kuipers
583d74a8e2
Mute tests that regularly leak SearchHits (#104853)
* Mute TransportRankEvalActionTests.testTransferRequestParameters

* Mute RatedRequestsTests.testXContentParsingIsNotLenient

* Only mute on Windows
2024-01-30 10:38:17 +01:00