Commit graph

13082 commits

Author SHA1 Message Date
Ignacio Vera
c3a72e9d8a
Add test to exercise reduction of terms aggregation order by key and fix pruning bug (#106799) (#106814)
We are not computing the otherDocCounts properly because we exit the iteration too early, so the
pruned buckets are not counted. This commit makes sure we count all buckets.
2024-03-27 10:19:47 -04:00
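
The pruning fix described in c3a72e9d8a can be illustrated with a minimal sketch. It is not the Elasticsearch reduction code: `TermsPruningSketch`, `topKeys`, and `otherDocCount` are hypothetical names, and the buckets are assumed to be already merged and sorted by key. The point is only that the counting loop keeps going past the first `size` buckets, so every pruned bucket contributes to the "other" count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical stand-in for the reduce step of a terms aggregation ordered by key.
final class TermsPruningSketch {

    // Returns the keys of the first `size` buckets in key order.
    static List<String> topKeys(SortedMap<String, Long> docCountsByKey, int size) {
        List<String> top = new ArrayList<>();
        for (String key : docCountsByKey.keySet()) {
            if (top.size() == size) {
                break; // exiting early is fine here: only the kept keys are needed
            }
            top.add(key);
        }
        return top;
    }

    // Sums the doc counts of every bucket that is pruned from the top `size`.
    static long otherDocCount(SortedMap<String, Long> docCountsByKey, int size) {
        long other = 0;
        int kept = 0;
        for (Map.Entry<String, Long> bucket : docCountsByKey.entrySet()) {
            if (kept < size) {
                kept++; // this bucket is returned to the caller
            } else {
                other += bucket.getValue(); // keep iterating so pruned buckets are counted
            }
        }
        return other;
    }
}
```

With buckets `a=5, b=3, c=2, d=7` and `size = 2`, the sketch keeps `a` and `b` and attributes the remaining 9 documents to the other count.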
Kostas Krikellas
10d088de78
[8.13] Handle pass-through subfields with deep nesting (#106798) 2024-03-27 05:09:02 -04:00
Jan Kuipers
ad39c3f34f
Add background filters of significant terms aggregations to can match query. (#106564) (#106797)
* Add background filters of significant terms aggregations to can match query.

* Fix NPE

* Unit tests

* Update docs/changelog/106564.yaml

* Update 106564.yaml

* Make aggregation queries in can match phase more generic.

* Copy source to preserve other relevant fields.

* Replace copy constructor by shallowCopy
2024-03-27 04:44:10 -04:00
elasticsearchmachine
d903eb5067 Bump versions after 8.13.0 release 2024-03-26 20:21:16 +00:00
elasticsearchmachine
16ac6d2e6b Bump versions after 7.17.19 release 2024-03-26 20:00:24 +00:00
Michael Peterson
23628b0009
Adjust randomization in ResolveClusterActionResponseTests (#105932) (#106756)
to avoid failures in `testEqualsAndHashcode` tests.

Fixes https://github.com/elastic/elasticsearch/issues/105898
2024-03-26 13:22:29 -04:00
Luca Cavanna
57b0de7318
Fix concurrency bug in AbstractStringScriptFieldAutomatonQuery (#106678)
Back when we introduced queries against runtime fields, Elasticsearch did not support
inter-segment concurrency yet. At the time, it was fine to assume that segments will be
searched sequentially. AbstractStringScriptFieldAutomatonQuery used to have a BytesRefBuilder
instance shared across the segments, which gets re-initialized when each segment starts its work.
This is no longer possible with inter-segment concurrency.

Closes #105911
2024-03-26 17:39:31 +01:00
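
The concurrency problem fixed in 57b0de7318 can be sketched as follows. This is an illustration under stated assumptions, not the actual query code: a `StringBuilder` stands in for Lucene's `BytesRefBuilder`, a list of strings stands in for a segment, and `PerSegmentScratchSketch` is a hypothetical name. The scratch buffer is created inside the per-segment method, so segments searched on different threads never share mutable state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: one scratch buffer per segment instead of one shared buffer.
final class PerSegmentScratchSketch {

    // Counts the values in one segment that start with the given prefix.
    static int countMatches(List<String> segment, String prefix) {
        StringBuilder scratch = new StringBuilder(); // local to this segment, never shared
        int matches = 0;
        for (String value : segment) {
            scratch.setLength(0); // reuse the buffer within this segment only
            scratch.append(value);
            if (scratch.indexOf(prefix) == 0) {
                matches++;
            }
        }
        return matches;
    }

    // Searches all segments concurrently and sums the per-segment counts.
    static int searchConcurrently(List<List<String>> segments, String prefix) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (List<String> segment : segments) {
                Callable<Integer> task = () -> countMatches(segment, prefix);
                pending.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : pending) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Had `scratch` been a field shared by all tasks, two segments could reset and append to it at the same time, which is the kind of interference the commit message describes.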
Benjamin Trent
7de0c3d7c7
Test mute for issue #106647 (#106671) 2024-03-22 09:19:47 -04:00
David Turner
72aa514922
Fix testScheduledFixedDelayRejection (#106630) (#106642)
Not really necessary to allow the scheduled task to race against the
blocks, and this race is a source of test flakiness. Fixed by imposing
the blocks first.

Closes #106618
2024-03-22 04:58:13 -04:00
Lorenzo Dematté
672d14d058
AwaitsFix #101008 (#106646) 2024-03-22 04:46:08 -04:00
David Turner
09df993931 AwaitsFix for #106618 2024-03-21 16:39:49 +00:00
Ignacio Vera
4bf910c6e8
Use LogDocMergePolicy in GeoPointScriptFieldDistanceFeatureQueryTests#testMatches (#106557) (#106562) 2024-03-20 12:24:13 -04:00
David Turner
80634b494f
Release TranslogSnapshot buffer after iteration (#106398) (#106556)
Closes #106390
2024-03-20 11:41:54 -04:00
David Turner
d281df79bd
Integrate threadpool scheduling with AbstractRunnable (#106542) (#106548)
Today `ThreadPool#scheduleWithFixedDelay` does not interact as expected
with `AbstractRunnable`: if the task fails or is rejected then this
isn't passed back to the relevant callback, and the task cannot specify
that it should be force-executed. This commit fixes that.
2024-03-20 10:50:02 -04:00
David Turner
21f1123113
Force execution of SearchService.Reaper (#106544) (#106547)
If the search threadpool fills up then we may reject execution of
`SearchService.Reaper` which means it stops retrying. We must instead
force its execution so that it keeps on going.

With #106542, closes #106543
2024-03-20 10:31:05 -04:00
Ignacio Vera
1ce9825bf0
Fix potential BigArray leak in InternalAggregation#getReducer (#106406) (#106412) 2024-03-18 09:30:24 -04:00
Athena Brown
97d4a86427
Adjust interception of requests for specific shard IDs (#101656) (#106376)
Some index requests target shard IDs specifically, and those shard IDs may not match the indices that the request targets as given by `IndicesRequest#indices()`. Such requests need a different interception strategy so that they are handled correctly in all cases and any malformed messages are caught early to aid troubleshooting.

This PR adds an interface allowing requests to report the shard IDs they target as well as the index names, and adjusts the interception of those requests to handle those shard IDs where they are relevant.
2024-03-14 19:52:33 -04:00
Matteo Piergiovanni
56469e9507
[8.13] Field caps performance pt2 (#105941) (#106341) 2024-03-14 11:39:06 +01:00
Craig Taverner
88ff3c6613
Fix #106126 in 8.13 (#106318) 2024-03-13 20:03:21 +01:00
Ignacio Vera
5e14acae11
Disable parallel collection for terms aggregation with min_doc_count equals to 0 (#106156) (#106159) 2024-03-11 05:39:23 -04:00
Matteo Piergiovanni
4cc9f5cf64
Field-caps field has value lookup use map instead of looping array (#105770) (#106131)
(cherry picked from commit 35b2dbee2a)
2024-03-11 08:40:27 +01:00
Nik Everett
71a6b5e7ca
ESQL: Fix order in block loading tests (#106087) (#106089)
The tests for loading `Block`s from scripted fields could fail randomly
when the `RandomIndexWriter` shuffles the documents. This disables
merging and adds the documents as a block so their order is consistent.

Closes #106044
2024-03-07 16:33:24 -05:00
Jan Kuipers
5d830b08dc
Backport 106020 (#106058)
* Reset job if existing reset fails (#106020)

* Try again to reset a job if waiting for completion of an existing reset task fails.

* Update docs/changelog/106020.yaml

* Update 106020.yaml

* Update docs/changelog/106020.yaml

* Improve code

* Trigger rebuild
2024-03-07 07:32:46 -05:00
Benjamin Trent
727b8c2e9b
Test mute for #106044 (#106046) 2024-03-06 17:13:37 -05:00
Benjamin Trent
f864083230
Fix bug when nested knn pre-filter might match nested docs (#105994) (#106021)
When using a pre-filter with nested kNN vectors, it is treated like a
top-level filter, meaning it is applied over parent document fields.

However, there are times when a query filter is applied that may or may
not match internal nested or non-nested docs. We failed to handle this
case correctly and users would receive an error.

closes: https://github.com/elastic/elasticsearch/issues/105901
2024-03-06 09:45:30 -05:00
Craig Taverner
a1525a81c3
For cartesian values we are even more lenient with extremely large values (#106014) (#106017) 2024-03-06 08:47:22 -05:00
Benjamin Trent
bc57a519b7
Manually backport changes from #105578 (#105913) 2024-03-05 06:33:50 -05:00
Niels Bauman
111b22108d
Make Health API more resilient to multi-version clusters (#105789) (#105903)
First check whether the full cluster supports a specific indicator (feature) before we mark an indicator as "unknown" when (meta) data is missing from the cluster state.
2024-03-04 09:03:13 -05:00
Ryan Ernst
ce323cc495
Add reference docs links when jna fails to load (#105812) (#105818)
closes #105147
2024-02-26 16:58:29 -05:00
elasticsearchmachine
7133bbd188 Bump versions after 8.12.2 release 2024-02-22 17:42:19 +00:00
Moritz Mack
923ad1bc96
Fix EsAbortPolicy to not force execution if executor is already shutting down (#105666) (#105688)
Submitting a task during shutdown is highly unreliable and in almost all cases the task
will be rejected (removed) anyway. Not forcing execution if the executor is already
shutting down leads to more deterministic behavior and fixes
EsExecutorsTests.testFixedBoundedRejectOnShutdown.

(cherry picked from commit 954c428cde)
2024-02-21 07:05:04 -05:00
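
A minimal sketch of the behavior described in 923ad1bc96, written against the plain JDK `RejectedExecutionHandler` interface. It is not the `EsAbortPolicy` implementation: `ForceAwareAbortPolicySketch` and the `Forceable` marker interface are hypothetical, and the blocking `put` on the queue is a simplification chosen for the example. The handler only forces a task into the queue while the executor is still running; once shutdown has begun, every task is rejected.

```java
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;

// Hypothetical rejection handler: force execution only if the executor is not shutting down.
final class ForceAwareAbortPolicySketch implements RejectedExecutionHandler {

    // Hypothetical marker for tasks that ask to be executed even when the queue is full.
    interface Forceable extends Runnable {}

    @Override
    public void rejectedExecution(Runnable task, ThreadPoolExecutor executor) {
        if (task instanceof Forceable && executor.isShutdown() == false) {
            try {
                // Simplification: wait for space in the queue instead of rejecting.
                executor.getQueue().put(task);
                return;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new RejectedExecutionException("interrupted while forcing execution", e);
            }
        }
        // Either the task did not ask to be forced, or the executor is shutting down.
        throw new RejectedExecutionException("rejected execution of " + task);
    }
}
```

Installing the handler on a `ThreadPoolExecutor`, calling `shutdown()`, and then submitting a `Forceable` task results in a `RejectedExecutionException`, which is the deterministic outcome the commit message argues for.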
Felix Barnsteiner
72c7a0e69b
Fix parsing of flattened fields within subobjects: false (#105373) (#105638) 2024-02-20 02:59:09 -05:00
Ignacio Vera
e1a68fa7fd
GlobalOrdCardinalityAggregator should use HyperLogLogPlusPlus instead of HyperLogLogPlusPlusSparse (#105546) (#105602)
Use the generic HyperLogLogPlusPlus in GlobalOrdCardinalityAggregator so that we promote the algorithm to HLL when
we reach the linear counting threshold.
2024-02-19 03:31:25 -05:00
David Turner
127da57578
Fix use-after-free at event-loop shutdown (#105486) (#105575)
We could still be manipulating a network message when the event loop
shuts down, causing us to close the message while it's still in use.
This is at best going to be a little surprising to the caller, and at
worst could be an outright use-after-free bug.

This commit moves the double-check for a leaked promise to happen
strictly after the event loop has fully terminated, so that we can be
sure we've finished using it by this point.

Relates #105306, #97301
2024-02-15 15:24:11 -05:00
Lee Hinman
97fd8a7e39
Always show composed_of field for composable index templates (#105315) (#105572)
* Always show `composed_of` field for composable index templates

Prior to e786cfa706 we inadvertently always added composable index
templates with `composed_of: []` because
e786cfa706 (diff-5081302eb39033199deb1977d544d1cd7867212a92b8d77e0aa0ded361272b11L618-L630)
created a new `ComposableIndexTemplate` from an existing one, and the `.composedOf()` field returned
an empty list if no component templates were provided:

89e714ee5d/server/src/main/java/org/elasticsearch/cluster/metadata/ComposableIndexTemplate.java (L172-L177)

This meant that before 8.12.0 we would always show `composed_of: []` for composable index templates.
This commit recreates this behavior, and always displays the empty list even if no component
templates are used by a composable index template.

Resolves #104627
2024-02-15 14:24:45 -05:00
Simon Cooper
afd2dc61b2
Update min CCS version to that used by 8.12 (#104739) (#105547) 2024-02-15 12:30:06 +00:00
Ignacio Vera
1fca7257b0
Make CoordinateEncoder an abstract class (#105502) 2024-02-14 17:28:52 +01:00
Ievgen Degtiarenko
bb0e8a4d86
additional test logging (#105508)
This change enables the following logging for the test:
* refreshed cluster info to ensure allocator is seeing correct data
* allocator trace logging to check the balance computation is correct
* reconciler debug logging to check if there is anything unexpected during reconciliation
2024-02-14 16:29:29 +01:00
Benjamin Trent
a874f47dd8
Include better output in profiling & toString for automaton based queries (#105468)
We have various automaton based queries that build particular automatons
based on their usage. However, the input text isn't part of the
`toString` output, nor the usage of the current query (wildcard,
prefix, etc.).

This commit adds a couple of simple queries to wrap some of our logic to
make profiling and other output more readable.

Here is an example without this change:

```
#(-(winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@2d13c057} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@28daf002} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@43c3d7f8} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@2f52905} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@31d75074})
```

We have 5 case-insensitive automatons, but we don't know which is which
in the profiling output. All we know is the originating field. 

I don't think we can update `AutomatonQuery` directly as sometimes the
automaton created mutates the term (prefix for example) and we lose that
we are searching for a prefix.
2024-02-14 08:46:08 -05:00
Iraklis Psaroudakis
f871cb2364
Mute testClusterResolveDisconnectedAndErrorScenarios (#105492)
Relates #105489
2024-02-14 05:48:32 -05:00
Dmitry Cherniachenko
ba4d2f5843
Use Arrays.hashCode() for arrays in Objects.hash() calls (#105175)
When an array is passed to Objects.hash(), it needs to be wrapped with Arrays.hashCode() so that the hash is calculated from the array's contents rather than from the array instance's identity hash code.
2024-02-14 09:29:11 +00:00
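
The pattern from ba4d2f5843 in a self-contained form. `ArrayHashSketch` is a hypothetical class written for this example, not one touched by the commit. `Objects.hash` takes its arguments as `Object...`, so an `int[]` passed directly is hashed as a single object by identity; wrapping it in `Arrays.hashCode` hashes the contents instead.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical value class with an array field.
final class ArrayHashSketch {
    private final String name;
    private final int[] values;

    ArrayHashSketch(String name, int[] values) {
        this.name = name;
        this.values = values;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (other instanceof ArrayHashSketch == false) {
            return false;
        }
        ArrayHashSketch that = (ArrayHashSketch) other;
        return Objects.equals(name, that.name) && Arrays.equals(values, that.values);
    }

    @Override
    public int hashCode() {
        // Objects.hash(name, values) would use the array's identity hash code,
        // so two equal instances could end up with different hashes.
        return Objects.hash(name, Arrays.hashCode(values));
    }
}
```

Two instances built from separate but equal arrays are now equal and share a hash code, which keeps the `equals`/`hashCode` contract intact.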
Simon Cooper
a0c21f853f
Map old release version id directly to their version, don't specify a range (#105335) 2024-02-14 09:24:25 +00:00
David Turner
c0e931af06
Detach persistent task execution from ThreadPool (#105460)
Similar to #99392, #97879 etc, no need to have the
`NodePersistentTasksExecutor` look up the executor to use each time, nor
does it necessarily need to use a named executor from the `ThreadPool`.
This commit pulls the lookup earlier in initialization so we can just
use a bare `Executor` instead.
2024-02-14 08:55:05 +00:00
Mary Gouseti
741c6327ca
[Health Monitoring] Stop the periodic health logger when es is stopping (#105272)
We see errors that we believe occur because `es` is already
stopping but the periodic health logger keeps querying the health
API. Since `es` is stopping, we believe it makes sense to also stop the
periodic health logger.

Furthermore, we make the close method more respectful to the execution
of the periodic health logger which will wait for the last run to finish
if it's still in progress.

This PR makes the `HealthPeriodicLogger` lifecycle aware and uses a
semaphore to block the `close()` method.
2024-02-14 03:09:56 -05:00
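
The semaphore-based shutdown described in 741c6327ca can be sketched roughly as below. This is an assumption-laden illustration, not the `HealthPeriodicLogger` code: `PeriodicLoggerLifecycleSketch`, `runOnce`, and the single-permit layout are all hypothetical. Each run holds the permit while it works, and `close()` takes the permit and never returns it, so it waits for an in-flight run and prevents any later one.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical lifecycle-aware periodic task guarded by a single-permit semaphore.
final class PeriodicLoggerLifecycleSketch implements AutoCloseable {
    private final Semaphore inFlight = new Semaphore(1);
    private final AtomicBoolean closed = new AtomicBoolean(false);
    private final Runnable work;

    PeriodicLoggerLifecycleSketch(Runnable work) {
        this.work = work;
    }

    // Runs the work once unless the task is closed or another run is in progress.
    boolean runOnce() {
        if (closed.get() || inFlight.tryAcquire() == false) {
            return false;
        }
        try {
            if (closed.get()) {
                return false; // close() started between the first check and the acquire
            }
            work.run();
            return true;
        } finally {
            inFlight.release();
        }
    }

    // Blocks until any in-flight run has finished, then keeps the permit for good.
    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) {
            inFlight.acquireUninterruptibly();
        }
    }
}
```

After `close()` returns, `runOnce()` always reports `false`, so a scheduler that keeps firing during shutdown no longer reaches the health API.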
Keith Massey
f0ec294382
Limiting the number of nested pipelines that can be executed (#105428)
Limiting the number of nested pipelines that can be executed within a single pipeline to 100
2024-02-13 16:28:31 -06:00
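
A minimal sketch of the limit introduced in f0ec294382. The names `PipelineDepthSketch`, `enterPipeline`, and `MAX_PIPELINES` are hypothetical, and the exact boundary (the 101st pipeline being the first one rejected) is an assumption of this example; only the value 100 comes from the commit message.

```java
// Hypothetical per-document counter of executed pipelines.
final class PipelineDepthSketch {
    static final int MAX_PIPELINES = 100; // limit stated in the commit message

    private int executedPipelines = 0;

    // Records that a pipeline is about to run, or fails if the limit is already reached.
    void enterPipeline(String pipelineId) {
        if (executedPipelines >= MAX_PIPELINES) {
            throw new IllegalStateException(
                "too many nested pipelines, cannot execute [" + pipelineId + "]");
        }
        executedPipelines++;
    }

    int executedCount() {
        return executedPipelines;
    }
}
```

Calling `enterPipeline` 100 times succeeds in this sketch; the next call throws, which bounds how far a chain of pipeline processors can recurse.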
Keith Massey
c884945a93
Adding executedPipelines to the IngestDocument copy constructor (#105427) 2024-02-13 15:11:47 -06:00
David Kyle
ecb01405e7
[ML] Move the model parameter from task settings to service settings (#105458)
When configuring an OpenAI text embedding service the `model_id` should
have always been part of the service settings rather than task settings.
Task settings are overridable, service settings cannot be changed. If
different models are used the configured entities are considered
distinct. 

`task_settings` is now optional as it contains a single optional field
(`user`).

```
PUT _inference/text_embedding/openai_embeddings
{
  "service": "openai",
  "service_settings": {
    "api_key": "XXX",
    "model_id": "text-embedding-ada-002"
  }
}
```

Backwards compatibility with previously configured models is maintained
by moving the `model_id` (or `model`) from task settings to service
settings at the first stage of parsing. New configurations are persisted
with `model_id` in service settings, old configurations with `model_id`
in task settings are not modified and will be tolerated by a lenient
parser.
2024-02-13 13:44:33 -05:00
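
The backwards-compatibility step described in ecb01405e7 can be sketched with plain maps. This is not the inference plugin's parser: `ModelIdMigrationSketch` and `moveModelIdToServiceSettings` are hypothetical names, and letting an existing `model_id` in service settings take precedence is an assumption of this example. Only the key names `model_id`, `model`, and `user` come from the commit message.

```java
import java.util.Map;

// Hypothetical first parsing stage: move the model name out of task settings.
final class ModelIdMigrationSketch {

    static void moveModelIdToServiceSettings(
        Map<String, Object> serviceSettings,
        Map<String, Object> taskSettings
    ) {
        // Remove both spellings so neither is left behind in task settings.
        Object modelId = taskSettings.remove("model_id");
        Object legacyModel = taskSettings.remove("model");
        if (modelId == null) {
            modelId = legacyModel;
        }
        if (modelId != null) {
            // Assumption: a value already present in service settings wins.
            serviceSettings.putIfAbsent("model_id", modelId);
        }
    }
}
```

An old configuration with `"model": "text-embedding-ada-002"` in task settings ends up with `model_id` in service settings, while unrelated task settings such as `user` are left untouched.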
Max Hniebergall
89bf949555
[ML] Fix for inference modelId trained model deployment id collision (#105303)
* Fix for inference modelId trained model deployment id collision

* Add check for model already downloaded before put trained model
2024-02-13 12:31:07 -05:00
David Kyle
88f82b5c93
[ML] Return chunks for each input to InferenceService::chunkInfer (#105447) 2024-02-13 16:40:33 +00:00
Jack Conradson
b5828fbb67
Add plumbing to check cluster features in SearchSourceBuilder (#105417)
This change adds additional plumbing to pipe through the available cluster features into
SearchSourceBuilder. A number of different APIs use SearchSourceBuilder, so they had to make this
available through their parsers as well, often through ParserContext. This change is largely mechanical,
passing a Predicate into existing REST actions to check for feature availability.

Note that this change was pulled mostly from this PR (#105040).
2024-02-13 08:30:04 -08:00