Commit graph

13072 commits

Author SHA1 Message Date
David Turner
09df993931 AwaitsFix for #106618 2024-03-21 16:39:49 +00:00
Ignacio Vera
4bf910c6e8
Use LogDocMergePolicy in GeoPointScriptFieldDistanceFeatureQueryTests#testMatches (#106557) (#106562) 2024-03-20 12:24:13 -04:00
David Turner
80634b494f
Release TranslogSnapshot buffer after iteration (#106398) (#106556)
Closes #106390
2024-03-20 11:41:54 -04:00
David Turner
d281df79bd
Integrate threadpool scheduling with AbstractRunnable (#106542) (#106548)
Today `ThreadPool#scheduleWithFixedDelay` does not interact as expected
with `AbstractRunnable`: if the task fails or is rejected then this
isn't passed back to the relevant callback, and the task cannot specify
that it should be force-executed. This commit fixes that.
2024-03-20 10:50:02 -04:00
David Turner
21f1123113
Force execution of SearchService.Reaper (#106544) (#106547)
If the search threadpool fills up then we may reject execution of
`SearchService.Reaper` which means it stops retrying. We must instead
force its execution so that it keeps on going.

With #106542, closes #106543
2024-03-20 10:31:05 -04:00
Ignacio Vera
1ce9825bf0
Fix potential BigArray leak in InternalAggregation#getReducer (#106406) (#106412) 2024-03-18 09:30:24 -04:00
Athena Brown
97d4a86427
Adjust interception of requests for specific shard IDs (#101656) (#106376)
Some index requests target shard IDs specifically, which may not match the indices that the request targets as given by `IndicesRequest#indices()`, which requires a different interception strategy in order to make sure those requests are handled correctly in all cases and that any malformed messages are caught early to aid in troubleshooting.

This PR adds an interface allowing requests to report the shard IDs they target as well as the index names, and adjusts the interception of those requests as appropriate to handle those shard IDs in the cases where they are relevant.
2024-03-14 19:52:33 -04:00
Matteo Piergiovanni
56469e9507
[8.13] Field caps performance pt2 (#105941) (#106341) 2024-03-14 11:39:06 +01:00
Craig Taverner
88ff3c6613
Fix #106126 in 8.13 (#106318) 2024-03-13 20:03:21 +01:00
Ignacio Vera
5e14acae11
Disable parallel collection for terms aggregation with min_doc_count equals to 0 (#106156) (#106159) 2024-03-11 05:39:23 -04:00
Matteo Piergiovanni
4cc9f5cf64
Field-caps field has value lookup use map instead of looping array (#105770) (#106131)
(cherry picked from commit 35b2dbee2a)
2024-03-11 08:40:27 +01:00
Nik Everett
71a6b5e7ca
ESQL: Fix order in block loading tests (#106087) (#106089)
The tests for loading `Block`s from scripted fields could fail randomly
when the `RandomIndexWriter` shuffles the documents. This disables
merging and adds the documents as a block so their order is consistent.

Closes #106044
2024-03-07 16:33:24 -05:00
Jan Kuipers
5d830b08dc
Backport 106020 (#106058)
* Reset job if existing reset fails (#106020)

* Try again to reset a job if waiting for completion of an existing reset task fails.

* Update docs/changelog/106020.yaml

* Update 106020.yaml

* Update docs/changelog/106020.yaml

* Improve code

* Trigger rebuild
2024-03-07 07:32:46 -05:00
Benjamin Trent
727b8c2e9b
Test mute for #106044 (#106046) 2024-03-06 17:13:37 -05:00
Benjamin Trent
f864083230
Fix bug when nested knn pre-filter might match nested docs (#105994) (#106021)
When using a pre-filter with nested kNN vectors, it's treated like a
top-level filter, meaning it is applied over parent document fields.

However, there are times when a query filter is applied that may or may
not match internal nested or non-nested docs. We failed to handle this
case correctly and users would receive an error.

closes: https://github.com/elastic/elasticsearch/issues/105901
2024-03-06 09:45:30 -05:00
Craig Taverner
a1525a81c3
For cartesian values we are even more lenient with extremely large values (#106014) (#106017) 2024-03-06 08:47:22 -05:00
Benjamin Trent
bc57a519b7
Manually backport changes from #105578 (#105913) 2024-03-05 06:33:50 -05:00
Niels Bauman
111b22108d
Make Health API more resilient to multi-version clusters (#105789) (#105903)
First check whether the full cluster supports a specific indicator (feature) before we mark an indicator as "unknown" when (meta) data is missing from the cluster state.
2024-03-04 09:03:13 -05:00
Ryan Ernst
ce323cc495
Add reference docs links when jna fails to load (#105812) (#105818)
closes #105147
2024-02-26 16:58:29 -05:00
elasticsearchmachine
7133bbd188 Bump versions after 8.12.2 release 2024-02-22 17:42:19 +00:00
Moritz Mack
923ad1bc96
Fix EsAbortPolicy to not force execution if executor is already shutting down (#105666) (#105688)
Submitting a task during shutdown is highly unreliable and in almost all cases the task
will be rejected (removed) anyways. Not forcing execution if the executor is already
shutting down leads to more deterministic behavior and fixes
EsExecutorsTests.testFixedBoundedRejectOnShutdown.

(cherry picked from commit 954c428cde)
2024-02-21 07:05:04 -05:00
Felix Barnsteiner
72c7a0e69b
Fix parsing of flattened fields within subobjects: false (#105373) (#105638) 2024-02-20 02:59:09 -05:00
Ignacio Vera
e1a68fa7fd
GlobalOrdCardinalityAggregator should use HyperLogLogPlusPlus instead of HyperLogLogPlusPlusSparse (#105546) (#105602)
Use the generic HyperLogLogPlusPlus on GlobalOrdCardinalityAggregator so we promote the algorithm to HLL when
we reach the linear counting threshold.
2024-02-19 03:31:25 -05:00
David Turner
127da57578
Fix use-after-free at event-loop shutdown (#105486) (#105575)
We could still be manipulating a network message when the event loop
shuts down, causing us to close the message while it's still in use.
This is at best going to be a little surprising to the caller, and at
worst could be an outright use-after-free bug.

This commit moves the double-check for a leaked promise to happen
strictly after the event loop has fully terminated, so that we can be
sure we've finished using it by this point.

Relates #105306, #97301
2024-02-15 15:24:11 -05:00
Lee Hinman
97fd8a7e39
Always show composed_of field for composable index templates (#105315) (#105572)
* Always show `composed_of` field for composable index templates

Prior to e786cfa706 we inadvertently always added composable index
templates with `composed_of: []` because
e786cfa706 (diff-5081302eb39033199deb1977d544d1cd7867212a92b8d77e0aa0ded361272b11L618-L630)
created a new `ComposableIndexTemplate` from an existing one, and the `.composedOf()` field returned
an empty list if no component templates were provided:

89e714ee5d/server/src/main/java/org/elasticsearch/cluster/metadata/ComposableIndexTemplate.java (L172-L177)

This meant that before 8.12.0 we would always show `composed_of: []` for composable index templates.
This commit recreates this behavior, and always displays the empty list even if no component
templates are used by a composable index template.

Resolves #104627
2024-02-15 14:24:45 -05:00
Simon Cooper
afd2dc61b2
Update min CCS version to that used by 8.12 (#104739) (#105547) 2024-02-15 12:30:06 +00:00
Ignacio Vera
1fca7257b0
Make CoordinateEncoder an abstract class (#105502) 2024-02-14 17:28:52 +01:00
Ievgen Degtiarenko
bb0e8a4d86
additional test logging (#105508)
This change enables the following logging for the test:
* refreshed cluster info to ensure allocator is seeing correct data
* allocator trace logging to check the balance computation is correct
* reconciler debug logging to check if there is anything unexpected during reconciliation
2024-02-14 16:29:29 +01:00
Benjamin Trent
a874f47dd8
Include better output in profiling & toString for automaton based queries (#105468)
We have various automaton based queries that build particular automatons
based on their usage. However, the input text isn't part of the
`toString` output, nor is the usage of the current query (wildcard,
prefix, etc.).

This commit adds a couple of simple queries to wrap some of our logic to
make profiling and other output more readable.

Here is an example without this change:

```
#(-(winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@2d13c057} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@28daf002} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@43c3d7f8} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@2f52905} winlog.event_data.TargetUserName:AutomatonQuery {
org.apache.lucene.util.automaton.Automaton@31d75074})
```

We have 5 case-insensitive automatons, but we don't know which is which
in the profiling output. All we know is the originating field. 

I don't think we can update `AutomatonQuery` directly as sometimes the
automaton created mutates the term (prefix for example) and we lose that
we are searching for a prefix.
2024-02-14 08:46:08 -05:00
Iraklis Psaroudakis
f871cb2364
Mute testClusterResolveDisconnectedAndErrorScenarios (#105492)
Relates #105489
2024-02-14 05:48:32 -05:00
Dmitry Cherniachenko
ba4d2f5843
Use Arrays.hashCode() for arrays in Objects.hash() calls (#105175)
When an array is passed to Objects.hash(), it needs to be wrapped with Arrays.hashCode() so that the hash is calculated from the array content rather than from the array instance's "identity hash code".
2024-02-14 09:29:11 +00:00
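The pitfall described in #105175 can be shown with a minimal sketch. The `Tagged` class and its fields are hypothetical and are not taken from the Elasticsearch code base; only `Objects.hash` and `Arrays.hashCode` are real JDK methods.

```java
import java.util.Arrays;
import java.util.Objects;

final class ArrayHashExample {
    // Hypothetical value class, used only to illustrate the pattern.
    static final class Tagged {
        final String name;
        final int[] values;

        Tagged(String name, int[] values) {
            this.name = name;
            this.values = values;
        }

        // Problematic: the int[] is passed as a single Object, so its
        // identity hash code is used and equal contents may hash differently.
        int identityBasedHash() {
            return Objects.hash(name, values);
        }

        // Content-based: Arrays.hashCode() hashes the elements of the array.
        int contentBasedHash() {
            return Objects.hash(name, Arrays.hashCode(values));
        }
    }
}
```

With the second form, two `Tagged` instances that hold equal array contents produce the same hash, which is what an `equals()` based on `Arrays.equals()` requires.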
Simon Cooper
a0c21f853f
Map old release version id directly to their version, don't specify a range (#105335) 2024-02-14 09:24:25 +00:00
David Turner
c0e931af06
Detach persistent task execution from ThreadPool (#105460)
Similar to #99392, #97879 etc, no need to have the
`NodePersistentTasksExecutor` look up the executor to use each time, nor
does it necessarily need to use a named executor from the `ThreadPool`.
This commit pulls the lookup earlier in initialization so we can just
use a bare `Executor` instead.
2024-02-14 08:55:05 +00:00
Mary Gouseti
741c6327ca
[Health Monitoring] Stop the periodic health logger when es is stopping (#105272)
We see errors that we believe occur because `es` is already
stopping while the periodic health logger keeps querying the health
API. Since `es` is stopping, we believe it makes sense to also stop the
periodic health logger.

Furthermore, we make the close method more respectful to the execution
of the periodic health logger which will wait for the last run to finish
if it's still in progress.

This PR makes the `HealthPeriodicLogger` lifecycle aware and uses a
semaphore to block the `close()` method.
2024-02-14 03:09:56 -05:00
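The semaphore idea in #105272 might look roughly like the sketch below. This is not the actual `HealthPeriodicLogger` code: the class name `PeriodicLoggerSketch` and its methods are assumptions made for illustration, and only the use of a semaphore to make `close()` wait for an in-progress run comes from the commit message.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a periodic task that must stop cleanly.
final class PeriodicLoggerSketch implements AutoCloseable {
    private final Semaphore running = new Semaphore(1);
    private final AtomicBoolean closed = new AtomicBoolean(false);
    final AtomicInteger runs = new AtomicInteger();

    /** Runs the task unless the logger is closed or a run is already in progress. */
    boolean tryRun(Runnable task) {
        if (closed.get() || running.tryAcquire() == false) {
            return false;
        }
        try {
            if (closed.get()) {
                return false; // close() started between the two checks
            }
            task.run();
            runs.incrementAndGet();
            return true;
        } finally {
            running.release();
        }
    }

    /** Blocks until any in-progress run finishes; the permit is never returned. */
    @Override
    public void close() {
        if (closed.compareAndSet(false, true)) {
            running.acquireUninterruptibly();
        }
    }
}
```

Because `close()` keeps the single permit, any run scheduled after shutdown finds no permit available and is skipped rather than querying an API that is going away.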
Keith Massey
f0ec294382
Limiting the number of nested pipelines that can be executed (#105428)
Limiting the number of nested pipelines that can be executed within a single pipeline to 100
2024-02-13 16:28:31 -06:00
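A limit like the one in #105428 can be sketched as a guard on a stack of active pipelines. The class and method names below are hypothetical, and treating the limit as a nesting depth is an assumption; only the value 100 comes from the commit message.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical tracker for pipelines that are currently executing.
final class NestedPipelineLimitSketch {
    static final int MAX_PIPELINES = 100; // limit stated in the commit message

    private final Deque<String> active = new ArrayDeque<>();

    /** Records that a nested pipeline is starting, or fails if the limit is reached. */
    void enter(String pipelineId) {
        if (active.size() >= MAX_PIPELINES) {
            throw new IllegalStateException(
                "Too many nested pipelines: cannot run more than " + MAX_PIPELINES + " within a single pipeline"
            );
        }
        active.push(pipelineId);
    }

    /** Records that the most recently entered pipeline has finished. */
    void exit() {
        active.pop();
    }

    int depth() {
        return active.size();
    }
}
```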
Keith Massey
c884945a93
Adding executedPipelines to the IngestDocument copy constructor (#105427) 2024-02-13 15:11:47 -06:00
David Kyle
ecb01405e7
[ML] Move the model parameter from task settings to service settings (#105458)
When configuring an OpenAI text embedding service the `model_id` should
have always been part of the service settings rather than task settings.
Task settings are overridable, service settings cannot be changed. If
different models are used the configured entities are considered
distinct. 

`task_settings` is now optional, as it contains a single optional field
(`user`).

```
PUT _inference/text_embedding/openai_embeddings
{
  "service": "openai",
  "service_settings": {
    "api_key": "XXX",
    "model_id": "text-embedding-ada-002"
  }
}
```

Backwards compatibility with previously configured models is maintained
by moving the `model_id` (or `model`) from task settings to service
settings at the first stage of parsing. New configurations are persisted
with `model_id` in service settings, old configurations with `model_id`
in task settings are not modified and will be tolerated by a lenient
parser.
2024-02-13 13:44:33 -05:00
Max Hniebergall
89bf949555
[ML] Fix for inference modelId trained model deployment id collision (#105303)
* Fix for inference modelId trained model deployment id collision

* Add check for model already downloaded before put trained model
2024-02-13 12:31:07 -05:00
David Kyle
88f82b5c93
[ML] Return chunks for each input to InferenceService::chunkInfer (#105447) 2024-02-13 16:40:33 +00:00
Jack Conradson
b5828fbb67
Add plumbing to check cluster features in SearchSourceBuilder (#105417)
This change adds additional plumbing to pipe through the available cluster features into
SearchSourceBuilder. A number of different APIs use SearchSourceBuilder, so they had to make this
available through their parsers as well, often through ParserContext. This change is largely mechanical,
passing a Predicate into existing REST actions to check for feature availability.

Note that this change was pulled mostly from this PR (#105040).
2024-02-13 08:30:04 -08:00
Andrei Dan
87c23c734c
[DSL] Avoid reading the PREFER_ILM setting until needed (#105446) 2024-02-13 15:14:29 +00:00
Matteo Piergiovanni
0951b32c36
Test fix: handling engine closed when refreshing FieldInfos (#105374) 2024-02-13 10:22:38 +01:00
Keith Massey
e2b2232569
Improving the performance of the ingest simulate verbose API (#105265)
This updates the simulate verbose API to run in O(N) (for number of pipelines)
time and memory like the simulate and ingest APIs rather than O(N^2).
2024-02-12 16:04:21 -06:00
Mark Tozzi
03eadaa7dc
Clarify some javadoc (#105422)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-02-12 16:37:43 -05:00
David Turner
e2ff447b60
Improve RareClusterStateIT (#105390)
Today this test suite relies on being able to cancel an in-flight
publication after it's reached a committed state. This is questionable,
and also a little flaky in the presence of the desired balance allocator
which may introduce a short delay before enqueuing the cluster state
update that performs the reconciliation step.

This commit removes the questionable meddling with the internals of
`Coordinator` and instead just blocks the cluster state updates at the
transport layer to achieve the same effect.

Closes #102947
2024-02-12 15:43:29 -05:00
Dmitry Cherniachenko
a50e58d99a
Use single-char variant of String.indexOf() where possible (#105205)
* Use single-char variant of String.indexOf() where possible

indexOf(char) is more efficient than searching for the same one-character String.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-02-12 14:14:32 -05:00
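As a small illustration of the change in #105205, the sketch below uses the `char` overload of `String.indexOf`. The helper `keyOf` is hypothetical and not taken from the Elasticsearch code base.

```java
final class IndexOfExample {
    /** Returns the part of a "key=value" string before the first '=', or the whole string. */
    static String keyOf(String setting) {
        // indexOf('=') searches for a single char; indexOf("=") would return
        // the same position but goes through the general substring search.
        int idx = setting.indexOf('=');
        return idx < 0 ? setting : setting.substring(0, idx);
    }
}
```

Both overloads return the same index for a one-character needle, so the swap is behavior-preserving.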
Dmitry Cherniachenko
df4792e7ca
Remove a copy of IndexNameGenerator.generateValidIndexName() (#105132)
GenerateUniqueIndexNameStep contained the exact copies of the generateValidIndexName() and generateValidIndexSuffix() methods from the IndexNameGenerator utility class.

I removed the duplicates and changed the code to use the utility method instead.
Also added javadoc and switched to a pre-compiled Pattern.

The test was also broken as it checked the suffix to consist of only illegal characters.
Replacing matches() with find() makes it check for presence of at least one illegal character.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-02-12 14:11:40 -05:00
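The difference between `matches()` and `find()` mentioned in #105132 can be seen in a short sketch. The pattern below is a hypothetical, much-reduced set of illegal characters chosen for illustration; it is not the pattern used by `IndexNameGenerator`.

```java
import java.util.regex.Pattern;

final class MatchesVsFindExample {
    // Hypothetical pre-compiled pattern of illegal characters.
    static final Pattern ILLEGAL = Pattern.compile("[*?<>|]+");

    /** True only if the whole input is made of illegal characters. */
    static boolean consistsOnlyOfIllegal(String s) {
        return ILLEGAL.matcher(s).matches();
    }

    /** True if at least one illegal character occurs anywhere in the input. */
    static boolean containsIllegal(String s) {
        return ILLEGAL.matcher(s).find();
    }
}
```

For an input such as `abc*def`, `matches()` returns false while `find()` returns true, which is why a test meant to detect any illegal character needs `find()`.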
Keith Massey
5ccadaee7b
Adding a custom exception for problems with the graph of pipelines to be applied to a document (#105196)
This PR removes the need to parse the exception message to detect if a cycle has been detected
in the ingest pipelines to be run on a document.
2024-02-12 13:11:00 -06:00
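The approach in #105196 replaces message parsing with a dedicated exception type. A minimal sketch follows; `PipelineCycleException`, `checkNoCycle`, and `isCycle` are hypothetical names and do not claim to match the classes added by the PR.

```java
import java.util.List;

final class PipelineCycleSketch {
    // Hypothetical exception type dedicated to pipeline graph problems.
    static final class PipelineCycleException extends IllegalStateException {
        PipelineCycleException(String message) {
            super(message);
        }
    }

    /** Throws if the next pipeline has already been executed for this document. */
    static void checkNoCycle(List<String> executed, String next) {
        if (executed.contains(next)) {
            throw new PipelineCycleException("Cycle detected for pipeline: " + next);
        }
    }

    /** Callers inspect the exception type instead of parsing its message. */
    static boolean isCycle(Throwable t) {
        return t instanceof PipelineCycleException;
    }
}
```

Checking the type keeps callers working even if the wording of the message changes later.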
Martijn van Groningen
6a6fba689a
Change PerFieldMapperCodec to use tsdb doc values codec for all fields. (#105301)
The index needs to be in tsdb mode. All fields will use the tsdb codec, except fields that start with a _ (not excluding _tsid).

Before this change, we relied on MapperService to check whether a field needed to use the tsdb doc values codec, but we missed many field types (ip field type, scaled float field type, unsigned long field type, etc.). Instead, we wanted to depend on the doc values type in FieldInfo, but that information is not available in PerFieldMapperCodec.

Borrowed the binary doc values implementation from Lucene90DocValuesFormat. This allows it to be used for any doc values field.

Followup on #99747
2024-02-12 19:53:47 +01:00
Dmitry Cherniachenko
e21a4874ab
Use String.replace() instead of replaceAll() for non-regexp replacements (#105127)
* Use String.replace() instead of replaceAll() for non-regexp replacements

When arguments do not make use of regexp features replace() is a more efficient option, especially the char-variant.
2024-02-12 13:11:15 -05:00
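A short sketch of the distinction behind #105127: `String.replace()` treats its arguments literally, while `replaceAll()` compiles the first argument as a regular expression. The helper names are hypothetical.

```java
final class ReplaceExample {
    /** Literal replacement of every "." with "_"; no regex is compiled. */
    static String literal(String s) {
        return s.replace(".", "_");
    }

    /** Char variant of the same literal replacement. */
    static String chars(String s) {
        return s.replace('.', '_');
    }

    /** Regex replacement: "." matches any character, so every character is replaced. */
    static String regex(String s) {
        return s.replaceAll(".", "_");
    }
}
```

For the input `a.b`, `literal` and `chars` both give `a_b`, whereas `regex` gives `___`, so the literal variants are also the safer choice when no regex features are intended.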