77572 commits

Author SHA1 Message Date
Niels Bauman
e44ee4f5e0
Extend the Health API basic YAML tests (#108811)
The health node might not have received the health info from all
nodes yet before the execution of this test, resulting in an
"unknown" status. We make the status assertion more lenient to
allow for this uncertainty. Additionally, we add some more
assertions for the basic response structure of the other indicators.
2024-05-21 08:12:44 +02:00
elasticsearchmachine
e051e7ce9d [Automated] Update Lucene snapshot to 9.11.0-snapshot-26fca9e30c5 2024-05-21 06:10:20 +00:00
Ignacio Vera
090cee5a6f
Make AggregateMapper a singleton (#108828)
This class is immutable and created from a static list.
2024-05-21 08:06:00 +02:00
Keith Massey
93fdfe5ae4
Adding isSimulated methods to be used in simulate mapping validation work (#108791) 2024-05-20 16:12:31 -05:00
Nhat Nguyen
d305f64637
Close listener refs in finally block (#108830)
We will need to acquire a new listener in the catch block; therefore, we 
should not release the listenerRefs in the try-with-resources block,
which is executed before the catch block.

Relates #108580
2024-05-20 11:56:19 -07:00
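The ordering issue behind d305f64637 can be reproduced in isolation: with try-with-resources, Java closes the resource before any catch block runs, so the catch block can no longer use it. The following is a minimal sketch, not the Elasticsearch code. `RecordingRefs` and `CloseOrderDemo` are hypothetical names, and the resource only records when it was closed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the listener refs: it only records when it is closed.
final class RecordingRefs implements AutoCloseable {
    private final List<String> events;

    RecordingRefs(List<String> events) {
        this.events = events;
    }

    @Override
    public void close() {
        events.add("close");
    }
}

final class CloseOrderDemo {
    // try-with-resources: the resource is closed BEFORE the catch block runs.
    static List<String> tryWithResources() {
        List<String> events = new ArrayList<>();
        try (RecordingRefs refs = new RecordingRefs(events)) {
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            events.add("catch");
        }
        return events;
    }

    // Explicit finally: the catch block runs while the resource is still open.
    static List<String> explicitFinally() {
        List<String> events = new ArrayList<>();
        RecordingRefs refs = new RecordingRefs(events);
        try {
            throw new IllegalStateException("boom");
        } catch (IllegalStateException e) {
            events.add("catch");
        } finally {
            refs.close();
        }
        return events;
    }
}
```

`CloseOrderDemo.tryWithResources()` records `close` then `catch`, while `CloseOrderDemo.explicitFinally()` records `catch` then `close`, which is the order the commit message asks for.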
Nhat Nguyen
05a20467c1
Harden field-caps request dispatcher (#108736)
ExceptionHelper#useAndSuppress can throw exceptions if both input
exceptions have the same root cause. If this happens, the field-caps
request dispatcher might fail to notify the caller of completion. I
found this while running ES|QL with disruptions.

Relates #107347
2024-05-20 10:58:00 -07:00
Ignacio Vera
60777cf4ab
Allow LuceneSourceOperator to early terminate (#108820)
We can make the LuceneSourceOperator collector throw a CollectionTerminatedException whenever it has collected enough documents.
2024-05-20 19:53:06 +02:00
Nik Everett
4e1b8bf0f1
ESQL: Speed topn back up (#108650)
We noticed a regression in performance for topn last week. It turns out
that we had turned off support for skipping non-competitive docs. We
shouldn't do that!

Closes #108565
2024-05-20 13:49:47 -04:00
Armin Braun
6b508cb034
Reduce footprint of DynamicTemplate (#108617)
A lot of these lists are empty most of the time, so we can save memory
here by moving to immutable lists. Found in a heap dump where this saves
about 10M of heap.
2024-05-20 18:13:05 +02:00
Ryan Ernst
9e6fe11d19
Update ASM to 9.7 for plugin scanner (#108822)
This commit updates the ASM library in order to support class files
written with Java 23.

closes #108776
2024-05-20 11:56:23 -04:00
Parker Timmins
3662d12c9f
Return ingest byte stats even when 0-valued (#108796)
Change the ingest byte stats to always be returned
whether or not they have a value of 0. Add human readable
form of byte stats. Update docs to reflect changes.
2024-05-20 10:52:16 -05:00
Ryan Ernst
bc499e7c83
Move rlimit calls into NativeAccess (#108805)
This commit moves getting max threads, max virtual memory size, and max
file size into NativeAccess.

relates https://github.com/elastic/elasticsearch/pull/104876
2024-05-20 11:09:50 -04:00
Carlos Delgado
0c35e13868
Avoid using dynamic templates for semantic text fields (#108771) 2024-05-20 17:03:07 +02:00
Parker Timmins
c5a3342449
Test pipeline run after reroute (#108693)
Add test confirming that pipelines are run after a reroute.
Fix test of two stage reroute. Delete pipelines during teardown
so as to not break other tests using the same pipeline name.

Co-authored-by: Joe Gallo <joegallo@gmail.com>
2024-05-20 10:02:04 -05:00
Mike Pellegrini
8bc1a47588
Semantic Query (#108483)
Add the semantic query to the Query DSL, which is used to query semantic_text fields

---------

Co-authored-by: carlosdelest <carlos.delgado@elastic.co>
Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
2024-05-20 09:36:22 -04:00
Ryan Ernst
29d6023de8
Rename MockLogAppender to MockLog (#108803)
Now that mock logging has a single internal appender, the "appender"
suffix for MockLogAppender doesn't make sense. This commit renames the
class to MockLog. It was completely mechanical, done with IntelliJ
renames.
2024-05-20 06:22:49 -07:00
elasticsearchmachine
06461d30ea Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-20 10:01:54 +00:00
Armin Braun
bac320829c
Simplify FetchSearchPhase and its tests a little (#108806)
We can dry up the tests a little, remove a branch that is never taken
(equality of response object and `Integer` is always false there)
and remove redundant arguments in the production code to simplify this
code a little.
2024-05-20 11:16:47 +02:00
Ignacio Vera
ba0073ad88
Enable inter-segment concurrency for low cardinality numeric terms aggs (#108306)
This commit enables inter-segment search concurrency for numeric terms aggs over long, integer and short field types.
It estimates the cardinality by computing the min and max value of the shard using the BKD tree. When the estimated
cardinality of the field being aggregated on is lower than the shard size, inter-segment concurrency is enabled.
2024-05-20 10:38:27 +02:00
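The heuristic in ba0073ad88 can be illustrated with a small sketch. It assumes the estimate is the width of the `[min, max]` range, which is an upper bound on the number of distinct integral values. The commit message does not give the exact formula, so both `TermsConcurrencyHeuristic` and its method names are hypothetical, and the min and max are passed in directly rather than read from a BKD tree.

```java
final class TermsConcurrencyHeuristic {
    // Upper bound on distinct integral values in [min, max] (assumed formula).
    // Saturates at Long.MAX_VALUE when the range is too wide for a long.
    static long estimateCardinality(long min, long max) {
        if (min > max) {
            return 0; // no values at all
        }
        long range = max - min; // wraps to a negative number on overflow
        if (range < 0 || range == Long.MAX_VALUE) {
            return Long.MAX_VALUE;
        }
        return range + 1;
    }

    // Concurrency is allowed only when the estimate is strictly below shardSize.
    static boolean allowInterSegmentConcurrency(long min, long max, int shardSize) {
        return estimateCardinality(min, max) < shardSize;
    }
}
```

With values between 1 and 5 and a shard size of 10, the estimate is 5 and concurrency is allowed. With values between 0 and 1000 and the same shard size, it is not.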
elasticsearchmachine
4efa7fd3b6 [Automated] Update Lucene snapshot to 9.11.0-snapshot-5e48fdd0d45 2024-05-20 06:10:50 +00:00
Nhat Nguyen
c5a1bcf9de
Mute metrics command in CCS (#108816)
Tracked at #108815
2024-05-19 22:55:57 -07:00
Max Hniebergall
a2008bd190
[ML] Add option to disable inference process cache by default (#108784)
* Add option to disable inference process cache by default

* Add test

* improve tests

* Update docs and improve code

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-05-19 11:11:02 -04:00
Chris Hegarty
a7e4423834
Fix multithreading copies in lib vec (#108802)
This commit fixes a potential multithreading issue with the lib vec
vector scorer. 

Since the implementation falls back to a lucene scorer which needs to
read from the index input, then we need to make a copy of the index
input. Otherwise, there is a potential for the stateful index input to
be accessed across threads - which would be bad.

The fallback is only used when one or other vector crosses a segment
boundary, which is 16G by default. So the likelihood of this occurring
in practice is small, but the effect is bad.

The fix is deliberately small and targeted, so that it can be
backported. After this change, I'm going to drop the custom VectorScorer
and adapter type, in favour of using the Lucene type directly. These
custom types were initially used when the code lived inside the native
module, where we didn't want to add a dependency on Lucene directly.
2024-05-19 10:23:24 -04:00
Chris Hegarty
c59322e5c6 AwaitsFix: https://github.com/elastic/elasticsearch/issues/108809 2024-05-19 12:13:02 +01:00
Chris Hegarty
9b0ac34a87 AwaitsFix: https://github.com/elastic/elasticsearch/issues/108808 2024-05-19 12:09:59 +01:00
elasticsearchmachine
ef8b57fba0 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-19 10:01:36 +00:00
elasticsearchmachine
ab46fc40d0 [Automated] Update Lucene snapshot to 9.11.0-snapshot-5e48fdd0d45 2024-05-19 06:10:15 +00:00
Ryan Ernst
449632c22e
Migrate remaining MockLogAppender.capturing to capture (#108781)
This commit removes the remaining tests constructing a MockLogAppender
directly and makes the constructor private.
2024-05-18 09:10:21 -07:00
elasticsearchmachine
c9cd48ad97 Merge remote-tracking branch 'origin/main' into lucene_snapshot 2024-05-18 10:01:36 +00:00
elasticsearchmachine
0ce4398a8c [Automated] Update Lucene snapshot to 9.11.0-snapshot-5e48fdd0d45 2024-05-18 06:10:53 +00:00
Ryan Ernst
064655ef1f
Add support for JDK 23 ea builds (#108787)
This commit adjusts the openjdk toolchain resolver to support the JDK 23
ea builds.
2024-05-17 16:51:42 -04:00
Mikhail Berezovskiy
38283c83f0
Refactor registerRepository method (#108788)
This PR is a syntactic change for `registerRepository` in
`RepositoriesService`. I use `SubscribableListener` to show the order of
events and reduce boilerplate code around failure delegation
(`listener.delegateFailureAndWrap`).

It's part of a larger change to the verification logic, which should take
advantage of this "sequential" version of the code. #108531
2024-05-17 16:12:03 -04:00
Armin Braun
8ff8eff659
Use zero page "holes" to optimize sparse byte array usage (#108709)
Add the notion of a "zero page" or "hole" to big arrays. We have some use cases where we run up byte arrays of hundreds of MB that are extremely sparse.
Each page starts out as a "hole" and only gets replaced by a real page from the pool on write, similar to how FS holes work.
This change adds a small amount of overhead to the write side but is performance neutral or better on the read side (for sparse arrays we likely get a big improvement from using less CPU cache).

The only change outside of the array itself this needed was in CCR, see inline comments for that.
2024-05-17 22:02:14 +02:00
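The "zero page" idea from 8ff8eff659 can be sketched as a paged byte array in which every page initially points at one shared, all-zero page and is swapped for a private page on first write. This is an illustrative sketch only: `SparseByteArray`, the 16 KiB page size, and allocating with `new byte[]` instead of taking pages from a pool are all assumptions.

```java
import java.util.Arrays;

final class SparseByteArray {
    static final int PAGE_SIZE = 16 * 1024; // assumed page size for this sketch

    // Shared page standing in for every "hole"; it is never written to.
    private static final byte[] ZERO_PAGE = new byte[PAGE_SIZE];

    private final byte[][] pages;

    SparseByteArray(long size) {
        int pageCount = (int) ((size + PAGE_SIZE - 1) / PAGE_SIZE);
        pages = new byte[pageCount][];
        Arrays.fill(pages, ZERO_PAGE); // every page starts out as a hole
    }

    // Reads need no special case: a hole simply reads as zeros.
    byte get(long index) {
        return pages[(int) (index / PAGE_SIZE)][(int) (index % PAGE_SIZE)];
    }

    // Writes pay for one extra reference check to materialize a hole.
    void set(long index, byte value) {
        int page = (int) (index / PAGE_SIZE);
        if (pages[page] == ZERO_PAGE) {
            pages[page] = new byte[PAGE_SIZE]; // a pooled page in a real implementation
        }
        pages[page][(int) (index % PAGE_SIZE)] = value;
    }

    // Number of pages that are backed by real memory.
    int allocatedPages() {
        int count = 0;
        for (byte[] page : pages) {
            if (page != ZERO_PAGE) {
                count++;
            }
        }
        return count;
    }
}
```

A 1 MiB array created this way allocates no private pages until the first `set` call, after which only the touched page is backed by its own memory.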
Moritz Mack
befb6ff332
Add capabilities to known test runner features to fix capabilities integration in YAML tests (#108789)
🤦 Annoying oversight, this enables capabilities in YAML tests
2024-05-17 13:48:55 -04:00
Parker Timmins
298c6492a5
Make ingest byte stat names more descriptive (#108786)
Current ingest byte stat fields could easily be confused.
Add more descriptive name to make it clear that they do not
count all docs processed by the pipeline.
2024-05-17 12:03:42 -05:00
shainaraskas
9759823fe8
[DOCS] Reinforce connection between rollover and index age (#108588) 2024-05-17 12:52:42 -04:00
Luigi Dell'Aquila
5a52642db7
ES|QL: Fix WildcardLikeTests (#108779) 2024-05-17 18:20:17 +02:00
Oleksandr Kolomiiets
a454ac1987
Do not produce infinity values in synthetic source for range fields (#108699) 2024-05-17 09:19:14 -07:00
Nhat Nguyen
024ea6ca4c
Fix metrics_syntax capabilities (#108783)
The metrics command is available only in snapshot builds. However, in my 
previous PR, I mistakenly included the counter_types feature instead.
2024-05-17 09:11:56 -07:00
Nik Everett
dff3bd2c83
Abstract RowInTable logic (#108696)
This moves the logic for finding the offset in a table that we will use
in `LOOKUP` from a method on `BlockHash` and some complex building logic
in `HashLookupOperator`. Now it's in a `RowInTable` interface - both
a static builder method and some implementations.

There are three implementations:
1. One that talks to `BlockHash` just like `HashLookupOperator` used to.
   Right now it talks to `PackedValuesBlockHash` because it's the only
   one whose `lookup` method returns the offset in the original row, but
   we'll fix it eventually.
2. A `RowInTable` that works with increasing sequences of integers,
   say, `1, 2, 3, 4, 5` - this is fairly simple - it just checks that
   the input is between `1` and `5` and, if it is, subtracts `1`. Easy.
   Obvious. And very very fast. Simple. Good simple example.
3. A `RowInTable` that handles empty tables - this just makes
   writing the rest of the code simpler. It always returns `null`.
2024-05-17 12:05:26 -04:00
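The ascending-sequence case described in dff3bd2c83 is simple enough to sketch directly: a value inside the range maps to its offset from the minimum, and anything else is a miss. The class name and `rowOf` method below are hypothetical and do not mirror the real `RowInTable` interface. Returning `null` for a miss is an assumption borrowed from the empty-table implementation.

```java
final class AscendingSequenceRowInTable {
    private final int min;
    private final int max;

    AscendingSequenceRowInTable(int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        Math.subtractExact(max, min); // reject ranges whose width overflows an int
        this.min = min;
        this.max = max;
    }

    // Returns the 0-based row offset, or null when the value is not in the table.
    Integer rowOf(int value) {
        if (value < min || value > max) {
            return null;
        }
        return value - min;
    }
}
```

For the table `1, 2, 3, 4, 5`, `new AscendingSequenceRowInTable(1, 5).rowOf(3)` yields `2`, and `rowOf(6)` yields `null`.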
Joe Gallo
e1b2b599de
Add continent_code support to the geoip processor (#108780) 2024-05-17 11:48:23 -04:00
Max Hniebergall
0f147d4aff
[ML] Fix deleting a trained model can emit deprecation warnings related to ingest pipeline configs (#108679)
* fix dep warnings

* added tests and null handling

* Update docs/changelog/108679.yaml

* add missing test for null case

* Update 108679.yaml

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-05-17 11:42:43 -04:00
Nhat Nguyen
a2ab2fe0ff
Fix thread reinitialize in time-series source operator (#108751)
Currently, we only reinitialize the Lucene internal of the new top if 
the tsid changes. This isn't enough. We should always ensure the new top
is reinitialized if necessary, regardless of tsid.

Closes #108727
2024-05-17 08:24:40 -07:00
Iván Cea Fontenla
7b80843f5d
Check if CsvTests required capabilities exist (#108684)
Check that capabilities required in CSV tests really exist.

This avoids:
- Having old capabilities (we aren't removing them afaik, so this may
  never happen)
- Typos in capabilities

Currently, it would probably fail in the BWC tests. But this way we
avoid either waiting for them, or other potential errors.

_This change was extracted from [another
PR](https://github.com/elastic/elasticsearch/pull/108574) where there
was such a typo in a commit_
2024-05-17 11:05:01 -04:00
Max Hniebergall
ca2ce0e3dd
[ML] Replace objects with primitives in Text Embedding Results classes (#108161)
* tests pass

* Update docs/changelog/108161.yaml

* precommit

* merge

* remove unnecessary comments

* fix syntax error in test-service-plugin

* create Embedding.of to handle conversion from List of objects

* Update docs/changelog/108161.yaml

* Update 108161.yaml

* fix merge conflicts

* Update docs/changelog/108161.yaml

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-05-17 10:51:30 -04:00
Mark J. Hoy
b3a902e035
Add Docs for Azure AI Studio Support for the Inference API (#108737)
* add docs and embeddings tutorial pieces

* cleanup openai reference

* Suggested cleanups; add missing div tag

* one more change for clarity (requests per minute)
2024-05-17 10:35:43 -04:00
Ryan Ernst
f1093fb5e5
Use assertThatLogger where appropriate (#108732)
Some tests use MockLogAppender to assert on a single expected logging
message. The utility method assertThatLogger handles creating the
appender and asserting the expectation. However, some other tests want to
do the same but with multiple expectations. This commit adjusts
assertThatLogger to allow multiple expectations, and converts a few
tests that had helper methods that are now obsoleted.
2024-05-17 07:27:51 -07:00
Larisa Motova
a01baa3d79
Include doc size info in ingest stats (#107240)
Add ingested_in_bytes and produced_in_bytes stats to pipeline ingest stats.
These track how many bytes are ingested and produced by a given pipeline.
For efficiency, these stats are recorded for the first pipeline to process a 
document. Thus, if a pipeline is called as a final pipeline after a default pipeline,
as a pipeline processor, and after a reroute request, a document will not 
contribute to the stats for that pipeline. If a given pipeline has 0 bytes recorded
for both of these stats, due to not being the first pipeline to run any doc, these
stats will not appear in the pipeline's entry in ingest stats.
2024-05-17 08:53:24 -05:00
Mark Tozzi
9f54d9a804
add tests for SimplifyComparisonArithmetics optimization rule (#108744)
This adds in the tests from OptimizerRunTests in SQL to apply to ESQL. I've opened issues and applied the AwaitsFix annotation for those of the tests that are currently failing.
2024-05-17 09:43:48 -04:00
Jedr Blaszyk
bb5cac9e64
[Connector API] Improve create connector endpoints (#108766) 2024-05-17 15:30:14 +02:00