Commit graph

4797 commits

Author SHA1 Message Date
Martijn van Groningen
af53aadbaf
[8.18] Improve resiliency of UpdateTimeSeriesRangeService (#126680)
Backporting #126637 to 8.18 branch.

If updating the `index.time_series.end_time` fails for one data stream,
then UpdateTimeSeriesRangeService should continue updating this setting for other data streams.
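
The intended behaviour can be sketched roughly as follows. This is a minimal illustration, not the actual service code: `ResilientUpdater`, `Result` and `updateAll` are hypothetical names, and the per-stream state is reduced to a plain string.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration; none of these names are Elasticsearch classes.
final class ResilientUpdater {

    /** Outcome of one pass: the resulting value per stream and the streams that failed. */
    static final class Result {
        final Map<String, String> values;
        final List<String> failed;

        Result(Map<String, String> values, List<String> failed) {
            this.values = values;
            this.failed = failed;
        }
    }

    /**
     * Applies {@code update} to every data stream. A failure for one stream is
     * recorded and its old value is kept, so the remaining streams are still processed.
     */
    static Result updateAll(Map<String, String> endTimes, UnaryOperator<String> update) {
        Map<String, String> values = new LinkedHashMap<>();
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, String> entry : endTimes.entrySet()) {
            try {
                values.put(entry.getKey(), update.apply(entry.getValue()));
            } catch (RuntimeException e) {
                // Isolate the failure instead of aborting the whole pass.
                failed.add(entry.getKey());
                values.put(entry.getKey(), entry.getValue());
            }
        }
        return new Result(values, failed);
    }
}
```

The point of the sketch is only the placement of the `try`/`catch` inside the loop: one misconfigured data stream no longer prevents the others from being updated.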

The following error was observed in the wild:

```
[2025-04-07T08:50:39,698][WARN ][o.e.d.UpdateTimeSeriesRangeService] [node-01] failed to update tsdb data stream end times
java.lang.IllegalArgumentException: [index.time_series.end_time] requires [index.mode=time_series]
        at org.elasticsearch.index.IndexSettings$1.validate(IndexSettings.java:636) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.index.IndexSettings$1.validate(IndexSettings.java:619) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.common.settings.Setting.get(Setting.java:563) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.common.settings.Setting.get(Setting.java:535) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.datastreams.UpdateTimeSeriesRangeService.updateTimeSeriesTemporalRange(UpdateTimeSeriesRangeService.java:111) ~[?:?]
        at org.elasticsearch.datastreams.UpdateTimeSeriesRangeService$UpdateTimeSeriesExecutor.execute(UpdateTimeSeriesRangeService.java:210) ~[?:?]
        at org.elasticsearch.cluster.service.MasterService.innerExecuteTasks(MasterService.java:1075) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:1038) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.cluster.service.MasterService.executeAndPublishBatch(MasterService.java:245) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.lambda$run$2(MasterService.java:1691) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.action.ActionListener.run(ActionListener.java:452) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.cluster.service.MasterService$BatchingTaskQueue$Processor.run(MasterService.java:1688) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.cluster.service.MasterService$5.lambda$doRun$0(MasterService.java:1283) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.action.ActionListener.run(ActionListener.java:452) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.cluster.service.MasterService$5.doRun(MasterService.java:1262) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1023) ~[elasticsearch-8.17.3.jar:?]
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:27) ~[elasticsearch-8.17.3.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) ~[?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) ~[?:?]
        at java.lang.Thread.run(Thread.java:1575) ~[?:?]
```

As a result, the `index.time_series.end_time` index setting was not updated for any data stream. This in turn caused data loss, because metrics could not be indexed when no suitable backing index could be resolved:

```
the document timestamp [2025-03-26T15:26:10.000Z] is outside of ranges of currently writable indices [[2025-01-31T07:22:43.000Z,2025-02-15T07:24:06.000Z][2025-02-15T07:24:06.000Z,2025-03-02T07:34:07.000Z][2025-03-02T07:34:07.000Z,2025-03-10T12:45:37.000Z][2025-03-10T12:45:37.000Z,2025-03-10T14:30:37.000Z][2025-03-10T14:30:37.000Z,2025-03-25T12:50:40.000Z][2025-03-25T12:50:40.000Z,2025-03-25T14:35:40.000Z
```
2025-04-11 22:49:23 +10:00
Ben Chaplin
a305288410
[8.18] Log stack traces on data nodes before they are cleared for transport (#125732) (#126246)
* Log stack traces on data nodes before they are cleared for transport (#125732)

We recently cleared stack traces on data nodes before transport back to the coordinating node 
when error_trace=false to reduce unnecessary data transfer and memory on the coordinating 
node (#118266). However, all logging of exceptions happens on the coordinating node, so stack 
traces disappeared from any logs. This change logs stack traces directly on the data node when 
error_trace=false.

(cherry picked from commit 9f6eb1d4e3)
2025-04-04 11:44:38 -04:00
Mark Vieira
f236efc1d5
Convert remaining plugin projects to new test clusters framework (#125626) (#125726)
(cherry picked from commit 930b4ab995)

# Conflicts:
#	plugins/discovery-azure-classic/build.gradle
#	plugins/discovery-gce/qa/gce/build.gradle
2025-03-27 09:41:37 +11:00
Moritz Mack
fe4df54988
Prevent work starvation bug if using scaling EsThreadPoolExecutor with core pool size = 0 (#124732) (#125067)
When `ExecutorScalingQueue` rejects work to make the worker pool scale up while already being at max pool size (and a new worker consequently cannot be added), available workers might time out at about the same time as the task is force queued by `ForceQueuePolicy`. This has caused starvation of work as observed for `masterService#updateTask` in #124667, where max pool size 1 is used. This configuration is the most likely to expose the bug.

This PR changes `EsExecutors.newScaling` to not use `ExecutorScalingQueue` if max pool size is 1 (and core pool size is 0). A regular `LinkedTransferQueue` works perfectly fine in this case.

If max pool size > 1, a probing approach is used to ensure the worker pool is adequately scaled to at least 1 worker after force queueing work in `ForceQueuePolicy`.
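
For the max pool size 1 case, the underlying JDK behaviour can be illustrated with a plain `ThreadPoolExecutor`. This is a sketch under assumptions: it does not use `EsExecutors.newScaling`, and `SingleWorkerScalingPool` is a hypothetical name. It relies on the fact that `ThreadPoolExecutor#execute` starts a worker when a task has been queued and no worker is currently running, so an unbounded queue with core pool size 0 still makes progress.

```java
import java.util.concurrent.LinkedTransferQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration using only JDK classes.
final class SingleWorkerScalingPool {

    /** Core pool size 0, max pool size 1, unbounded LinkedTransferQueue. */
    static ThreadPoolExecutor newPool(long keepAliveMillis) {
        return new ThreadPoolExecutor(0, 1, keepAliveMillis, TimeUnit.MILLISECONDS, new LinkedTransferQueue<>());
    }

    /** Submits {@code tasks} trivial tasks and returns how many of them ran. */
    static int runTasks(int tasks) {
        ThreadPoolExecutor pool = newPool(10);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            pool.execute(done::incrementAndGet);
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return done.get();
    }
}
```

Because the queue never rejects an offer, no force-queue policy is involved at all, which is why the race described above cannot occur in this configuration.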

Fixes #124667
Relates to #18613

(cherry picked from commit 36874e8663)

# Conflicts:
#	test/framework/src/main/java/org/elasticsearch/test/transport/MockTransportService.java
2025-03-18 10:48:37 +01:00
Luca Cavanna
255f7fecd0
[8.18] Fix concurrency issue in ScriptSortBuilder (#123757) (#124514)
* Fix concurrency issue in ScriptSortBuilder (#123757)

Inter-segment concurrency is disabled whenever sort by field, including script sorting, is used in a search request.

The reason why sort by field does not use concurrency is that there are some performance implications, given that the hit queue in Lucene is built per slice and the different search threads don't share information about the documents they have already visited etc.

The reason why script sort has concurrency disabled is that the script sorting implementation is not thread safe. This commit addresses this concurrency issue and re-enables search concurrency for search requests that use script sorting. In addition, missing tests are added to cover sort scripts that rely on _score being available and top_hits aggregation with a scripted sort clause.

* iter
2025-03-11 22:40:47 +11:00
Mary Gouseti
85c113867a
Remove test usages of DataStream#getDefaultBackingIndexName in ILM integration tests (#124319) (#124467) (#124473)
* Incorporate review comments

(cherry picked from commit 44dd44bd2d)

# Conflicts:
#	x-pack/plugin/ilm/qa/multi-node/src/javaRestTest/java/org/elasticsearch/xpack/ilm/TimeSeriesDataStreamsIT.java
#	x-pack/plugin/ilm/qa/multi-node/src/javaRestTest/java/org/elasticsearch/xpack/ilm/TimeSeriesLifecycleActionsIT.java
#	x-pack/plugin/ilm/qa/multi-node/src/javaRestTest/java/org/elasticsearch/xpack/ilm/actions/SearchableSnapshotActionIT.java
2025-03-11 00:28:19 +11:00
Ryan Ernst
426b9810b5
Set root logger level for CLIs (#123742) (#123818)
All CLIs in Elasticsearch support command line flags for controlling the
output level. When --silent is used, the expectation is that normal
logging is omitted. Yet the log4j logger is still configured to output
error level logs. This commit sets the appropriate log level for log4j
depending on the Terminal log level.
2025-03-03 06:09:59 +11:00
Moritz Mack
b585bb1196
fix testReadBlobWithPrematureConnectionClose jdk24 (#122655) (#123596)
(cherry picked from commit 8b4f159aa2)

Co-authored-by: Mikhail Berezovskiy <mikhail@elastic.co>
2025-02-27 23:48:40 +11:00
David Turner
3df75e008e
Reduce licence checks in LicensedWriteLoadForecaster (#123369) (#123407)
Rather than checking the license (updating the usage map) on every
single shard, just do it once at the start of a computation that needs
to forecast write loads.
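
A minimal sketch of the idea, with hypothetical names (`WriteLoadForecasts`, `forecastAll`) and with the assumption that an unlicensed forecast is simply empty:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.OptionalDouble;
import java.util.function.BooleanSupplier;

// Hypothetical illustration; not the LicensedWriteLoadForecaster implementation.
final class WriteLoadForecasts {

    /**
     * Forecasts a write load for every shard. The (potentially costly) license
     * check runs once per computation instead of once per shard.
     */
    static List<OptionalDouble> forecastAll(List<Double> shardWriteLoads, BooleanSupplier licenseCheck) {
        final boolean licensed = licenseCheck.getAsBoolean(); // single check, hoisted out of the loop
        List<OptionalDouble> forecasts = new ArrayList<>(shardWriteLoads.size());
        for (double load : shardWriteLoads) {
            // Assumption: without a license no forecast is produced.
            forecasts.add(licensed ? OptionalDouble.of(load) : OptionalDouble.empty());
        }
        return forecasts;
    }
}
```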

Backport of #123346 to 8.x
Closes #123247
2025-02-26 07:08:22 +11:00
Albert Zaharovits
7fb4763ee1
Invoke TestCluster#assertAfterTest before closing the cluster (#122639) (#122709)
In test-scoped internal ITs the `cluster().assertAfterTest()` method was invoked
*after* the cluster nodes were closed. Consequently, the assertions that iterated
over the internal nodes (and asserted some state on nodes after the test) were
all effectively noops.
This PR reverses that order, so that after-test assertions are effective again.
2025-02-16 20:59:10 +11:00
Simon Cooper
4490510833
Add a parameter to describe the lambda in a transformedMatch matcher (#122013) (#122063) 2025-02-10 21:15:07 +11:00
Mark Tozzi
8ef2869b7d
Aggregations cancellation after collection (#120944) (#121970)
This PR addresses issues around aggregations cancellation, mentioned in https://github.com/elastic/elasticsearch/issues/108701 and other places. In brief, during aggregations collection time, we respect cancellation via the mechanisms in the searcher to poison cancelled queries. But once the aggregation finishes collection, there is no further need to interact with the searcher, so we cannot rely on that for cancellation checking. In particular, deeply nested aggregations can spend a long time constructing the results tree.

Checking for cancellation is a trade-off, as the check itself is somewhat expensive (it involves a volatile read), so we want to balance checking often enough that cancelled queries aren't taking up resources for a long time, but not so frequently that it slows down most aggregation queries. Our first attempt at this is to check once when we go to build sub-aggregations, as the worst cases for this that we've seen involve needing to build deep sub-aggregation trees. Checking at sub-aggregation construction time also provides a conveniently centralized method call to add the check to.
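
The placement of the check can be sketched as follows. This is an illustration only: `ResultTreeBuilder`, `Node` and `chain` are hypothetical names, and the cancellation flag is modelled as a `BooleanSupplier` rather than the real task machinery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of checking for cancellation once per sub-tree.
final class ResultTreeBuilder {

    static final class Node {
        final List<Node> children = new ArrayList<>();
    }

    /** Builds a chain of {@code depth} nested nodes, i.e. a deep sub-aggregation tree. */
    static Node chain(int depth) {
        Node root = new Node();
        Node current = root;
        for (int i = 1; i < depth; i++) {
            Node child = new Node();
            current.children.add(child);
            current = child;
        }
        return root;
    }

    /**
     * Visits the tree and returns the number of nodes built. The cancellation
     * flag is read once each time we descend into sub-results, not per bucket.
     */
    static int buildResults(Node node, BooleanSupplier isCancelled) {
        if (isCancelled.getAsBoolean()) {
            throw new CancellationException("task cancelled while building results");
        }
        int built = 1;
        for (Node child : node.children) {
            built += buildResults(child, isCancelled);
        }
        return built;
    }
}
```

One check per level keeps the cost proportional to the depth of the tree, which is where the worst cases described above spend their time.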

---------



# Conflicts:
#	test/framework/src/main/java/org/elasticsearch/search/aggregations/AggregatorTestCase.java

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-02-07 09:55:19 -05:00
Mark Vieira
793e819c43
Upgrade mockito (#121849) (#121929) 2025-02-06 12:22:31 -08:00
Ryan Ernst
3537349096
Rename environment dir accessors (#121803) (#121836)
* Rename environment dir accessors (#121803)

The node environment has many paths. The accessors for these currently use a "file" suffix, but they are always directories. This commit renames the accessors to make it clear these paths are directories.

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-02-06 10:28:25 +11:00
Christoph Büscher
7990781848
Fix rare failures in YAML xContent roundtrip tests (#121515) (#121685)
Under very unfortunate conditions, tests that check xContent objects
roundtrip parsing (e.g. SearchHitsTests#testFromXContent)
can fail when we happen to randomly pick the YAML xContent type and create
random (realistic) Unicode character sequences that may contain the
character U+0085 (133) from the Latin1 code page. That specific character
doesn't get parsed back to its original form for YAML xContent, which can
lead to rare but hard to diagnose test failures.

This change adds logic to AbstractXContentTestCase#test(), which lies at
the core of most of our xContent roundtrip tests, that disallows test
instances containing that particular character when using the YAML xContent
type.
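
One way such a guard could look is sketched below. The names (`YamlSafeInstances`, `randomInstance`) are hypothetical, and regenerating the instance until it is free of U+0085 is an assumption; the commit message only says that such instances are disallowed.

```java
import java.util.function.Supplier;

// Hypothetical illustration; not the AbstractXContentTestCase code.
final class YamlSafeInstances {

    /** U+0085 (NEXT LINE), the character that does not survive a YAML roundtrip. */
    static final char NEXT_LINE = (char) 0x85;

    static boolean containsNextLine(String value) {
        return value.indexOf(NEXT_LINE) >= 0;
    }

    /**
     * Returns a generated instance. When YAML is the chosen xContent type,
     * candidates containing U+0085 are discarded and regenerated.
     */
    static String randomInstance(Supplier<String> generator, boolean yaml) {
        String candidate = generator.get();
        while (yaml && containsNextLine(candidate)) {
            candidate = generator.get();
        }
        return candidate;
    }
}
```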

Closes #97716
2025-02-05 05:28:12 +11:00
Christoph Büscher
1ea390c053
Revert "WIP (#121463) (#121470)"
This reverts commit 2f2addb7ac.
2025-02-03 11:13:36 +01:00
Christoph Büscher
2f2addb7ac
WIP (#121463) (#121470)
Under very unfortunate conditions, tests that check xContent objects
roundtrip parsing (e.g. [SearchHitsTests
testFromXContent](https://github.com/elastic/elasticsearch/issues/97716))
can fail when we happen to randomly pick the YAML xContent type and create
random (realistic) Unicode character sequences that may contain the
character U+0085 (133) from the [Latin1 code
page](https://de.wikipedia.org/wiki/Unicodeblock_Lateinisch-1,_Erg%C3%A4nzung).

That specific character doesn't get parsed back to its original form for
YAML xContent, which can lead to [rare but hard to diagnose test
failures](https://github.com/elastic/elasticsearch/issues/97716#issuecomment-2464465939).

This change adds logic to AbstractXContentTestCase#test(), which lies at
the core of most of our xContent roundtrip tests, that disallows test
instances containing that particular character when using the YAML xContent
type.

Closes #97716
2025-02-01 11:13:32 +11:00
Moritz Mack
e3a30d6a59
Fix LambdaMatchers.transformedMatch to handle null values (#121371) (#121375) 2025-01-31 21:41:45 +11:00
Armin Braun
1261557a38
Remove redundant LatchedActionListener from ESIntegTestCase (#121244) (#121262)
This is effectively the same as the other class. The logging is
irrelevant and the dead `addError` is too => let's remove this.
2025-01-30 21:12:59 +11:00
Larisa Motova
14c90f6a9e
[ES|QL] Support some stats on aggregate_metric_double (#120343) (#121245)
Adds non-grouping support for min, max, sum, and count, using
CompositeBlock as the underlying block type and an internal
FromAggregateMetricDouble function to handle converting from
CompositeBlock to the correct metric subfields.

Closes #110649
2025-01-30 10:46:21 +11:00
Dimitris Rempapis
61f5b01200
[8.x] Test/107515 RestoreTemplateWithMatchOnlyTextMapperIT (#120898)
This is a manual backport of #120392 to 8.x
2025-01-30 02:01:21 +11:00
Oleksandr Kolomiiets
1ee186846b
Fix matching of half_float and scaled_float values in LogsDB tests (#121098) (#121140)
Co-authored-by: Kostas Krikellas <131142368+kkrik-es@users.noreply.github.com>
2025-01-30 01:31:34 +11:00
Oleksandr Kolomiiets
8c991077ae
[8.x] Support ignore_above for keywords in test data generation (#119416) (#121087)
* Support ignore_above for keywords in test data generation (#119416)

(cherry picked from commit d3f2956116)

* Update DefaultMappingParametersHandler.java
2025-01-28 20:57:02 +01:00
Panagiotis Bailis
751c1c52d3
Normalize negative scores for text_similarity_reranker retriever (#120930) (#121050) 2025-01-29 03:11:51 +11:00
Moritz Mack
37fa39d9f6
[8.x] Added query param ?include_source_on_error for ingest requests (#120725) (#121010)
A new query parameter `?include_source_on_error` was added for create / index,
update and bulk REST APIs to control whether to include the document source
in the error response in case of parsing errors. The default value is `true`.

Relates to ES-9186.
2025-01-28 15:15:08 +01:00
Kostas Krikellas
4a0f1df81d
[TEST] Restore copy_to, double and float randomized testing (#120906) (#120922)
Partial rollback of #120859, these data types seem fine.
2025-01-28 06:48:03 +11:00
Slobodan Adamović
7245c05a44
[8.x] Enable queryable built-in roles feature by default (#120323) (#120886)
* Enable queryable built-in roles feature by default (#120323)

Making the `es.queryable_built_in_roles_enabled` feature flag enabled by default.
This feature makes the built-in roles automatically indexed in `.security` index and available
for querying via Query Role API. The consequence of this is that `.security` index is now
created eagerly (if it doesn't exist yet) on cluster formation.

In order to keep the scope of this PR small, the feature is disabled for some of the tests,
because they are either non-trivial to adjust or the gain is not worth the effort to do it now.
The tests will be adjusted in a follow-up PR and later the flag will be removed completely.

Relates to #117581

(cherry picked from commit 52e0f21bdd)

# Conflicts:
#	modules/dot-prefix-validation/build.gradle
#	test/framework/src/main/java/org/elasticsearch/test/InternalTestCluster.java
#	x-pack/plugin/security/src/internalClusterTest/java/org/elasticsearch/xpack/security/authc/esnative/ReservedRealmElasticAutoconfigIntegTests.java

* Update InternalTestCluster.java

remove line that snuck in after resolving merge conflicts

* Update build.gradle

fix build.gradle

* Update build.gradle

fix build.gradle by removing invalid task

* remove non-existing timeout parameter on 8.x branch
2025-01-27 23:40:15 +11:00
Jordan Powers
250c32bc54
Counted keyword: inherit source keep mode from index settings (#120678) (#120871)
This patch adds a property to CountedKeywordMapper to track the
synthetic_source_keep index setting. This property is then used to properly
implement synthetic source support in the counted_keyword field type, with
fallback to the ignore_source mechanism when synthetic_source_keep is set
in either the field mapping or the index settings.
2025-01-27 14:22:46 +11:00
Kostas Krikellas
7585952521
[8.x] Skip flaky configuration in randomized testing for logsdb (#120859) (#120860)
* Skip flaky configuration in randomized testing for logsdb (#120859)

(cherry picked from commit 1cb2a65e19)

# Conflicts:
#	muted-tests.yml

* Update muted-tests.yml
2025-01-25 22:19:16 +11:00
Oleksandr Kolomiiets
19e6a3e617
[TEST] Restore scaled_float and half_float data generation (#120756) (#120841) 2025-01-25 09:58:34 +11:00
Kostas Krikellas
8d96ccd376
Restore source matching in randomized logsdb tests (#120773) (#120822)
Applies the fix in `SourceMatcher` from #120756, along with disabling
`SCALED_FLOAT` and `HALF_FLOAT` that have accuracy issues leading to
false positives.
2025-01-25 05:52:33 +11:00
Joe Gallo
a491383940
Optimize IngestDocMetadata isAvailable (#120753) (#120801) 2025-01-25 02:51:42 +11:00
Panagiotis Bailis
9d5c474f93
Avoid populating rank docs metadata if explain is not specified (#120536) (#120766) 2025-01-24 23:32:19 +11:00
Stanislav Malyshev
b53e2949bb
[8.x] ES|QL async queries: Partial result on demand (#118122) (#120745)
* ES|QL async queries: Partial result on demand (#118122)

Add capability to stop async query on demand
The theory:

- User initiates async search request
- User sends the stop request (POST _query/async/<ID>/stop)
- If the async is finished by that time, it's like regular async get
- If it's not finished, the sinks are closed and the request is forcefully finished

(cherry picked from commit f27f74666f)

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/action/EsqlQueryResponse.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/action/EsqlQueryResponseTests.java
#	x-pack/plugin/security/qa/multi-cluster/src/javaRestTest/java/org/elasticsearch/xpack/remotecluster/CrossClusterEsqlRCS1UnavailableRemotesIT.java
#	x-pack/plugin/security/qa/multi-cluster/src/javaRestTest/java/org/elasticsearch/xpack/remotecluster/CrossClusterEsqlRCS2UnavailableRemotesIT.java

* fix tests

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-01-24 07:48:18 +11:00
Ignacio Vera
152af43481
[8.x] Stop caching source map on SearchHit#getSourceMap (#119888) (#120743)
This call has the side effect that if you are iterating over a number of hits
and calling this method, you will be increasing the memory usage by a
non-trivial amount, which in most cases is unwanted. Therefore this commit
removes this caching altogether and adds an assertion so the method is
called only once during the lifetime of the object.
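
The pattern can be sketched like this. `LazySourceHit` is a hypothetical stand-in for the real class, and the parser is modelled as a `Supplier`; the Java `assert` only fires when assertions are enabled (`-ea`), as is typical for test runs.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical illustration; not the SearchHit implementation.
final class LazySourceHit {

    private final Supplier<Map<String, Object>> sourceParser;
    private boolean sourceMapRequested; // only tracks usage, never holds the parsed map

    LazySourceHit(Supplier<Map<String, Object>> sourceParser) {
        this.sourceParser = sourceParser;
    }

    /**
     * Parses the source on demand and hands the map to the caller without
     * retaining it, so iterating over many hits does not accumulate parsed maps.
     */
    Map<String, Object> getSourceMap() {
        assert sourceMapRequested == false : "getSourceMap() is expected to be called at most once";
        sourceMapRequested = true;
        return sourceParser.get();
    }
}
```

Callers that need the map more than once are expected to hold on to the returned reference themselves.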

backport #119888
2025-01-24 06:18:44 +11:00
Simon Cooper
fa1cac3a60
Update the index version compatible test to only check the minimum (#120406) (#120738) 2025-01-24 04:08:31 +11:00
Nik Everett
6d2106d31f
ESQL: Unit tests for LOOKUP (#120559) (#120719)
This adds a unit test for LOOKUP that's going to be quite good at
finding leaks.
2025-01-23 10:55:18 -05:00
Oleksandr Kolomiiets
9efdf82ea5
Don't generate mappings that copy_to into itself (#119997) (#120664) 2025-01-23 07:34:20 +11:00
Jan Kuipers
dc66c15bc0
[8.x] Test ML model server (#120270) (#120586)
* Test ML model server (#120270)

* Fix model downloading for very small models.

* Test MlModelServer

* Tiny ELSER

* unmute TextEmbeddingCrudIT and DefaultEndPointsIT

* update ELSER

* Improve MlModelServer

* tiny E5

* more logging

* improved E5 model

* tiny reranker

* scan for ports

* [CI] Auto commit changes from spotless

* Serve default models when optimized model is requested

* @ClassRule

* polish code

* Respect dynamic setting ML model repo

* fix metadata for optimized models

* improve logging

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>

* backport HttpHeaderParser

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-01-22 22:29:56 +11:00
Joe Gallo
82c70ac3b7
Rename test utility methods (#120213) (#120227) 2025-01-16 06:54:09 +11:00
Jim Ferenczi
99539866bc
[8.x] Rebuild Inference Metadata Fields During Snapshot Recovery (#120204)
* Rebuild Inference Metadata Fields During Snapshot Recovery (#120045)

This PR introduces support for reconstructing inference metadata fields that are removed from `_source` by `SourceFieldMapper#applyFilters` during operations recovery.
The inference metadata fields are retrieved using value fetchers and are re-added to `_source` under the `_inference_fields` metadata field.

* fix compil
2025-01-16 05:02:35 +11:00
David Turner
1e84e0d06e
Remove unused Transport#version field (#120202) (#120217) 2025-01-16 04:50:57 +11:00
Salvatore Campagna
051305f259
Move source mode setting from SourceFieldMapper to IndexSettings (#120096) (#120118)
Here we move the `index.mapping.source.mode` setting to `IndexSettings` because of dependencies
and because of the initialisation order of static fields for classes `IndexSettings` and `SourceFieldMapper`.
Not initialising settings `index.mode`, `index.mapping.source.mode`, and `index.recovery.use_synthetic_source`
in the right order results in multiple `NullPointerException`.
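
The general JVM mechanism behind this kind of failure can be reproduced in isolation. The sketch below is not the Elasticsearch code: `DemoSetting`, `SplitSettings`, `SplitMapper` and `MergedSettings` are hypothetical, and the dependencies between the three settings are invented for illustration. It shows that when two classes reference each other's static fields, whichever class is initialized second can observe a field of the first as still `null`.

```java
// Hypothetical illustration of static initialization order across classes.
final class DemoSetting {
    final String key;
    final DemoSetting dependsOn;

    DemoSetting(String key, DemoSetting dependsOn) {
        this.key = key;
        this.dependsOn = dependsOn;
    }
}

// Settings split over two classes that reference each other.
final class SplitSettings {
    static final DemoSetting MODE = new DemoSetting("index.mode", null);
    // Reads SplitMapper.SOURCE_MODE, which is still null if SplitMapper is
    // in the middle of its own static initialization on this thread.
    static final DemoSetting RECOVERY =
        new DemoSetting("index.recovery.use_synthetic_source", SplitMapper.SOURCE_MODE);
}

final class SplitMapper {
    static final DemoSetting SOURCE_MODE =
        new DemoSetting("index.mapping.source.mode", SplitSettings.MODE);
}

// All settings in one class: fields are initialized top to bottom, in textual order.
final class MergedSettings {
    static final DemoSetting MODE = new DemoSetting("index.mode", null);
    static final DemoSetting SOURCE_MODE = new DemoSetting("index.mapping.source.mode", MODE);
    static final DemoSetting RECOVERY =
        new DemoSetting("index.recovery.use_synthetic_source", SOURCE_MODE);
}
```

If `SplitMapper` happens to be touched first, `SplitSettings.RECOVERY.dependsOn` ends up `null`, and any later dereference throws a `NullPointerException`. If `SplitSettings` is touched first, everything is wired correctly, which is what makes such bugs hard to spot.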

This work is done to simplify another PR #119110
2025-01-15 02:17:01 +11:00
Orestis Floros
b38748edc7
Permissions required for stateful agentless integrations (#118644) (#119973)
Closes elastic/security-team#11102
Closes elastic/security-team#11104

This allows agentless integrations (via elastic/beats#41446, elastic/kibana#203810) to write to agentless-* indices. Each index is created on-demand by the filebeat client and kibana conditionally extends the API key permissions to allow writing to the index.

(cherry picked from commit 3c184b912c)

# Conflicts:
#	docs/reference/rest-api/security/get-service-accounts.asciidoc
#	x-pack/plugin/security/qa/service-account/src/javaRestTest/java/org/elasticsearch/xpack/security/authc/service/ServiceAccountIT.java
#	x-pack/plugin/security/src/main/java/org/elasticsearch/xpack/security/authc/service/ElasticServiceAccounts.java
2025-01-13 17:44:49 +00:00
Ryan Ernst
940ad90304
Do not try to enable SecurityManager on JDK 24 (#117999) (#119975)
* Do not try to enable SecurityManager on JDK 24 (#117999)

* cleanup

* [CI] Auto commit changes from spotless

* more

* [CI] Auto commit changes from spotless

---------

Co-authored-by: Lorenzo Dematté <lorenzo.dematte@elastic.co>
Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-01-11 07:09:15 +11:00
Albert Zaharovits
a5811d05f6
[8.x] Metrics for indexing failures due to version conflicts (#119067) (#119761)
* Metrics for indexing failures due to version conflicts (#119067)

This exposes new OTel node and index based metrics for indexing failures due to version conflicts.

In addition, the /_cat/shards, /_cat/indices and /_cat/nodes APIs also expose the same metric, under the newly added column iifvc.

Relates: #107601
(cherry picked from commit 12eb1cfda1)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java

* types

* Fix NodeIndexingMetricsIT

* [CI] Auto commit changes from spotless

* Fix RestShardsActionTests

* Fix test/cat.shards/10_basic.yml for bwc

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
2025-01-10 02:25:57 +11:00
Felix Barnsteiner
d98a29ffcc
Add missing traces ilm policy for OTel traces data streams (#119449) (#119824) 2025-01-09 16:19:32 +02:00
Benjamin Trent
8555b350cc
[8.x] Add new experimental rank_vectors mapping for late-interaction second order ranking (#118804) (#119601)
* Add new experimental rank_vectors mapping for late-interaction second order ranking (#118804)

Late-interaction models are powerful rerankers. While their size and
overall cost don't lend themselves to HNSW indexing, utilizing them for
second order "brute-force" reranking can provide excellent boosts in
relevance, generally at lower inference times than large cross-encoders.

This commit exposes a new experimental `rank_vectors` field that allows
for maxSim operations. This unlocks the initial, and most common, use of
late-interaction dense models.

For example, this is how you would use it via the API:

```
PUT index
{
  "mappings": {
    "properties": {
      "late_interaction_vectors": {
        "type": "rank_vectors"
      }
    }
  }
}
```

Then to index:

```
POST index/_doc
{
  "late_interaction_vectors": [[0.1, ...],...]
}
```

For querying, scoring can be exposed with scripting:

```
POST index/_search
{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "maxSimDotProduct(params.query_vector, 'late_interaction_vectors')",
        "params": {
          "query_vector": [[0.42, ...], ...]
        }
      }
    }
  }
}
```

Of course, the initial ranking should be done before re-scoring or
combining via the `rescore` parameter, or simply passing whatever first
phase retrieval you want as the inner query in `script_score`.
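
For readers unfamiliar with the operation, here is a small standalone sketch of MaxSim as it is commonly defined for late-interaction models: for every query vector take the best dot product against any document vector, then sum those maxima. That the script function computes exactly this is an assumption; the `MaxSim` class is hypothetical and only borrows the function name for readability.

```java
// Hypothetical illustration of the MaxSim scoring idea; not the Elasticsearch implementation.
final class MaxSim {

    static float dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch: " + a.length + " vs " + b.length);
        }
        float sum = 0f;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    /** Sum over query vectors of the maximum dot product against any document vector. */
    static float maxSimDotProduct(float[][] queryVectors, float[][] docVectors) {
        if (docVectors.length == 0) {
            throw new IllegalArgumentException("document must have at least one vector");
        }
        float total = 0f;
        for (float[] q : queryVectors) {
            float best = Float.NEGATIVE_INFINITY;
            for (float[] d : docVectors) {
                best = Math.max(best, dot(q, d));
            }
            total += best;
        }
        return total;
    }
}
```

The cost is proportional to (query vectors) x (document vectors) x (dimensions) per document, which is why it suits a second-phase rescore over a small candidate set rather than first-phase retrieval.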

* Update docs/changelog/119601.yaml
2025-01-07 05:19:38 +11:00
Nhat Nguyen
4f7ea81c66
Adjust translog operation assertions for synthetic source (#119330) (#119559)
When synthetic sources are used in peer recoveries, the translog
operations via peer recoveries may differ from those created through
replication. This change relaxes the translog operation assertion to
account for synthetic source, allowing these operations to be considered
equivalent.

Closes #119191
2025-01-05 06:51:58 +11:00
Stanislav Malyshev
4012aec905
Add ESQL telemetry collection (#119474) (#119479)
* Add ESQL telemetry collection

(cherry picked from commit 0292905ef6)

# Conflicts:
#	server/src/main/java/org/elasticsearch/TransportVersions.java
2025-01-03 09:30:28 +11:00