Commit graph

14959 commits

Author SHA1 Message Date
Mike Pellegrini
3de109e196
[8.17] Update Text Similarity Reranker to Properly Handle Aliases (#120062) (#120076)
* Update Text Similarity Reranker to Properly Handle Aliases (#120062)

(cherry picked from commit 264d1c29d4)

# Conflicts:
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java

* Fix compilation error
2025-01-14 07:28:11 +11:00
David Turner
2e6c05077c
Fix MasterServiceTests#testThreadContext (#118926) (#119304)
This test would fail to see the expected response headers if the task
timed out before it started executing, which could happen very rarely.
It's also not a very good test because it never actually executed any of
the paths involving acking.

This commit fixes the rare failure and tightens up the assertions to
verify that it does indeed see the right thread context while handling
the end of the acking process, and indeed that it always completes the
acking process.

Closes #118914
2025-01-14 02:59:18 +11:00
Michael Peterson
03231d2b00
Resolve/cluster should mark remotes as not connected when a security exception is thrown (#119793) (#119865)
Fixes two bugs in _resolve/cluster.

First, the code that detects older cluster versions and falls back to the _resolve/index
endpoint was using an outdated string match for error detection. That has been adjusted.

Second, upon security exceptions, the _resolve/cluster endpoint was marking the clusters as connected: true,
under the assumption that all security exceptions related to cross cluster calls and remote index access were
coming from the remote cluster, but that is not always the case. Some cross-cluster security violations can
be detected on the local querying cluster after issuing the remoteClient.execute call but before the transport
layer actually sends the request remotely. So we now mark the connected status as false for all ElasticsearchSecurityException cases. End user docs have been updated with this information.
2025-01-10 02:04:27 +11:00
Ignacio Vera
dcad38eceb
Construct list manually in AggregatorsReducer#get (#119565) (#119566) 2025-01-06 00:18:55 +11:00
Pawan Kartik
9cb1ac2ad7
fix: do not let _resolve/cluster hang if remote is unresponsive (#119516) (#119527)
* fix: do not let `_resolve/cluster` hang if remote is unresponsive

Previously, `_resolve/cluster` would wait for a response from a remote
as part of the connection strategy. If the remote were to be
unresponsive, this API would wait until `netty` would terminate the
connection with a handshake exception. The threshold for terminating the
connection is `10s`. This means that the API would wait for `10s` before
determining that the remote is unresponsive. This strategy is now
replaced with a fail fast where a response is sent back to the user
immediately rather than waiting for a connection termination.

* Update docs/changelog/119516.yaml
2025-01-04 04:51:33 +11:00
Rene Groeschke
49c0f5cf71
Update Gradle wrapper to 8.12 (#118683) (#119356)
This updates the gradle wrapper to 8.12

We addressed deprecation warnings caused by the update, including:

- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions

(cherry picked from commit ba61f8c7f7)

# Conflicts:
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/distribution/DockerUbiElasticsearchDistributionType.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/Fixture.java
#	plugins/repository-hdfs/hadoop-client-api/build.gradle
#	qa/entitlements/build.gradle
#	server/src/main/java/org/elasticsearch/indices/IndicesFeatures.java
#	x-pack/plugin/migrate/build.gradle
#	x-pack/plugin/security/qa/security-basic/build.gradle
2024-12-31 08:37:12 +01:00
Salvatore Campagna
d86b6f47e3
[8.17] Replace encoder with url encoder (#116699) (#119080)
Document IDs are frequently used in HTTP requests, such as `GET /index/_doc/{id}`, where they must be URL-safe to avoid issues with invalid characters. This change ensures that IDs generated by `TimeBasedKOrderedUUIDGenerator` are properly Base64 URL-encoded, free of characters that could break URLs. We also test that no IDs include invalid characters like +, /, or = to guarantee they are fully compliant with URL-safe requirements.

Moreover, `TimeBasedKOrderedUUIDGenerator` and `TimeBasedUUIDGenerator` are refactored to allow injection of dependencies, which enables us to increase test coverage by including tests for high-throughput scenarios, sequence id overflow and unreliable clock usage.
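
As an illustration of the encoding difference described above, here is a minimal sketch using only `java.util.Base64` from the JDK. The class name `UrlSafeIdSketch` and its helper methods are hypothetical and are not taken from this commit; the sketch only shows why the standard alphabet can produce `+`, `/` and `=` while the URL-safe encoder without padding cannot.

```java
import java.util.Base64;

// Hypothetical illustration; not the code changed by this commit.
final class UrlSafeIdSketch {

    // URL-safe alphabet ('-' and '_' instead of '+' and '/'), without '=' padding.
    static String encodeUrlSafe(byte[] rawId) {
        return Base64.getUrlEncoder().withoutPadding().encodeToString(rawId);
    }

    // Standard alphabet with padding, shown only for comparison.
    static String encodeStandard(byte[] rawId) {
        return Base64.getEncoder().encodeToString(rawId);
    }

    // True if the ID contains none of the characters that are problematic in URLs.
    static boolean isUrlSafe(String id) {
        return id.indexOf('+') < 0 && id.indexOf('/') < 0 && id.indexOf('=') < 0;
    }

    public static void main(String[] args) {
        byte[] rawId = {(byte) 0xfb, (byte) 0xff, (byte) 0xfe, 0x01};
        System.out.println(encodeStandard(rawId)); // +//+AQ==
        System.out.println(encodeUrlSafe(rawId));  // -__-AQ
    }
}
```

The sample bytes are chosen so that the standard encoder emits all three problematic characters.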
2024-12-19 16:32:32 +01:00
Martijn van Groningen
44a05e5d15
Support flattened label field with downsampling. (#118816) (#118935)
If a flattened field is configured as a non-dimension and non-metric field, then downsampling fails to execute successfully. Downsampling doesn't know how to use the flattened field or how to serialize it. This change addresses this.

Closes #116319
2024-12-18 22:51:21 +11:00
Brian Seeders
d42f07bb6e
Bump versions after 8.16.2 release 2024-12-17 13:34:31 -05:00
Ignacio Vera
3be3c1ee06
Fix moving function linear weighted avg (#118516) (#118751) (#118759)
Fix moving function linear weighted avg

Co-authored-by: Quentin Deschamps <quentindeschamps18@orange.fr>
# Conflicts:
#	server/src/main/java/org/elasticsearch/rest/action/search/SearchCapabilities.java
2024-12-16 14:09:53 +01:00
elasticsearchmachine
306ef1e8bd Bump versions after 8.17.0 release 2024-12-13 15:58:17 +00:00
Mark Vieira
0f7a05690e
Remove version 8.15.6 2024-12-12 11:25:59 -08:00
Luca Cavanna
5632800796
[8.x] Handle all exceptions in data nodes can match (#117469) (#118533) (#118572)
* Handle all exceptions in data nodes can match (#117469)

During the can match phase, prior to the query phase, we may have exceptions
that are returned back to the coordinating node, handled gracefully as if the
shard returned canMatch=true.

During the query phase, we perform an additional rewrite and can match phase
to eventually shortcut the query phase for the shard. That needs to handle
exceptions as well. Currently, an exception there causes shard failures, while
we should rather go ahead and execute the query on the shard.

Instead of adding another try/catch in consumer code, this commit adds exception handling to the method itself so that it can no longer throw exceptions and similar mistakes can no longer be made in the future.

At the same time, this commit makes the can match method more easily testable without requiring a full-blown SearchService instance.
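
The idea of moving the exception handling into the method itself can be sketched as follows. This is a simplified illustration and not the actual SearchService code: `CanMatchSketch` and `canMatchSafely` are hypothetical names, and the real method works on shard search requests rather than a `BooleanSupplier`.

```java
import java.util.function.BooleanSupplier;

// Hypothetical illustration; not the code changed by this commit.
final class CanMatchSketch {

    // Runs the can-match check and never throws: any exception is treated as
    // "the shard may match", so the query is still executed on the shard.
    static boolean canMatchSafely(BooleanSupplier check) {
        try {
            return check.getAsBoolean();
        } catch (Exception e) {
            // Same outcome as if the shard had returned canMatch=true.
            return true;
        }
    }
}
```

Because the fallback lives in one place, callers do not need their own try/catch and cannot forget it.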

Closes #104994

* fix compile
2024-12-13 02:25:42 +11:00
Mike Pellegrini
f14bbea8fe
Restore original "is within leaf" value in SparseVectorFieldMapper (#118380) (#118457) 2024-12-12 01:29:48 +11:00
Joe Gallo
eac54624fb
Fix log message format bugs (#118354) (#118386) 2024-12-11 09:09:35 +11:00
Mark Vieira
f5c8762ce1
Update BWC version logic to support multiple bugfix versions (#117943) (#118116) (#118118)
(cherry picked from commit 7070e95fa7)

# Conflicts:
#	.buildkite/pipelines/intake.yml
#	.buildkite/pipelines/periodic.yml
#	.ci/snapshotBwcVersions
#	build-tools-internal/src/integTest/groovy/org/elasticsearch/gradle/internal/InternalDistributionBwcSetupPluginFuncTest.groovy
#	build-tools-internal/src/integTest/groovy/org/elasticsearch/gradle/internal/test/rest/LegacyYamlRestCompatTestPluginFuncTest.groovy
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/BwcVersions.java
#	build-tools-internal/src/test/groovy/org/elasticsearch/gradle/internal/BwcVersionsSpec.groovy
(cherry picked from commit c5d3799af1)

# Conflicts:
#	.buildkite/pipelines/intake.yml
#	.buildkite/pipelines/periodic.yml
2024-12-06 12:12:36 +11:00
Martijn van Groningen
7cb1cbe0f6
[8.17] Address mapping and compute engine runtime field issues (#117792) (#118048)
* Address mapping and compute engine runtime field issues (#117792)

This change addresses the following issues:

- Fields mapped as runtime fields not getting stored if source mode is synthetic.
- Address java.io.EOFException when an es|ql query uses multiple runtime fields that fall back to source when source mode is synthetic. (1)
- Address concurrency issue when runtime fields get pushed down to Lucene. (2)

1: ValueSourceOperator can read values in row striding or columnar fashion. When values are read in columnar fashion and multiple runtime fields synthesize source, this can cause the same SourceProvider to evaluate the same range of doc ids multiple times. This can then result in unexpected io errors at the codec level, because the same doc value instances are used by SourceProvider. Re-evaluating the same docids violates the contract of the DocIdSetIterator#advance(...) / DocIdSetIterator#advanceExact(...) methods, which document that unexpected behaviour can occur if the target docid is lower than the current docid position.

Note that this is only an issue for the synthetic source loader and not for the stored source loader. Nor is it an issue when executing in row stride fashion, which sometimes happens in the compute engine and always happens in the _search api.

2: The concurrency issue arises with the source provider if the source operator executes in parallel with data partitioning set to DOC. The same SourceProvider instance then gets accessed by multiple threads concurrently. SourceProvider implementations are not designed to handle concurrent access.

Closes #117644

* fixed compile error after backporting
2024-12-05 20:52:13 +11:00
Panagiotis Bailis
d4da8ea1a9
[8.17] Fix for propagating filters from compound to inner retrievers (#117914) (#118045)
* Fix for propagating filters from compound to inner retrievers (#117914)

* Update RRFRetrieverBuilderIT.java
2024-12-05 19:43:50 +11:00
elasticsearchmachine
991bb104ff Bump versions after 7.17.26 release 2024-12-04 16:06:22 +00:00
Felix Barnsteiner
f4df8741c1
Fix false positive date detection with trailing dot (#116953) (#117959) 2024-12-04 19:10:06 +11:00
Kostas Krikellas
417f3bb768
Parse the contents of dynamic objects for [subobjects:false] (#117762) (#117921)
* Parse the contents of dynamic objects for [subobjects:false]

* Update docs/changelog/117762.yaml

* add tests

* tests

* test dynamic field

* test dynamic field

* fix tests

(cherry picked from commit f2addbc69a)

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/mapper/MapperFeatures.java
2024-12-04 06:27:21 +11:00
Luca Cavanna
f246c80d41
[8.17] Don't skip shards in coord rewrite if timestamp is an alias (#117271) (#117855)
* Don't skip shards in coord rewrite if timestamp is an alias (#117271)

The coordinator rewrite has logic to skip indices if the provided date range
filter is not within the min and max range of all of its shards. This mechanism
is enabled for event.ingested and @timestamp fields, against searchable snapshots.

We have basic checks that such fields need to be of date field type, yet if they
are defined as alias of a date field, their range will be empty, which indicates
that the shards are empty, and the coord rewrite logic resolves the alias and
ends up skipping shards that may have matching docs.

This commit adds an explicit check that declares the range UNKNOWN instead of EMPTY
in these circumstances. The same check is also performed in the coord rewrite logic,
so that shards are no longer skipped by mistake.

* fix compile
2024-12-03 22:59:56 +11:00
Armin Braun
205675dbcd
Fix race in AbstractSearchAsyncAction request throttling (#116264) (#117638)
We had a race here where the non-blocking pending execution
would be starved of executing threads.
This happened when all the current holders of permits from the semaphore
would release their permit after a producer thread failed to acquire a
permit and then enqueued its task.
=> need to peek the queue again after releasing the permit and try to
acquire a new permit if there's work left to be done to avoid this
scenario.
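
A minimal sketch of the "peek again after releasing" pattern, under the assumption that tasks run synchronously on the thread that acquires a permit. `ThrottledExecutorSketch` is a hypothetical class and not the AbstractSearchAsyncAction code, which executes shard requests asynchronously.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Hypothetical illustration; not the code changed by this commit.
final class ThrottledExecutorSketch {
    private final Semaphore permits;
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();

    ThrottledExecutorSketch(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    // Always enqueue first, then try to run queued work without blocking.
    void submit(Runnable task) {
        pending.add(task);
        drain();
    }

    private void drain() {
        // Every release is followed by another peek at the queue, so a task that
        // was enqueued by a producer that failed to get a permit is picked up here.
        while (pending.peek() != null && permits.tryAcquire()) {
            Runnable next = pending.poll();
            try {
                if (next != null) { // another thread may have taken the task
                    next.run();
                }
            } finally {
                permits.release();
            }
        }
    }
}
```

If the loop stopped after a single task instead of re-checking the queue, a task enqueued while all permits were held could sit in the queue with nobody left to run it.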
2024-12-02 22:52:49 +11:00
Martijn van Groningen
7ed32c29a6
[8.17] Add source mode stats to MappingStats (#117697)
* Add source mode stats to MappingStats (#117463)

* update bwc logic for 8.17
2024-11-28 22:37:46 +11:00
Brian Seeders
23f424a920
Bump versions after 8.15.5 release 2024-11-27 13:51:04 -05:00
Nhat Nguyen
457f18eb76
Emit deprecation warnings only for new index or template (#117529) (#117651)
Currently, we emit a deprecation warning in the parser of the source 
field when source mode is used in mappings. However, this behavior 
causes warnings to be emitted for every mapping update. In tests with
assertions enabled, warnings are also triggered for every change to
index metadata. As a result, deprecation warnings are inadvertently
emitted for index or update requests.

This change relocates the deprecation check to the mapper, limiting it 
to cases where a new index is created or a template is created/updated.

Relates to #11752
2024-11-27 09:53:45 -08:00
Martijn van Groningen
52318d4beb
[8.17] Add has_custom_cutoff_date to logsdb usage. (#117550) (#117614)
Indicates whether es.mapping.synthetic_source_fallback_to_stored_source.cutoff_date_restricted_override system property has been configured.

A follow up from #116647
2024-11-28 00:45:13 +11:00
Nhat Nguyen
952df62bc2
Deprecate source mode in mappings (#117177) (#117527)
Backport of #116689 to 8.17

This change deprecates _source.mode in mappings, replacing it with
the index.mapping.source.mode index setting.
2024-11-25 20:26:52 -08:00
Benjamin Trent
0bc9af1202
Correct bit * byte and bit * float script comparisons (#117404) (#117508)
I goofed on the bit * byte and bit * float comparisons. Naturally, these
should be big-endian and compare the dimensions with the binary ones
appropriately.

Additionally, I added a test to ensure that this is handled correctly.
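
For illustration, a bit * byte dot product with big-endian bit order could look like the sketch below. It assumes that the most significant bit of each packed byte corresponds to the lowest dimension index; `BitByteDotProductSketch` is a hypothetical name and this is not the code changed by this commit.

```java
// Hypothetical illustration; not the code changed by this commit.
final class BitByteDotProductSketch {

    // bits:  packed bit vector, 8 dimensions per byte, most significant bit first.
    // query: one byte per dimension.
    static int dotProduct(byte[] bits, byte[] query) {
        if (query.length != bits.length * 8) {
            throw new IllegalArgumentException("query must have 8 dimensions per packed byte");
        }
        int sum = 0;
        for (int i = 0; i < query.length; i++) {
            // Big-endian: dimension i lives at bit position 7 - (i % 8) of byte i / 8.
            int bit = (bits[i / 8] >> (7 - (i % 8))) & 1;
            sum += bit * query[i];
        }
        return sum;
    }
}
```

For example, the packed byte `0b1000_0001` selects the first and last of eight query dimensions.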

(cherry picked from commit 374c88a832)
2024-11-26 07:42:23 +11:00
Rene Groeschke
20a78a18e9
[8.17] [Gradle] Remove static use of BuildParams (#115122) (#117433)
* [Gradle] Remove static use of BuildParams (#115122)

Static fields don't do well in Gradle with configuration cache enabled.

- Use buildParams extension in build scripts
- Keep BuildParams.ci for now for easy serverless migration
- Tweak testing doc

(cherry picked from commit 13c8aaeffa)

# Conflicts:
#	TESTING.asciidoc
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/RestTestBasePlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/compat/compat/AbstractYamlRestCompatTestPlugin.java
#	build.gradle
#	modules/ingest-geoip/qa/full-cluster-restart/build.gradle
#	qa/mixed-cluster/build.gradle
#	x-pack/plugin/ent-search/qa/full-cluster-restart/build.gradle
#	x-pack/plugin/eql/qa/rest/build.gradle
#	x-pack/plugin/fleet/qa/rest/build.gradle
#	x-pack/plugin/kql/build.gradle
#	x-pack/plugin/mapper-unsigned-long/build.gradle
#	x-pack/plugin/ml/qa/multi-cluster-tests-with-security/build.gradle
#	x-pack/plugin/security/qa/multi-cluster/build.gradle
#	x-pack/plugin/sql/qa/jdbc/build.gradle
#	x-pack/plugin/transform/qa/multi-cluster-tests-with-security/build.gradle

* Some cleanup

* Update build.gradle

fix buildparams access
2024-11-25 18:29:26 +01:00
Benjamin Trent
606b1f51e2
Improve halfbyte transposition performance, marginally improving bbq performance (#117350) (#117383)
The transposition of the bits in half-byte queries for BBQ is pretty
convoluted and slow. This commit greatly simplifies & improves
performance for this small part of bbq queries and indexing.

Here are the results of a small JMH benchmark for this particular
function.

```
TransposeBinBenchmark.transposeBinNew     1024  thrpt    5  857.779 ± 44.031  ops/ms
TransposeBinBenchmark.transposeBinOrig    1024  thrpt    5   94.950 ±  2.898  ops/ms
```

While this is a huge improvement for this small function, the impact at
query and index time is only marginal. But, the code simplification
itself is enough to warrant this change in my opinion.
2024-11-26 01:45:20 +11:00
Oleksandr Kolomiiets
711eb919ed
Fix constant_keyword test run and properly test recent behavior change (#117284) (#117370) 2024-11-23 06:27:14 +11:00
Brian Seeders
03abef2bd0
Bump versions after 8.16.1 release 2024-11-21 16:17:30 -05:00
Stanislav Malyshev
c012a75c36
Fix long metric deserialize & add - auto-resize needs to be set manually (#117105) (#117170)
* Fix long metric deserialize & add - auto-resize needs to be set manually
2024-11-21 04:19:48 +11:00
Joe Gallo
bb33952367
Optimize PipelineConfiguration-checking ClusterStateListeners (#117038) (#117098) 2024-11-21 02:45:42 +11:00
Jan Kuipers
5e6303c25f
Propagate scoring function through random sampler (#116957) (#117162)
* Propagate scoring function through random sampler.

* Update docs/changelog/116957.yaml

* Correct score mode in random sampler weight

* Fix random sampling with scores and p=1.0

* Unit test with scores

* YAML test

* Add capability
2024-11-21 02:40:18 +11:00
Martijn van Groningen
1bc60acdef
Revert "Deprecate _source.mode in mappings (#117106)" (#117151)
This reverts #117106. Bwc tests fail, because older nodes are killed with the following error:

```
[2024-11-20T10:54:58,600][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [v8.17.0-0] fatal error in thread [elasticsearch[v8.17.0-0
][clusterApplierService#updateTask][T#1]], exiting java.lang.AssertionError: provided source [{"_doc":{"_data_stream_timestamp":{"enabled":true},"_source":{},"properties":{"@timestamp":{"type":"date"},"k8s":{"properties":{"pod":{"properties":{"ip":{"type":"ip"},"name":{"type":"keyword"},"network":{"properties":{"rx":{"type":"long"},"tx":{"type":"long"}}},"uid":{"type":"keyword","time_series_dimension":true}}}}},"metricset":{"type":"keyword","time_series_dimension":true}}}}] differs from mapping [{"_doc":{"_data_stream_timestamp":{"enabled":true},"_source":{"mode":"synthetic"},"properties":{"@timestamp":{"type":"date"},"k8s":{"properties":{"pod":{"properties":{"ip":{"type":"ip"},"name":{"type":"keyword"},"network":{"properties":{"rx":{"type":"long"},"tx":{"type":"long"}}},"uid":{"type":"keyword","time_series_dimension":true}}}}},"metricset":{"type":"keyword","time_series_dimension":true}}}}]
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.index.mapper.DocumentMapper.<init>(DocumentMapper.java:66)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.newDocumentMapper(MapperService.java:588)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.index.mapper.MapperService.updateMapping(MapperService.java:346)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.index.IndexService.updateMapping(IndexService.java:840)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndicesAndUpdateShards(IndicesClusterStateService.java:583)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.indices.cluster.IndicesClusterStateService.doApplyClusterState(IndicesClusterStateService.java:306)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:260)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:544)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:530)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:503)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:157)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:956)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:218)
        at org.elasticsearch.server@9.0.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:184)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
        at java.base/java.lang.Thread.run(Thread.java:1575)
```

The `mode` parameter no longer gets serialized for new indices. However, the older nodes still serialize the `mode` parameter, which caused the mentioned assertion to fail. Reverting for now to see how best to address this bwc serialization issue.

We can only stop serializing mode when all nodes are on the same version. Unfortunately we can't invoke `c.clusterTransportVersion().get()` from parser or builder, because that calling thread isn't allowed to call `clusterService.state()`.
2024-11-20 13:39:07 +01:00
Craig Taverner
d03917db45
[8.x] Added stricter range type checks and runtime warnings for ENRICH (#115091) (#117130)
* Added stricter range type checks and runtime warnings for ENRICH (#115091)

It has been noted that strange or incorrect error messages are returned if the ENRICH command uses incompatible data types, for example a KEYWORD with value 'foo' used in an int_range match: https://github.com/elastic/elasticsearch/issues/107357

This error is thrown at runtime and contradicts the ES|QL policy of only throwing errors at planning time, while at runtime we should instead set results to null and add a warning. However, we could make the planner stricter and block potentially mismatching types earlier.

However, runtime parsing of KEYWORD fields has been a feature of ES|QL ENRICH since its inception; in particular, we even have tests asserting that KEYWORD fields containing parsable IP data can be joined to an ip_range ENRICH index.

In order to not create a backwards compatibility problem, we have compromised with the following:

* Strict range type checking at the planner time for incompatible range types, unless the incoming index field is KEYWORD
* For KEYWORD fields, allow runtime parsing of the fields, but when parsing fails, set the result to null and add a warning

Added extra tests to verify behaviour of match policies on non-keyword fields. They all behave as keywords (the enrich field is converted to keyword at policy execution time, and the input data is converted to keyword at lookup time).

* Fix compile error likely due to mismatched ordering of backports
2024-11-20 22:57:56 +11:00
Iraklis Psaroudakis
432e343767
Fast refresh indices to use search shards (#117111)
Backport of PR #116658.

Changes of this PR are ineffective for stateful, and backport is not
used in serverless. This is mostly to adopt the new transport version
in stateful to keep them consecutive.

Relates ES-9573
2024-11-20 18:33:39 +11:00
Nhat Nguyen
e63367eaec
Deprecate _source.mode in mappings (#116689) (#117106)
This change deprecates _source.mode in mappings, replacing it with the
index.mapping.source.mode index setting.
2024-11-20 17:51:22 +11:00
Yang Wang
d575db0cdb
[8.x] Skip eager reconciliation for empty routing table (#116903) (#117103)
* Skip eager reconciliation for empty routing table (#116903)

No need to start the eager reconciliation when the routing table is
empty. An empty routing table means either the cluster has no shards or
the state has not recovered. The eager reconciliation is not necessary
in either case.

Resolves: #115885

* fix compilation
2024-11-20 13:06:38 +11:00
Ignacio Vera
8a1231c248
Use LongArray instead of long[] for owning ordinals when building Internal aggregations (#116874) (#117015)
This commit changes the signature of InternalAggregation#buildAggregations(long[]) to
InternalAggregation#buildAggregations(LongArray) to avoid allocations of humongous arrays.
2024-11-19 23:36:04 +11:00
Pete Gillin
f96686e7e6
Use assertThrows in ConfigurationUtilsTests (#116971) (#117004)
This was trying to assert that the code under test threw an exception
using the 'try-act-fail-catch-assert' pattern, only the 'fail' step
was missing, meaning that the tests would have incorrectly passed if
the method didn't throw.

This switches it to using `assertThrows`, which is less easy to get
wrong.
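
The difference between the two patterns can be sketched without a test framework. In the example below, `expectThrows` is a hypothetical stand-in for a framework's `assertThrows`, and `AssertThrowsSketch` is not code from ConfigurationUtilsTests.

```java
// Hypothetical illustration; not the code changed by this commit.
final class AssertThrowsSketch {

    interface ThrowingRunnable {
        void run() throws Exception;
    }

    // Stand-in for assertThrows: fails if nothing is thrown or the type is wrong.
    static <T extends Throwable> T expectThrows(Class<T> expected, ThrowingRunnable action) {
        try {
            action.run();
        } catch (Throwable t) {
            if (expected.isInstance(t)) {
                return expected.cast(t);
            }
            throw new AssertionError("unexpected exception type: " + t.getClass().getName(), t);
        }
        throw new AssertionError("expected " + expected.getSimpleName() + " but nothing was thrown");
    }

    // 'try-act-fail-catch-assert' with the 'fail' step missing:
    // this reports success even when the action does not throw.
    static boolean flawedTestPasses(ThrowingRunnable action) {
        try {
            action.run();
            // missing: fail("expected an exception")
        } catch (Exception e) {
            // assertions on e would go here
        }
        return true;
    }

    // The same check written with expectThrows.
    static boolean strictTestPasses(ThrowingRunnable action) {
        try {
            expectThrows(IllegalArgumentException.class, action);
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }
}
```

With an action that throws nothing, `flawedTestPasses` still returns true while `strictTestPasses` returns false.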
2024-11-19 21:30:02 +11:00
David Turner
d8e8b6dd2d
Split searchable snapshot into multiple repo operations (#116986)
Each operation on a snapshot repository uses the same `Repository`,
`BlobStore`, etc. instances throughout, in order to avoid the complexity
arising from handling metadata updates that occur while an operation is
running. Today we model the entire lifetime of a searchable snapshot
shard as a single repository operation since there should be no metadata
updates that matter in this context (other than those that are handled
dynamically via other mechanisms) and some metadata updates might be
positively harmful to a searchable snapshot shard.

It turns out that there are some undocumented legacy settings which _do_
matter to searchable snapshots, and which are still in use, so with this
commit we move to a finer-grained model of repository operations within
a searchable snapshot.

Backport of #116918 to 8.x
2024-11-19 08:58:20 +00:00
Benjamin Trent
ee11a00eba
[8.x] Fixing MultiDenseVectorScriptDocValuesTests tests (#116940) (#116976)
* Fixing MultiDenseVectorScriptDocValuesTests tests (#116940)

This fixes two test issues:

 - 1. Now the tests skip if the multi_dense_vector feature isn't enabled
 - 2. fixes silly bwc testing where we were testing for big-endian floats, which aren't possible.

closes: https://github.com/elastic/elasticsearch/issues/116862
closes: https://github.com/elastic/elasticsearch/issues/116863
(cherry picked from commit 82c02de914)

* fixing backport
2024-11-19 06:46:16 +11:00
David Turner
af875218e6
Backport transport constants related to #116339 (#116965)
This change was reverted with a new transport protocol in `main` so we
must backport the new protocol versions to `8.x`.
2024-11-18 18:15:10 +01:00
David Turner
de0941d462
Improve message about insecure S3 settings (#116954)
Clarifies that insecure settings are stored in plaintext and must not be
used. Also removes the mention of the (wrong) system property from the
error message if insecure settings are not permitted.

Backport of #116915 to `8.x`
2024-11-19 03:17:52 +11:00
Luca Cavanna
e4f4c95442
Fix handling of time exceeded exception in fetch phase (#116676)
The fetch phase is subject to timeouts like any other search phase. Timeouts
may happen when low level cancellation is enabled (true by default), hence the
directory reader is wrapped into ExitableDirectoryReader and a timeout is
provided to the search request.

The exception that is used is TimeExceededException, but it is an internal
exception that should never be returned to the user. When that is thrown, we
need to catch it and throw an error or mark the response as timed out, depending
on whether partial results are allowed or not.
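
A rough sketch of the handling described above. All types here are hypothetical stand-ins defined inside the example, including the nested `TimeExceededException`; the real fetch phase code and the exception type it rethrows may differ.

```java
// Hypothetical illustration; not the code changed by this commit.
final class FetchTimeoutSketch {

    // Stand-in for the internal exception that must not reach the user.
    static final class TimeExceededException extends RuntimeException {}

    static final class FetchResult {
        boolean timedOut;
    }

    static FetchResult fetch(Runnable fetchWork, boolean allowPartialResults) {
        FetchResult result = new FetchResult();
        try {
            fetchWork.run();
        } catch (TimeExceededException e) {
            if (!allowPartialResults) {
                // Partial results are not allowed: surface a user-facing error instead.
                throw new IllegalStateException("Time exceeded", e);
            }
            // Partial results are allowed: keep what we have and flag the timeout.
            result.timedOut = true;
        }
        return result;
    }
}
```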
2024-11-18 15:08:31 +01:00
Salvatore Campagna
f0740f3550
Re-structure document ID generation favoring _id inverted index compression (#104683) (#116810)
This implementation restructures auto-generated document IDs to maximize compression within Lucene's terms dictionary. The key insight is placing stable or slowly-changing components at the start of the ID - the most significant bytes of the timestamp change very gradually (the first byte shifts only every 35 years, the second every 50 days). This careful ordering means that large sequences of IDs generated close in time will share common prefixes, allowing Lucene's Finite State Transducer (FST) to store terms more compactly.

To maintain uniqueness while preserving these compression benefits, the ID combines three elements: a timestamp that ensures time-based ordering, the coordinator's MAC address for cluster-wide uniqueness, and a sequence number for handling high-throughput scenarios. The timestamp handling is particularly robust, using atomic operations to prevent backwards movement even if the system clock shifts.

For high-volume indices generating millions of documents, this optimization can lead to substantial storage savings while maintaining strict guarantees about ID uniqueness and ordering.
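
The intervals quoted above can be checked with a little arithmetic, assuming a big-endian millisecond timestamp that is 6 bytes (48 bits) wide, which is the width those figures imply. `TimestampPrefixSketch` is a hypothetical helper and not part of this commit.

```java
// Hypothetical illustration; not the code changed by this commit.
final class TimestampPrefixSketch {

    // Milliseconds between changes of the byte at byteIndex (0 = most significant)
    // in a big-endian timestamp that is totalBytes wide.
    static long millisPerByteChange(int byteIndex, int totalBytes) {
        return 1L << (8 * (totalBytes - 1 - byteIndex));
    }

    static double toDays(long millis) {
        return millis / 86_400_000.0;
    }

    static double toYears(long millis) {
        return toDays(millis) / 365.25;
    }
}
```

With `totalBytes = 6`, the first byte changes every 2^40 ms (roughly 34.8 years) and the second every 2^32 ms (roughly 49.7 days), so IDs generated close together in time share their leading bytes.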
2024-11-18 11:03:58 +01:00
Aurélien FOUCRET
61c9add011
[8.x] KQL query nested field support (#116467) (#116910)
* KQL query nested field support (#116467)

* Fix compile error in 8.x branch
2024-11-18 20:37:59 +11:00