14789 commits

Author SHA1 Message Date
Michael Peterson
87211f249c
Resolve/cluster should mark remotes as not connected when a security exception is thrown (#119793) (#119866)
Fixes two bugs in _resolve/cluster.

First, the code that detects older cluster versions and falls back to the _resolve/index
endpoint was using an outdated string match for error detection. That has been adjusted.

Second, upon security exceptions, the _resolve/cluster endpoint was marking the clusters as connected: true,
under the assumption that all security exceptions related to cross cluster calls and remote index access were
coming from the remote cluster, but that is not always the case. Some cross-cluster security violations can
be detected on the local querying cluster after issuing the remoteClient.execute call but before the transport
layer actually sends the request remotely. So we now mark the connected status as false for all ElasticsearchSecurityException cases. End user docs have been updated with this information.
2025-01-10 01:57:36 +11:00
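The second fix in commit 87211f249c can be pictured with a minimal sketch. The class and exception names below are hypothetical stand-ins (they are not the Elasticsearch types), and the rule that a "remote-side" error still counts as connected is an assumption made only for illustration. The point shown is that a security exception no longer implies that the remote was reached.

```java
final class ResolveClusterSketch {
    /** Hypothetical stand-in for ElasticsearchSecurityException. */
    static final class SecurityViolation extends RuntimeException {
        SecurityViolation(String message) {
            super(message);
        }
    }

    /** Hypothetical marker for an error that was demonstrably returned by the remote cluster. */
    static final class RemoteSideError extends RuntimeException {
        RemoteSideError(String message) {
            super(message);
        }
    }

    /** Decides the "connected" flag to report for a remote after a failed call. */
    static boolean connectedAfterFailure(Exception failure) {
        if (failure instanceof SecurityViolation) {
            // The violation may have been detected locally, before anything was sent,
            // so it proves nothing about connectivity.
            return false;
        }
        // Assumption: only errors known to originate on the remote prove a round trip.
        return failure instanceof RemoteSideError;
    }
}
```

With this sketch, `connectedAfterFailure(new ResolveClusterSketch.SecurityViolation("denied"))` reports `false`, which matches the behaviour the commit message describes.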
Ignacio Vera
ff58e0cb51
Construct list manually in AggregatorsReducer#get (#119565) (#119568) 2025-01-06 00:04:44 +11:00
Pawan Kartik
347527a1c1
fix: do not let _resolve/cluster hang if remote is unresponsive (#119516) (#119526)
* fix: do not let `_resolve/cluster` hang if remote is unresponsive

Previously, `_resolve/cluster` would wait for a response from a remote
as part of the connection strategy. If the remote were to be
unresponsive, this API would wait until `netty` would terminate the
connection with a handshake exception. The threshold for terminating the
connection is `10s`. This means that the API would wait for `10s` before
determining that the remote is unresponsive. This strategy is now
replaced with a fail-fast approach, where a response is sent back to the user
immediately rather than waiting for a connection termination.

* Update docs/changelog/119516.yaml
2025-01-04 04:36:24 +11:00
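The commit message for 347527a1c1 does not say how the fail-fast check is implemented. As a rough illustration only, the difference between waiting for a handshake and answering immediately can be sketched with a plain `CompletableFuture`. The class and method names are hypothetical, and the real change may work quite differently.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;

final class FailFastSketch {
    /**
     * Reports whether the remote is known to be connected right now.
     * Never blocks: a handshake that is still pending counts as "not connected".
     */
    static boolean connectedNow(CompletableFuture<Boolean> handshake) {
        try {
            // getNow returns the fallback value immediately if the future is incomplete.
            return Boolean.TRUE.equals(handshake.getNow(Boolean.FALSE));
        } catch (CompletionException | CancellationException e) {
            // The handshake already failed or was cancelled.
            return false;
        }
    }
}
```

The blocking alternative would be a timed `get` on the same future, which is the kind of wait (up to `10s` in the commit message) that the change removes.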
Rene Groeschke
4d17b2193a
Update Gradle wrapper to 8.12 (#118683) (#119357)
This updates the gradle wrapper to 8.12

We addressed deprecation warnings caused by the update, including:

- Fix change in TestOutputEvent api
- Fix deprecation in groovy syntax
- Use latest ospackage plugin containing our fix
- Remove project usages at execution time
- Fix deprecated project references in repository-old-versions

(cherry picked from commit ba61f8c7f7)

# Conflicts:
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/distribution/DockerCloudElasticsearchDistributionType.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/distribution/DockerUbiElasticsearchDistributionType.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/Fixture.java
#	plugins/repository-hdfs/hadoop-client-api/build.gradle
#	server/src/main/java/org/elasticsearch/inference/ChunkingOptions.java
#	x-pack/plugin/kql/build.gradle
#	x-pack/plugin/migrate/build.gradle
#	x-pack/plugin/security/qa/security-basic/build.gradle
2024-12-31 08:37:28 +01:00
David Turner
faaede77eb
Fix MasterServiceTests#testThreadContext (#118926) (#119306)
This test would fail to see the expected response headers if the task
timed out before it started executing, which could happen very rarely.
It's also not a very good test because it never actually executed any of
the paths involving acking.

This commit fixes the rare failure and tightens up the assertions to
verify that it does indeed see the right thread context while handling
the end of the acking process, and indeed that it always completes the
acking process.

Closes #118914
2024-12-27 21:21:04 +11:00
Brian Seeders
b13bcbc879
Bump versions after 8.16.2 release 2024-12-17 13:34:31 -05:00
Mark Vieira
c220869824
Remove version 8.15.6 2024-12-12 11:28:16 -08:00
Luca Cavanna
fd937ebf77
[8.x] Handle all exceptions in data nodes can match (#117469) (#118533) (#118570)
* Handle all exceptions in data nodes can match (#117469)

During the can match phase, prior to the query phase, we may have exceptions
that are returned back to the coordinating node, handled gracefully as if the
shard returned canMatch=true.

During the query phase, we perform an additional rewrite and can match phase
to eventually shortcut the query phase for the shard. That needs to handle
exceptions as well. Currently, an exception there causes shard failures, while
we should rather go ahead and execute the query on the shard.

Instead of adding another try/catch in consumer code, this commit adds exception handling to the method itself, so that it can no longer throw exceptions and similar mistakes cannot be made in the future.

At the same time, this commit makes the can match method more easily testable without requiring a full-blown SearchService instance.

Closes #104994

* fix compile
2024-12-13 02:18:41 +11:00
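The approach described in commit fd937ebf77 (handle the exception inside the can match method and treat it as `canMatch=true`) can be sketched as follows. This is a simplified illustration with hypothetical names, not the actual `SearchService` code.

```java
import java.util.function.BooleanSupplier;

final class CanMatchSketch {
    /**
     * Runs the rewrite-and-check step and never throws.
     * Any failure is treated as "the shard may match", so the query still runs on it.
     */
    static boolean canMatch(BooleanSupplier rewriteAndCheck) {
        try {
            return rewriteAndCheck.getAsBoolean();
        } catch (RuntimeException e) {
            // Falling back to true costs an unnecessary query at worst,
            // whereas propagating the exception would fail the shard.
            return true;
        }
    }
}
```

Because the fallback lives in the method itself, callers do not need their own try/catch.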
Mike Pellegrini
dda00b7176
Restore original "is within leaf" value in SparseVectorFieldMapper (#118380) (#118456) 2024-12-12 01:23:44 +11:00
Joe Gallo
78fb1ec970
Fix log message format bugs (#118354) (#118387) 2024-12-11 08:59:16 +11:00
Luca Cavanna
226a8ceb8e
[8.16] Don't skip shards in coord rewrite if timestamp is an alias (#117271) (#117854)
* Don't skip shards in coord rewrite if timestamp is an alias (#117271)

The coordinator rewrite has logic to skip indices if the provided date range
filter is not within the min and max range of all of its shards. This mechanism
is enabled for event.ingested and @timestamp fields, against searchable snapshots.

We have basic checks that such fields need to be of date field type, yet if they
are defined as alias of a date field, their range will be empty, which indicates
that the shards are empty, and the coord rewrite logic resolves the alias and
ends up skipping shards that may have matching docs.

This commit adds an explicit check that declares the range UNKNOWN instead of EMPTY
in these circumstances. The same check is also performed in the coord rewrite logic,
so that shards are no longer skipped by mistake.

* fix compile
2024-12-10 07:07:07 +11:00
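The distinction that commit 226a8ceb8e relies on, an UNKNOWN range versus an EMPTY one, can be illustrated with a small sketch. The enum and method names are hypothetical and the logic is reduced to the two cases the message discusses.

```java
final class TimestampRangeSketch {
    enum RangeKind { EMPTY, UNKNOWN, KNOWN }

    /** Classifies the timestamp range that can be derived for a field. */
    static RangeKind rangeKind(boolean fieldIsAlias, boolean shardsHaveValues) {
        if (fieldIsAlias) {
            // No range is tracked for an alias, so nothing can be concluded from it.
            return RangeKind.UNKNOWN;
        }
        return shardsHaveValues ? RangeKind.KNOWN : RangeKind.EMPTY;
    }

    /** Decides whether the coordinator may skip the shards for a date range filter. */
    static boolean canSkip(RangeKind kind, boolean queryOverlapsRange) {
        switch (kind) {
            case EMPTY:
                return true;   // no documents at all, safe to skip
            case UNKNOWN:
                return false;  // cannot tell, so the shards must be searched
            default:
                return !queryOverlapsRange;
        }
    }
}
```

In this sketch an alias field yields `UNKNOWN`, so `canSkip` returns `false` and shards with matching documents are still searched.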
Panagiotis Bailis
e52a6f2010
[8.16] Fix for propagating filters from compound to inner retrievers (#117914) (#118047)
* Fix for propagating filters from compound to inner retrievers

* fix for lucene 9

* Update CompoundRetrieverBuilder.java

* Update CompoundRetrieverBuilder.java

* Update CompoundRetrieverBuilder.java

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-12-06 21:50:41 +11:00
Mark Vieira
82ebd0aeb5
Update BWC version logic to support multiple bugfix versions (#117943) (#118116) (#118119) 2024-12-05 16:15:12 -08:00
elasticsearchmachine
b560f290a6
Bump versions after 7.17.26 release 2024-12-04 16:06:22 +00:00
Brian Seeders
953d5830a9
Bump versions after 8.15.5 release 2024-11-27 13:51:04 -05:00
Rene Groeschke
581b9ab7c0
[8.16] [Gradle] Remove static use of BuildParams (#115122) (#117434)
* [Gradle] Remove static use of BuildParams (#115122)

Static fields don't do well in Gradle with configuration cache enabled.

- Use buildParams extension in build scripts
- Keep BuildParams.ci for now for easy serverless migration
- Tweak testing doc

(cherry picked from commit 13c8aaeffa)

# Conflicts:
#	TESTING.asciidoc
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/InternalDistributionBwcSetupPlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/RestTestBasePlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/compat/compat/AbstractYamlRestCompatTestPlugin.java
#	build.gradle
#	modules/ingest-geoip/qa/full-cluster-restart/build.gradle
#	qa/mixed-cluster/build.gradle
#	x-pack/plugin/ent-search/qa/full-cluster-restart/build.gradle
#	x-pack/plugin/eql/qa/rest/build.gradle
#	x-pack/plugin/fleet/qa/rest/build.gradle
#	x-pack/plugin/kql/build.gradle
#	x-pack/plugin/mapper-unsigned-long/build.gradle
#	x-pack/plugin/ml/qa/multi-cluster-tests-with-security/build.gradle
#	x-pack/plugin/security/qa/multi-cluster/build.gradle
#	x-pack/plugin/sql/qa/jdbc/build.gradle
#	x-pack/plugin/transform/qa/multi-cluster-tests-with-security/build.gradle

* Fix merge

* [Build] Fix fips testing after buildparams rework (#116934)

* More Cleanup

* [Build] Fix checkstyle exclusions on windows (#115185)

* More merge fixes

* Delete x-pack/plugin/kql/build.gradle
2024-11-27 12:34:32 +01:00
Nhat Nguyen
f0469a406e
Prohibit changes to index mode, source, and sort settings during resize (#115812) (#115971) (#117445)
Relates to #115811, but applies to resize requests.

The index.mode, source.mode, and index.sort.* settings cannot be
modified during resize, as this may lead to data corruption or issues
retrieving _source. This change enforces a restriction on modifying
these settings during resize. While a fine-grained check could allow
equivalent settings, it seems simpler and safer to reject resize
requests if any of these settings are specified.
2024-11-25 16:40:33 +11:00
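The coarse-grained check described in commit f0469a406e (reject the request if any of the listed settings is specified at all) could look roughly like the sketch below. The setting keys are copied from the commit message and may not be the exact keys used in the code. The class name and error text are hypothetical.

```java
import java.util.Map;

final class ResizeSettingsSketch {
    /** Rejects a resize request that specifies any setting which must not change. */
    static void validate(Map<String, String> requestedSettings) {
        for (String key : requestedSettings.keySet()) {
            boolean prohibited = key.equals("index.mode")
                || key.equals("source.mode")
                || key.startsWith("index.sort.");
            if (prohibited) {
                // No attempt is made to check whether the value equals the existing one.
                throw new IllegalArgumentException("can't change setting [" + key + "] during resize");
            }
        }
    }
}
```

Rejecting on presence rather than comparing values is the "simpler and safer" trade-off the message mentions.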
Brian Seeders
89d0acb899
Bump versions after 8.16.1 release 2024-11-21 16:17:30 -05:00
Lorenzo Dematté
b4597a250f
[8.16] Add a cluster listener to fix missing system index mappings after upgrade (#115771) (#116646)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-11-21 23:27:01 +11:00
Stanislav Malyshev
626d100f6b
Fix long metric deserialize & add - auto-resize needs to be set manually (#117105) (#117171)
* Fix long metric deserialize & add - auto-resize needs to be set manually
2024-11-21 04:24:57 +11:00
Jan Kuipers
6f3d15296a
Propagate scoring function through random sampler (#116957) (#117165)
* Propagate scoring function through random sampler.

* Update docs/changelog/116957.yaml

* Correct score mode in random sampler weight

* Fix random sampling with scores and p=1.0

* Unit test with scores

* YAML test

* Add capability
2024-11-21 03:03:55 +11:00
David Turner
f663886e25
Split searchable snapshot into multiple repo operations (#116987)
* Split searchable snapshot into multiple repo operations

Each operation on a snapshot repository uses the same `Repository`,
`BlobStore`, etc. instances throughout, in order to avoid the complexity
arising from handling metadata updates that occur while an operation is
running. Today we model the entire lifetime of a searchable snapshot
shard as a single repository operation since there should be no metadata
updates that matter in this context (other than those that are handled
dynamically via other mechanisms) and some metadata updates might be
positively harmful to a searchable snapshot shard.

It turns out that there are some undocumented legacy settings which _do_
matter to searchable snapshots, and which are still in use, so with this
commit we move to a finer-grained model of repository operations within
a searchable snapshot.

Backport of #116918 to 8.16

* Add end-to-end test for reloading S3 credentials

We don't seem to have a test that completely verifies that an S3
repository can reload credentials from an updated keystore. This commit
adds such a test.

Backport of #116762 to 8.16.
2024-11-19 20:59:19 +11:00
David Turner
a8e616f937
Improve message about insecure S3 settings (#116955)
Clarifies that insecure settings are stored in plaintext and must not be
used. Also removes the mention of the (wrong) system property from the
error message if insecure settings are not permitted.

Backport of #116915 to `8.16`
2024-11-18 18:41:18 +00:00
Luca Cavanna
267abe781d
Fix handling of time exceeded exception in fetch phase (#116676)
The fetch phase is subject to timeouts like any other search phase. Timeouts
may happen when low level cancellation is enabled (true by default), hence the
directory reader is wrapped into ExitableDirectoryReader and a timeout is
provided to the search request.

The exception that is used is TimeExceededException, but it is an internal
exception that should never be returned to the user. When it is thrown, we
need to catch it and either throw an error or mark the response as timed out,
depending on whether partial results are allowed.
2024-11-18 15:08:08 +01:00
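The handling described in commit 267abe781d can be sketched as below. Both exception classes are hypothetical stand-ins (one for the internal `TimeExceededException`, one for whatever user-facing error the real code throws), and the fetch phase is reduced to a `Runnable`.

```java
final class FetchTimeoutSketch {
    /** Hypothetical stand-in for the internal TimeExceededException. */
    static final class TimeExceeded extends RuntimeException {
    }

    /** Hypothetical user-facing error raised when partial results are not allowed. */
    static final class SearchTimeout extends RuntimeException {
        SearchTimeout(Throwable cause) {
            super("Time exceeded", cause);
        }
    }

    /**
     * Runs the fetch phase and returns whether the response must be marked as timed out.
     * The internal exception never escapes this method.
     */
    static boolean runFetch(Runnable fetchPhase, boolean allowPartialResults) {
        try {
            fetchPhase.run();
            return false;
        } catch (TimeExceeded e) {
            if (allowPartialResults) {
                return true; // caller sets timed_out on the response
            }
            throw new SearchTimeout(e);
        }
    }
}
```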
Mark J. Hoy
6486d4118b
backport #116357 to 8.16 (#116840) 2024-11-14 15:54:56 -05:00
Nikolaj Volgushev
b5b710a95a
Use retry logic and real file system in file settings ITs (#116392) (#116710)
Several file-settings ITs fail (rarely) with exceptions like:

```
java.nio.file.AccessDeniedException: C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempFile-001.tmp -> C:\Users\jenkins\workspace\platform-support\14\server\build\testrun\internalClusterTest\temp\org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT_5733F2A737542BE-001\tempDir-002\config\operator\settings.json
    at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89)
    at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103)
    at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:317)
    at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:293)
    at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144)
    at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144)
    at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144)
    at org.apache.lucene.tests.mockfile.FilterFileSystemProvider.move(FilterFileSystemProvider.java:144)
    at java.nio.file.Files.move(Files.java:1430)
    at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.writeJSONFile(SnaphotsAndFileSettingsIT.java:86)
    at org.elasticsearch.reservedstate.service.SnaphotsAndFileSettingsIT.testRestoreWithPersistedFileSettings(SnaphotsAndFileSettingsIT.java:321)
```

This happens in Windows file systems, due to a race condition where the
file settings service is reading the settings file concurrently with the
test trying to modify it (a no-go in Windows). It turns out we have
already addressed this with a retry for one test suite
(https://github.com/elastic/elasticsearch/pull/91863), plus addressed a
related issue around mock windows file-systems misbehaving
(https://github.com/elastic/elasticsearch/pull/92653).

This PR extends the above fixes to all file-settings related ITs.

(cherry picked from commit 91559da015)

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-11-14 21:34:57 +11:00
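The retry idea mentioned in commit b5b710a95a can be illustrated with a small helper around `Files.move`. This is a generic sketch, not the helper used in the Elasticsearch tests. The class name, the attempt count, the delay parameter, and the choice of `REPLACE_EXISTING` are all assumptions.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class RetryMoveSketch {
    /**
     * Moves source to target, retrying when the target is temporarily locked
     * (for example because another thread is reading it on Windows).
     */
    static void moveWithRetry(Path source, Path target, int maxAttempts, long delayMillis)
        throws IOException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                Files.move(source, target, StandardCopyOption.REPLACE_EXISTING);
                return;
            } catch (AccessDeniedException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and surface the last failure
                }
                Thread.sleep(delayMillis);
            }
        }
    }
}
```

Only `AccessDeniedException` is retried here, since that is the failure shown in the stack trace. Other I/O errors propagate immediately.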
Panagiotis Bailis
4d64d96a66
[8.16] Backporting propagating nested inner_hits to the parent compound retriever (#116718) 2024-11-13 16:48:35 +02:00
Felix Barnsteiner
9cb5795086
Ignore conflicting fields during dynamic mapping update (#114227) (#116679)
This fixes a bug when concurrently executing index requests that have different types for the same field.

(cherry picked from commit 9658940a51)
2024-11-13 04:53:36 +11:00
elasticsearchmachine
fc8b5fd35e
Bump versions after 8.16.0 release 2024-11-12 16:47:20 +00:00
elasticsearchmachine
14a3e436db
Bump versions after 8.15.4 release 2024-11-12 12:17:04 +00:00
Lorenzo Dematté
d2d1e05573
[8.16] Backporting full CompatibilityVersions to NodeInfo (#116576) 2024-11-11 15:17:45 +01:00
Andrei Dan
870e77aa63
[8.16] Validate missing shards after the coordinator rewrite (#116382) (#116490)
* Validate missing shards after the coordinator rewrite (#116382)

The coordinator rewrite can skip searching shards when the query filters
on `@timestamp`, `event.ingested`, or the `_tier` field.

We currently check for missing shards across all the indices that
the query is running against. However, some shards/indices might not
play a role in the query at all after the coordinator rewrite.

This moves the check for missing shards **after** we've run the
coordinator rewrite, so we validate only the shards that will be
searched by the query.

(cherry picked from commit cd2433d60c)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>

* imports

* Adapt unit test for 8.16 to use @timestamp rewrite

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2024-11-10 19:05:17 +11:00
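The reordering in commit 870e77aa63 amounts to validating availability only for shards that survive the coordinator rewrite. A minimal sketch, with hypothetical names and shard ids modelled as plain strings:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class MissingShardsSketch {
    /**
     * Returns the shards that have no active copy and that the query would actually search.
     * Shards skipped by the coordinator rewrite are ignored even if they are unavailable.
     */
    static Set<String> missingShards(Map<String, Boolean> shardHasActiveCopy, Set<String> skippedByRewrite) {
        Set<String> missing = new TreeSet<>();
        for (Map.Entry<String, Boolean> entry : shardHasActiveCopy.entrySet()) {
            boolean unavailable = !entry.getValue();
            if (unavailable && !skippedByRewrite.contains(entry.getKey())) {
                missing.add(entry.getKey());
            }
        }
        return missing;
    }
}
```

Running the check before the rewrite corresponds to passing an empty `skippedByRewrite` set, which is when an irrelevant unavailable shard would be reported.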
Nhat Nguyen
fca2f43756
Fallback to field-caps (#115977) (#116428)
This change falls back to the old field-caps action if the remote
cluster has not been updated to 8.16 or later.
2024-11-08 06:06:38 +11:00
Alexey Ivanov
84087bda71
[CI] JvmStatsTests testJvmStats failing (#116197) (#116427)
Fix test JvmStatsTests.testJvmStats (backport from main)
Fixes #116197
2024-11-08 04:40:11 +11:00
Nikolaj Volgushev
022be89607
[8.16] Fix race conditions in file settings service tests (#116309) (#116403)
# Backport

This will backport the following commits from `main` to `8.16`:
 - [Fix race conditions in file settings service tests (#116309)](https://github.com/elastic/elasticsearch/pull/116309)
2024-11-07 15:52:58 +01:00
Luca Cavanna
1414f98d54
[8.16] Fix testSearchConcurrencyDoesNotCreateMoreTasksThanThreads failure (#116269)
This test was somewhat difficult to write in the first place. We had to come up
with a threshold for the maximum number of tasks that are going to be created, but that is
not easy to calculate, as it depends on how quickly such tasks can be created
and executed.

We should rather have used a higher threshold to start with. The important part
is anyway that we create a total number of tasks that no longer depends on the
number of segments, given there are far fewer threads available to execute them.

Closes #116048
2024-11-05 19:54:22 +01:00
Kostas Krikellas
e80a641f36
[8.16] Track source for objects and fields with [synthetic_source_keep:arrays] in arrays as ignored (#116065) (#116226)
* Track source for objects and fields with [synthetic_source_keep:arrays] in arrays as ignored (#116065)

* Track source for objects and fields with [synthetic_source_keep:arrays] in arrays as ignored

* Update TransportResumeFollowActionTests.java

* rest compat fixes

* rest compat fixes

* update test

(cherry picked from commit 6cf45366d5)

# Conflicts:
#	rest-api-spec/build.gradle
#	rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/indices.create/21_synthetic_source_stored.yml
#	server/src/main/java/org/elasticsearch/index/mapper/DocumentParser.java
#	server/src/main/java/org/elasticsearch/index/mapper/DocumentParserContext.java
#	server/src/test/java/org/elasticsearch/index/mapper/IgnoredSourceFieldMapperTests.java

* Update DocumentParserContext.java

* fixes
2024-11-05 11:55:04 +02:00
Parker Timmins
ba80c4cabc
[8.16] Resolve pipelines from template if lazy rollover write (#116031) (#116131)
* Resolve pipelines from template if lazy rollover write  (#116031)

If datastream rollover on write flag is set in cluster state, resolve pipelines from templates rather than from metadata. This fixes the following bug: when a pipeline reroutes every document to another index, and rollover is called with lazy=true (setting the rollover on write flag), changes to the pipeline do not go into effect, because the lack of writes means the data stream never rolls over and pipelines in metadata are not updated. The fix is to resolve pipelines from templates if the lazy rollover flag is set. To improve efficiency we only resolve pipelines once per index in the bulk request, caching the value, and reusing for other requests to the same index.

Fixes: #112781

* Remute tests blocking merge

* Remute tests blocking merge
2024-11-03 04:12:06 +11:00
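The efficiency note in commit ba80c4cabc (resolve pipelines once per index in the bulk request and reuse the cached value) can be sketched with a per-request cache. The class is hypothetical and a pipeline is represented by its name only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class PipelineCacheSketch {
    private final Map<String, String> pipelineByIndex = new HashMap<>();
    private final Function<String, String> resolveFromTemplate;
    private int resolutions;

    /** The resolver stands in for the (comparatively expensive) template lookup. */
    PipelineCacheSketch(Function<String, String> resolveFromTemplate) {
        this.resolveFromTemplate = resolveFromTemplate;
    }

    /** Returns the pipeline for an index, resolving it at most once per cache instance. */
    String pipelineFor(String index) {
        return pipelineByIndex.computeIfAbsent(index, name -> {
            resolutions++;
            return resolveFromTemplate.apply(name);
        });
    }

    /** Number of times the resolver was actually invoked. */
    int resolutions() {
        return resolutions;
    }
}
```

One cache instance would be created per bulk request, so later pipeline changes are picked up by the next request.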
Ievgen Degtiarenko
b99189d6de
Prevent multiple sets copies while adding index aliases (#115934) (#116066) 2024-11-01 20:46:08 +11:00
Luca Cavanna
6a1a09de11
[8.x] Limit the number of tasks that a single search can submit (#115932) (#115981)
Since we removed the search workers thread pool with #111099, we execute many
more tasks in the search thread pool, given that each shard search request
parallelizes across slices or even segments (knn query rewrite). There are also
rare situations where segment level tasks may parallelize further
(e.g. createWeight), which causes the creation of very many tasks for a single
top-level request. These are rather small tasks that previously queued up in
the unbounded search workers queue. With recent improvements in Lucene,
these tasks queue up in the search queue, yet they get executed by the caller
thread while they are still in the queue, and remain in the queue as no-op
until they are pulled out of the queue. We have protection against rejections
based on turning off search concurrency when we have more than maxPoolSize
items in the queue, yet that is not enough if enough parallel requests see
an empty queue and manage to submit enough tasks to fill the queue at once.
That will cause rejections for top-level searches that should not be rejected.

This commit introduces wrapping for the executor to limit the number of tasks
that a single search instance can submit to the executor, to prevent the situation
where a single search submits way more tasks than threads available.

Co-authored-by: Adrien Grand <jpountz@gmail.com>
2024-10-31 07:03:30 +11:00
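The executor wrapping described in commit 6a1a09de11 can be sketched as follows. This is an illustration under stated assumptions, not the committed code. The class name is hypothetical, and running the overflow tasks on the caller thread is an assumption about what happens once the limit is reached.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicInteger;

final class LimitedExecutorSketch {
    /**
     * Wraps an executor so that one search has at most maxForkedTasks tasks
     * handed to the delegate at any time. Further tasks run on the caller thread.
     */
    static Executor limit(Executor delegate, int maxForkedTasks) {
        AtomicInteger inFlight = new AtomicInteger();
        return command -> {
            if (inFlight.incrementAndGet() > maxForkedTasks) {
                // Over the limit: do not add to the shared queue, run inline instead.
                try {
                    command.run();
                } finally {
                    inFlight.decrementAndGet();
                }
                return;
            }
            try {
                delegate.execute(() -> {
                    try {
                        command.run();
                    } finally {
                        inFlight.decrementAndGet();
                    }
                });
            } catch (RuntimeException e) {
                // The delegate refused the task, so it no longer counts as in flight.
                inFlight.decrementAndGet();
                throw e;
            }
        };
    }
}
```

Each search would get its own wrapper, so the counter bounds the tasks of a single search rather than the whole pool.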
Nhat Nguyen
128ba05192
Prohibit changes to index mode, source, and sort settings during restore (#115811) (#115972)
The index.mode, source.mode, and index.sort.* settings cannot be 
modified during restore, as this may lead to data corruption or issues
retrieving _source. This change enforces a restriction on modifying 
these settings during restore. While a fine-grained check could permit
equivalent settings, it seems simpler and safer to reject restore
requests if any of these settings are specified.
2024-10-31 05:47:54 +11:00
Simon Cooper
0bd7fa4135
[8.16] Fix NodeStatsTests chunking (#115929) (#115956)
* Fix NodeStatsTests chunking (#115929)

Rewrite the test to make it a bit clearer

* ScriptStats hasn't been converted
2024-10-31 04:51:39 +11:00
Nikolaj Volgushev
add5b2751a
[8.16] Add ECK Role Mapping Cleanup (115823) (#115871)
* Merge

* Fix merge

* Versions

* Nit

---------

Co-authored-by: Johannes Fredén <109296772+jfreden@users.noreply.github.com>
2024-10-30 22:59:57 +11:00
Kostas Krikellas
406f65fb31
[8.16] Use flattened names in ignored source (#115822) (#115899)
* Use flattened names in ignored source (#115822)

* Use flattened names in ignored source

* spotless

* fix rest compat

* fix unittests

* expand dots

(cherry picked from commit 06eb0727c2)

# Conflicts:
#	rest-api-spec/build.gradle

* Update 20_synthetic_source.yml

* Update 21_synthetic_source_stored.yml
2024-10-30 09:26:56 +02:00
Simon Cooper
78ab45523e
Make some chunked xcontent more efficient (#115512) (#115735) 2024-10-28 21:54:15 +11:00
Craig Taverner
ce820e559c
Slightly more generous assertions for Cartesian tests (#115658) (#115669) 2024-10-26 02:42:33 +11:00
Pawan Kartik
dec43e3f94
Correctly update search status for a nonexistent local index (#115138) (#115612)
* fix: correctly update search status for a nonexistent local index

* Check for cluster existence before updating

* Remove unnecessary `println`

* Address review comment: add an explanatory code comment

* Further clarify code comment

(cherry picked from commit ad9c5a0a06)
2024-10-25 15:09:54 +01:00
Ryan Ernst
124923b383
Guard blob store local directory creation with doPrivileged (#115459) (#115569)
The blob store may be triggered to create a local directory while in a
reduced privilege context. This commit guards the creation of
directories with doPrivileged.
2024-10-25 03:49:55 +11:00
Alexey Ivanov
c6b70f04bd
Report JVM stats for all memory pools (97046) (#115117) (#115549)
This fix allows reporting of all JVM memory pools sizes in JVM stats
2024-10-25 02:46:20 +11:00
Andrei Dan
e8877d686f
[8.16] Allow for queries on _tier to skip shards during coordinator rewrite (#114990) (#115513)
* Allow for queries on _tier to skip shards during coordinator rewrite (#114990)

The `_tier` metadata field was not used on the coordinator when
rewriting queries in order to exclude shards that don't match. This led
to queries of the following form continuing to report failures even
though the only unavailable shards were in the tier that was excluded
from search (frozen tier in this example):

```
POST testing/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "_tier": "data_frozen"
          }
        }
      ]
    }
  }
}
```

This PR addresses this by having the queries that can execute on `_tier`
(term, match, query string, simple query string, prefix, wildcard)
execute a coordinator rewrite to exclude the indices that don't match
the `_tier` query **before** attempting to reach the shards (shards
that might not be available and raise errors).

Fixes #114910

* Don't use getFirst

* Test compile
2024-10-24 23:43:28 +11:00
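For the `must_not` query shown in commit e8877d686f, the effect of the coordinator rewrite can be pictured as filtering indices by tier before any shard is contacted. The sketch below is a simplification: the class is hypothetical, and the index-to-tier map stands in for however the coordinator actually learns each index's tier.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class TierRewriteSketch {
    /** Returns, in sorted order, the indices that remain after excluding one tier. */
    static List<String> indicesToSearch(Map<String, String> tierByIndex, String excludedTier) {
        List<String> remaining = new ArrayList<>();
        // TreeMap gives a deterministic iteration order for the result.
        for (Map.Entry<String, String> entry : new TreeMap<>(tierByIndex).entrySet()) {
            if (!excludedTier.equals(entry.getValue())) {
                remaining.add(entry.getKey());
            }
        }
        return remaining;
    }
}
```

Indices on the excluded tier never reach the shard level in this sketch, so their unavailable shards cannot produce failures.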