Block specific config files from being accessed after startup (#107481)
Some files should never be accessed by ES or plugin code once startup has completed. Use the security manager to block these files from being accessed by anything at all. The current blocked files are elasticsearch.yml, jvm.options, and the jvm.options.d directory.
Reworking the forbiddenApis check to use the Gradle worker API exposed a bug in
how we resolve krb5kdc keytab information. This fixes the dependency on the krb5kdc keytab configuration and
its builtBy task.
This also changes the krb5kdc keytab files to be passed directly to the task classpath, as
they are only required at runtime; having them directly as part of javaRestTestRuntimeOnly would mean precommit
requires krb5kdc compose up, which we definitely do not want.
(cherry picked from commit ab0bb4889a)
Today `TcpTransport#openConnection` may throw exceptions on certain
kinds of failure, but other kinds of failure are passed to the listener.
This is trappy and not all callers handle it correctly. This commit
makes sure that all exceptions are passed to the listener.
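As a rough illustration of the pattern (not the actual `TcpTransport` code), the sketch below catches synchronous failures and hands them to the listener instead of throwing. `ConnectListener`, `ConnectionOpener` and `connect` are hypothetical names invented for this example.

```java
/** Hypothetical stand-in for an async listener; not the Elasticsearch ActionListener. */
interface ConnectListener<T> {
    void onResponse(T result);

    void onFailure(Exception e);
}

final class ConnectionOpener {
    /** Simulates connection setup that may fail synchronously (assumed behaviour). */
    static String connect(String address) {
        if (address == null || address.isEmpty()) {
            throw new IllegalArgumentException("address must not be empty");
        }
        return "connected:" + address;
    }

    /** Never throws on a connection failure: synchronous failures are routed to the listener. */
    static void openConnection(String address, ConnectListener<String> listener) {
        final String connection;
        try {
            connection = connect(address);
        } catch (Exception e) {
            listener.onFailure(e);
            return;
        }
        // Called outside the try block so a throwing onResponse is not also reported via onFailure.
        listener.onResponse(connection);
    }
}
```

With this shape, callers only need to handle failures in one place, the listener.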
Closes #100510
* Update Gradle Wrapper to 8.2 (#96686)
- Convention usage has been deprecated and was fixed in our build files
- Fix test dependencies and deprecation
This is a backport of multiple work items related to authentication enhancements for HTTP,
which were originally merged in the 8.8 - 8.9 releases.
As a result, HTTP authentication gets a throughput boost (especially for requests failing authn).
This applies only to the default netty4-based implementation, not the NIO one.
Relates to: ES-6188 #92220 #95112
The docs for this API say the following:
> If the API fails, you can safely retry it. Only a successful response
> guarantees that the node has been removed from the voting
> configuration and will not be reinstated.
Unfortunately this isn't true today: if the request adds no exclusions
then we do not wait before responding. This commit makes the API wait
until all exclusions are really applied.
Backport of #98386, plus the test changes from #98146 and #98356.
This API can be quite heavy in large clusters, and might spam the
`MANAGEMENT` threadpool queue with work for clients that have long-since
given up. This commit adds some basic cancellability checks to reduce
the problem.
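A minimal sketch of what a basic cancellability check can look like, assuming a simple flag that is polled between units of work. `CancellableWork` is a hypothetical class for illustration and does not reflect the Elasticsearch task framework.

```java
import java.util.List;
import java.util.concurrent.CancellationException;
import java.util.concurrent.atomic.AtomicBoolean;

final class CancellableWork {
    private final AtomicBoolean cancelled = new AtomicBoolean();

    /** Called when the client has given up, e.g. because its connection closed. */
    void cancel() {
        cancelled.set(true);
    }

    private void ensureNotCancelled() {
        if (cancelled.get()) {
            throw new CancellationException("task cancelled");
        }
    }

    /** Sums the items, checking for cancellation before each unit of work. */
    long sum(List<Integer> items) {
        long total = 0;
        for (int item : items) {
            ensureNotCancelled();
            total += item;
        }
        return total;
    }
}
```

The check is cheap, so it can be repeated at each step to stop work soon after a cancellation.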
Backport of #96551 to 7.17
This commit adds the ability to initialize YAML rest test suites against
a subset of available test cases. Previously, the only way to do this was
via the `tests.rest.suite` system property, but that can only be set at
the test _task_ level. Configuring this at the test _class_ level means
that we can support having multiple test suite classes that execute
subsets of tests within a project. That allows for things like
parallelization, or having different test cluster setups for different
YAML tests within the same project.
For example:
```java
@ParametersFactory
public static Iterable<Object[]> parameters() throws Exception {
    return ESClientYamlSuiteTestCase.createParameters(new String[] { "analysis-common", "indices.analyze" });
}
```
The above example would mean that only tests in the `analysis-common`
and `indices.analyze` directories would be included in this suite.
cc @jdconrad
Closes #95089
HttpTracer logs a message when an HTTP response is sent to a HttpChannel
but that does not mean the response sending process is completed.
This pull request changes the HttpTracer so that the message is now
logged when the sending is complete.
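The idea can be sketched as follows, assuming the send operation exposes a future that completes once the response has been written. `ResponseTracer` and its method are hypothetical and are not the `HttpTracer` API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final class ResponseTracer {
    /** Collected log lines; a real tracer would write to a logger instead. */
    final List<String> log = Collections.synchronizedList(new ArrayList<>());

    /** Logs only once sendFuture completes, rather than when the send is merely started. */
    void traceResponse(String requestId, CompletableFuture<Void> sendFuture) {
        sendFuture.whenComplete((ignored, error) -> {
            if (error == null) {
                log.add("sent response [" + requestId + "]");
            } else {
                log.add("failed to send response [" + requestId + "]: " + error.getMessage());
            }
        });
    }
}
```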
Backport of #94436
In JDK 20 Thread suspend/resume is soft removed (they now throw
UnsupportedOperationException). Many ES disruption tests simulate GC
pauses with suspend/resume. As that strategy will no longer work, this
commit mutes those tests for jdk20+.
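For illustration, a version guard of this kind could look like the sketch below. `SuspendResumeSupport` is a hypothetical helper, and the actual muting mechanism used in the tests may differ.

```java
final class SuspendResumeSupport {
    /** Thread.suspend/resume throw UnsupportedOperationException from JDK 20 onwards. */
    static boolean isSupported(int javaFeatureVersion) {
        return javaFeatureVersion < 20;
    }

    /** Checks the JVM running the tests; Runtime.version().feature() is available since Java 10. */
    static boolean isSupportedOnCurrentJvm() {
        return isSupported(Runtime.version().feature());
    }
}
```

A JUnit test could then skip itself with something like `Assume.assumeTrue("needs Thread.suspend/resume", SuspendResumeSupport.isSupportedOnCurrentJvm())`.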
relates #94206
closes #93707
This adds a `keystore` method to `LocalSpecBuilder` to provide secure
keystore settings via a `Supplier<String>` to allow for lazy-evaluated
settings, such as a secure setting that resolves to the address of
a dependent cluster.
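A sketch of why a `Supplier<String>` helps here: the value is only resolved when the node is configured, by which time the dependent cluster's address is known. `SketchSpecBuilder` is a hypothetical class loosely modelled on the description above, not the real `LocalSpecBuilder`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical builder for illustration only. */
final class SketchSpecBuilder {
    private final Map<String, Supplier<String>> keystoreSettings = new LinkedHashMap<>();

    /** Eager variant: the value is known up front. */
    SketchSpecBuilder keystore(String key, String value) {
        return keystore(key, () -> value);
    }

    /** Lazy variant: the supplier is stored now and invoked later. */
    SketchSpecBuilder keystore(String key, Supplier<String> supplier) {
        keystoreSettings.put(key, supplier);
        return this;
    }

    /** Suppliers are only invoked here, i.e. when the node is actually being configured. */
    Map<String, String> resolveKeystore() {
        Map<String, String> resolved = new LinkedHashMap<>();
        keystoreSettings.forEach((key, supplier) -> resolved.put(key, supplier.get()));
        return resolved;
    }
}
```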
If we use a port range of up to `36600` this means that we overlap
with Linux's default ephemeral port range, which starts at 32768. If we get
unlucky and all ports in a range that overlaps with the ephemeral
ports are already taken, we will be unable to bind to any port in a node's range and the
test will fail. If we limit the max private port to below the ephemeral
range, that can't happen, at the cost of a slightly higher chance of
collisions due to worker id wraparound. Given that we still have
600+ ids until wraparound, such collisions seem rather unlikely, while bind failures
have been observed in the real world.
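The trade-off can be checked with a little arithmetic. In the sketch below the minimum port and the number of ports per worker are assumed values chosen for illustration, not taken from this commit. Only the `36600` limit and the ephemeral start of 32768 come from the text above.

```java
final class PrivatePortRange {
    // Assumed values for this sketch; the real constants may differ.
    static final int MIN_PRIVATE_PORT = 13301;
    static final int PORTS_PER_WORKER = 30;
    // From the description above: Linux's default ephemeral range starts here.
    static final int LINUX_EPHEMERAL_START = 32768;

    /** Number of distinct per-worker ranges that fit before worker ids wrap around. */
    static int distinctWorkerRanges(int maxPrivatePort) {
        return (maxPrivatePort - MIN_PRIVATE_PORT + 1) / PORTS_PER_WORKER;
    }

    /** First port of the range assigned to a worker, wrapping around when ids run out. */
    static int firstPort(long workerId, int maxPrivatePort) {
        int slot = (int) Math.floorMod(workerId, (long) distinctWorkerRanges(maxPrivatePort));
        return MIN_PRIVATE_PORT + slot * PORTS_PER_WORKER;
    }

    /** True if any port in the worker's range falls inside the ephemeral range. */
    static boolean overlapsEphemeral(long workerId, int maxPrivatePort) {
        int lastPort = firstPort(workerId, maxPrivatePort) + PORTS_PER_WORKER - 1;
        return lastPort >= LINUX_EPHEMERAL_START;
    }
}
```

Under these assumptions, a maximum of 32767 still leaves several hundred distinct ranges and none of them reaches the ephemeral range, whereas with a maximum of `36600` the highest ranges do.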
closes #92477
Co-authored-by: Armin Braun <me@obrown.io>
* Use absolute paths for logs locations in JVM options for test clusters (#93672)
# Conflicts:
# test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/LocalClusterFactory.java
* Spotless
The issue with this test failure is actually that we were silently
failing to install the plugin under test into the cluster. The root
cause was that the FIPS security policy file was not copied into the cluster
config directory before we attempted to run the plugin installer. Since
we pass the FIPS JVM arguments to all CLI tools as well this caused
plugin installation to fail. We now ensure that these files are copied
before we attempt to run _any_ ES tools.
Closes https://github.com/elastic/elasticsearch/issues/93303
* Migrate core rest tests with security to new testing framework (#92575)
# Conflicts:
# x-pack/qa/core-rest-tests-with-security/build.gradle
* Fixes
* More fixes
* More fixes
* More more fixes
* Simplify and optimize deduplication of RepositoryData for a non-caching repository instance (#91851)
This makes use of the new deduplicator infrastructure to move to more
efficient deduplication mechanics.
The existing solution hardly ever deduplicated because it would only
deduplicate after the repository entered a consistent state. The
adjusted solution is much simpler, in that it simply deduplicates such
that only a single loading of `RepositoryData` will ever happen at a
time, fixing memory issues from massively concurrent loading of the repo
data as described in #89952.
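The deduplication idea, reduced to a standalone sketch: concurrent callers share a single in-flight load, and a new load only starts once the previous one has finished. `SingleLoadDeduplicator` is a hypothetical class built on `CompletableFuture`, not the deduplicator infrastructure referred to above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class SingleLoadDeduplicator<T> {
    private final AtomicReference<CompletableFuture<T>> inFlight = new AtomicReference<>();
    private final Supplier<CompletableFuture<T>> loader;

    SingleLoadDeduplicator(Supplier<CompletableFuture<T>> loader) {
        this.loader = loader;
    }

    /** Returns the in-flight load if there is one, otherwise starts a new load. */
    CompletableFuture<T> load() {
        while (true) {
            CompletableFuture<T> existing = inFlight.get();
            if (existing != null) {
                return existing; // piggyback on the load that is already running
            }
            CompletableFuture<T> mine = new CompletableFuture<>();
            if (inFlight.compareAndSet(null, mine) == false) {
                continue; // lost the race to another caller, retry
            }
            try {
                loader.get().whenComplete((value, error) -> {
                    // Clear the slot first so later callers trigger a fresh load.
                    inFlight.compareAndSet(mine, null);
                    if (error != null) {
                        mine.completeExceptionally(error);
                    } else {
                        mine.complete(value);
                    }
                });
            } catch (RuntimeException e) {
                inFlight.compareAndSet(mine, null);
                mine.completeExceptionally(e);
            }
            return mine;
        }
    }
}
```

Because results are not cached after completion, memory is only held for the one load that is currently running.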
closes #89952
* fix compile
* Add support for additional configuration files to test clusters framework (#92579)
This adds the ability to supply arbitrary files to the config directory
of cluster nodes. Typically, this is used for security use cases, such
as providing SSL certificates and trust stores.
This commit adds a few other features to enable more testing use cases
as well, such as the ability to restart a cluster, as well as explicit
ordering of test cases within a test class. This is needed for test
suites that need to execute some tests, restart the cluster, then
execute more in a particular order.
# Conflicts:
# test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/LocalClusterHandle.java
# x-pack/plugin/security/qa/basic-enable-security/build.gradle
# x-pack/plugin/security/qa/basic-enable-security/src/javaRestTest/java/org/elasticsearch/xpack/security/EnableSecurityOnBasicLicenseIT.java
# x-pack/qa/multi-node/src/javaRestTest/java/org/elasticsearch/multi_node/GlobalCheckpointSyncActionIT.java
* Fix static initialization of random value
* Remove unused imports
* Spotless
* Fixes for module projects in new tests clusters and auto security config (#92533)
Fix an issue where the build cannot resolve a module dependency for the
current module project. Also add partial support for security auto-
configuration in test clusters.
# Conflicts:
# build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/RestTestBasePlugin.java
# modules/aggregations/build.gradle
# modules/aggs-matrix-stats/src/yamlRestTest/java/org/elasticsearch/search/aggregations/matrix/MatrixStatsClientYamlTestSuiteIT.java
# test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/LocalClusterHandle.java
* Post-merge fixes
* Spotless
This commit adds a new test framework for configuring and orchestrating
test clusters for both Java and YAML REST testing. This will eventually
replace the existing "test-clusters" Gradle plugin and the build-time
cluster orchestration.
This commit disables recovering from snapshots for searchable snapshots,
as the snapshot for this type of index consists of a pointer to the
original snapshot and attempting such a recovery produced confusing error messages.
Backport of #86388
The following two failures happen rarely, but both fail in the same
`assertBusy` block. I don't have a clue why, and couldn't reproduce
them. Considering the number of checks in that block, maybe a larger
timeout is more suitable. (Also, it seems from the test history that it is
not uncommon for those tests to take 2-3s, so hitting the 10s timeout
every few thousand runs seems likely, IMO!)
Relates https://github.com/elastic/elasticsearch/issues/88884,
https://github.com/elastic/elasticsearch/issues/88201
- When a -Dlicense.key system property is passed to the build, we want to include
this in the test reproduction info message
- Absolute paths are converted, where possible, to paths relative to the workspace
root to allow simple copy & paste
- Also fixes an inconsistency in checking license existence in the x-pack plugin core build
The High Level REST Client provides a MountSnapshotRequest#ignoredIndexSettings(String[]) method to define some index settings to ignore when mounting a snapshot as a searchable snapshot index.
Sadly the client generates the wrong request body field, ignored_index_settings instead of ignore_index_settings. This wasn't caught until #75982 was reported because the parser of the Mount API request ignores unknown fields in the request body.
We fixed the wrong leniency of the request parser on the Elasticsearch side (#88987) starting version 8.5.0, and we decided to continue to ignore the ignored_index_settings generated by the HLRC bug to avoid breaking HLRC client usages.
This change fixes the request body generated by the HLRC to pass the correct field name. This fix is for versions 7.17.6+ (we do not expect to release new versions of HLRC in 8.x).
It also renames the HLRC methods to expose the change to client users.
Co-authored-by: bellengao <bellengao@tencent.com>
Today nodes started in an `InternalTestCluster` use `transport.port: 0`
and `http.port: 0` which selects a port from the ephemeral range. This
range is also used by other tests, notably REST tests, and this can lead
to collisions and consequent failures when nodes restart.
This commit restricts the range of ports using the same algorithm as in
`ESTestCase`, avoiding[^1] such collisions.
[^1]: technically this isn't quite enough because the ephemeral range on
some CI workers overlaps the ranges chosen by `ESTestCase`, but that's a
separate issue tracked in #87734
Closes #87448
FVH was relying on `SourceLookup.extractRawValues()` to load its data, but this no
longer works for multifields. It should instead use value fetchers which will correctly
locate the input for multifields and/or copy fields.
Fixes #84690
Fixes #82458
Fixes #80895
Fixes #75011
The vast majority of our highlighter tests are integration or rest tests, which exercise
the full ES stack but take a long time to run and are difficult to debug. We have a few
unit tests but they are testing very low-level behaviour, and don't interact with the fetch
phase or hit contexts. This commit adds a new HighlighterTestCase base class with some
helper methods that should fill the gap between these two sets of tests. It includes a
method that takes a MapperService, ParsedDocument and SearchSourceBuilder, and then
runs the appropriate highlighter fetch subphase over the resulting hit.
Today if the submission within `ThreadedActionListener#onResponse` is
rejected from its threadpool then we call `delegate#onFailure` with the
rejection exception on the calling thread. However, if the submission
within `ThreadedActionListener#onFailure` is rejected then we just drop
the listener and log an error.
In most cases completing a listener exceptionally triggers some cleanup
which is often fairly lightweight and therefore safe enough to complete
on the calling thread. In any case it's generally preferable to complete
a listener exceptionally on the wrong thread rather than just dropping
it entirely.
This commit fixes this and adds a test to verify that
`ThreadedActionListener` completes properly even in the face of
rejections.
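The intended behaviour can be sketched as below. `ResultListener` and `ThreadedListener` are hypothetical stand-ins for illustration and do not reproduce the actual `ThreadedActionListener` implementation.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

/** Hypothetical stand-in for an async listener. */
interface ResultListener<T> {
    void onResponse(T result);

    void onFailure(Exception e);
}

final class ThreadedListener<T> implements ResultListener<T> {
    private final Executor executor;
    private final ResultListener<T> delegate;

    ThreadedListener(Executor executor, ResultListener<T> delegate) {
        this.executor = executor;
        this.delegate = delegate;
    }

    @Override
    public void onResponse(T result) {
        try {
            executor.execute(() -> delegate.onResponse(result));
        } catch (RejectedExecutionException e) {
            // Rejected: fail the delegate with the rejection on the calling thread.
            delegate.onFailure(e);
        }
    }

    @Override
    public void onFailure(Exception failure) {
        try {
            executor.execute(() -> delegate.onFailure(failure));
        } catch (RejectedExecutionException e) {
            // Completing on the wrong thread is preferable to dropping the listener.
            failure.addSuppressed(e);
            delegate.onFailure(failure);
        }
    }
}
```

Attaching the rejection as a suppressed exception is a choice made for this sketch, so the original failure stays the primary one.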
The SystemIndices constructor should take a list instead of a map as an
argument so that we can guarantee that the map we use for feature lookups is
keyed on the feature name.
We also provide some new getter methods so that calling code does not have to
handle the map directly.
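A simplified sketch of the idea: the constructor takes a list and builds the lookup map itself, so the key is always the feature's own name. `FeatureRegistry` and its nested `Feature` class are hypothetical, and the duplicate-name check is an assumption of this sketch rather than something stated above.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FeatureRegistry {
    /** Minimal hypothetical feature descriptor. */
    static final class Feature {
        final String name;
        final String description;

        Feature(String name, String description) {
            this.name = name;
            this.description = description;
        }
    }

    private final Map<String, Feature> featuresByName;

    /** Builds the lookup map from a list, so keys cannot disagree with feature names. */
    FeatureRegistry(List<Feature> features) {
        Map<String, Feature> byName = new LinkedHashMap<>();
        for (Feature feature : features) {
            if (byName.put(feature.name, feature) != null) {
                throw new IllegalArgumentException("duplicate feature name [" + feature.name + "]");
            }
        }
        this.featuresByName = Collections.unmodifiableMap(byName);
    }

    /** Getter so that callers do not have to handle the map directly. */
    Feature getFeature(String name) {
        return featuresByName.get(name);
    }

    Collection<Feature> getFeatures() {
        return featuresByName.values();
    }
}
```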