Commit graph

112 commits

Author SHA1 Message Date
Rene Groeschke
457296f31f
Fix :plugins:repository-hdfs:forbiddenApisJavaRestTest (#102983) (#105921)
Reworking the forbiddenApis check to use the Gradle worker API exposed a bug in
how we resolve krb5kdc keytab information. This fixes the dependency on the krb5kdc keytab configuration and
its builtBy task.

This also changes the usage of krb5kdc keytab files so they are passed directly to the task classpath, as
they are only required at runtime; having them as part of javaRestTestRuntimeOnly would mean precommit
requires krb5kdc compose up, which we definitely do not want.

(cherry picked from commit ab0bb4889a)
2024-03-05 11:16:20 +01:00
Rene Groeschke
e5116dc68e
[7.17] Cleanup repository-hdfs project (#84486) (#84639)
- use separate sourceSets for different types of tests
- remove usage of RestBuildPlugin
- Make fixture dependencies Gradle idiomatic
2022-03-04 11:07:12 +01:00
Rene Groeschke
b6f463ff29
Fix stopping of old elasticsearch cluster (#81059) (#81139)
Due to not exposing the PID of the underlying cluster, the Fixture Stop task
was skipped, leaving running clusters behind after the build finished.
2021-11-30 04:30:53 -05:00
Mark Vieira
bcfbf00074 Reformat Elasticsearch source 2021-10-27 15:23:15 -07:00
Keith Massey
ca97b68f14
Changing test keytab to use aes256-cts-hmac-sha1-96 instead of des3-cbc-sha1-kd (#78703) (#79874)
The des3-cbc-sha1-kd encryption type is deprecated and no longer supported by newer JVMs, causing tests
that use the krb5kdc-fixture to fail. This commit changes the encryption type of the test keytab to
aes256-cts-hmac-sha1-96.
Relates #78423 #78703
2021-10-26 17:47:41 -05:00
Albert Zaharovits
b0a5cdfb07
TEST Ensure password 14 chars length on Kerberos FIPS tests (#79496) (#79510) 2021-10-19 16:01:41 -04:00
Keith Massey
615ffa08c8
Upgrade repository-hdfs to Hadoop 3 (#78407)
This upgrades the repository-hdfs plugin to Hadoop 3. Tests are performed against both Hadoop 2 and Hadoop 3 HDFS. The advantages of using the Hadoop 3 client are:
- Over-the-wire encryption works (tests coming in an upcoming PR).
- We don't have to add (or ask customers to add) additional JVM permissions to the Elasticsearch JVM.
- It's compatible with Java versions higher than Java 8.
Relates #76897
2021-10-06 14:50:00 -05:00
Francisco Fernández Castaño
987f4991aa
Add third party integration tests for snapshot based recoveries (#76500)
This commit adds third party integration tests for snapshot based
recoveries in S3, Azure and GCS.

Relates #73496
Backport of #76489
2021-08-13 17:19:11 +02:00
Armin Braun
01ab0d99cb
Upgrade GCS SDK to 1.117.1 (#75290)
We're behind the upgrade schedule by quite a bit here, upgrading to the latest version
and adjusting our test fixture accordingly.
2021-07-13 14:33:42 +02:00
Armin Braun
ffeafae054
Save Memory on Large Repository Metadata Blob Writes (#74693)
This PR adds a new API for doing streaming serialization writes to a repository, enabling repository metadata of arbitrary size to be written with bounded memory.
The existing write-APIs require knowledge of the eventual blob size beforehand. This forced us to materialize the serialized blob in memory before writing, costing a lot of memory in case of e.g. very large RepositoryData (and limiting us to 2G max blob size).
With this PR the requirement to fully materialize the serialized metadata goes away and the memory overhead becomes completely bounded by the outbound buffer size of the repository implementation.

As we move to larger repositories this makes master node stability a lot more predictable since writing out RepositoryData does not take as much memory any longer (same applies to shard level metadata), enables aggregating multiple metadata blobs into a single larger blob without massive overhead, and removes the 2G size limit on RepositoryData.

backport of #74313 and #74620
2021-06-29 16:34:30 +02:00
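To make the memory argument concrete, here is a minimal, hypothetical sketch of the two write styles (the names and signatures are illustrative, not the actual Elasticsearch BlobContainer API): the old style needs the fully materialized blob, and therefore its size, up front, while the new style hands the caller an OutputStream so memory stays bounded by the outbound buffer.

```java
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch of the two write styles described above; the real
// Elasticsearch BlobContainer API differs in naming and signatures.
interface BlobWriter {

    // Old style: the whole blob must be materialized first, so its size is known.
    void writeBlob(String name, byte[] fullyMaterializedBlob) throws IOException;

    // New style: the caller streams the serialized bytes directly into the
    // repository's OutputStream, keeping memory bounded by the buffer size.
    void writeBlob(String name, CheckedConsumer<OutputStream> writer) throws IOException;

    @FunctionalInterface
    interface CheckedConsumer<T> {
        void accept(T t) throws IOException;
    }
}
```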
Rory Hunter
fb8f84fdae Order imports when reformatting (#74059)
Change the formatter config to sort / order imports, and reformat the
codebase. We already had a config file for Eclipse users, so Spotless now
uses that.

The "Eclipse Code Formatter" plugin ought to be able to use this file as
well for import ordering, but in my experiments the results were poor.
Instead, use IntelliJ's `.editorconfig` support to configure import
ordering.

I've also added a config file for the formatter plugin.

Other changes:
   * I've quietly enabled the `toggleOnOff` option for Spotless. It was
     already possible to disable formatting for sections using the markers
     for docs snippets, so enabling this option just accepts this reality
     and makes it possible via `formatter:off` and `formatter:on` without
     the restrictions around line length. It should still only be used as
     a very last resort and with good reason.
   * I've removed mention of the `paddedCell` option from the contributing
     guide, since I haven't had to use that option for a very long time. I
     moved the docs to the spotless config.
2021-06-16 09:25:55 +01:00
Tanguy Leroux
bef1e45add
Apply spotless formatting to more sub-projects (#73989) (#73996) 2021-06-10 13:21:11 +02:00
Ryan Ernst
393ab2d813
Rename o.e.common in libs/core to o.e.core (#73909) (#73920)
When libs/core was created, several classes were moved from server's
o.e.common package, but they were not moved to a new package. Split
packages need to go away long term, so that Elasticsearch can even think
about modularization. This commit moves all the classes under o.e.common
in core to o.e.core.

relates #73784
backport #73909
2021-06-08 14:17:44 -07:00
Armin Braun
cf6099a942
Dry up Hashing BytesReference (#72443) (#72785)
Dries up the efficient way to hash a bytes reference and makes use
of it in a few other spots that were needlessly copying all bytes in
the bytes reference for hashing.
2021-05-06 07:50:07 +02:00
David Turner
16ad6bb720 Fix S3HttpHandler chunked-encoding handling (#72378)
The `S3HttpHandler` reads the contents of the uploaded blob, but if the
upload used chunked encoding then the reader would skip one or more
`\r\n` sequences if they appeared at the start of a chunk.

This commit reworks the reader to be stricter about its interpretation
of chunks, and removes some indirection via streams since we can work
pretty much entirely on the underlying `BytesReference` instead.

Closes #72358
2021-04-28 15:14:13 +01:00
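For context, HTTP/1.1 chunked encoding frames the body as a hex chunk size, CRLF, the chunk bytes, and a trailing CRLF, terminated by a zero-length chunk; a stricter reader therefore has to consume the CRLF after every chunk rather than treating it as payload. A rough, self-contained sketch of that parsing loop (illustrative only, not the actual S3HttpHandler code; trailer headers are assumed absent):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Minimal sketch of strict chunked-encoding decoding: read "<hex size>\r\n",
// then exactly that many body bytes, then the "\r\n" that closes the chunk.
final class ChunkedBodyReader {

    static byte[] readChunkedBody(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            // chunk-size line may carry extensions after ';' which we ignore
            int chunkSize = Integer.parseInt(readLine(in).split(";")[0].trim(), 16);
            if (chunkSize == 0) {
                readLine(in); // final CRLF after the terminating zero-length chunk
                return body.toByteArray();
            }
            for (int i = 0; i < chunkSize; i++) {
                body.write(in.read());
            }
            readLine(in); // consume the CRLF that closes this chunk, never the payload
        }
    }

    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int b;
        while ((b = in.read()) != '\r') {
            if (b == -1) {
                throw new IOException("unexpected end of chunked body");
            }
            line.append((char) b);
        }
        in.read(); // the '\n' following '\r'
        return line.toString();
    }
}
```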
David Turner
6b82c43590 Remove spurious docker volume from S3 fixture (#72388) 2021-04-28 15:11:53 +01:00
Armin Braun
01fecdf9a4
Ensure GCS Repository Metadata Blob Writes are Atomic (#72051) (#72070)
In the corner case of uploading a large (>5MB) metadata blob we did not set the content validation
requirement on the upload request (we automatically have it for smaller requests that are not resumable
uploads). This change sets the relevant request option to enforce an MD5 hash check when writing
`BytesReference` to GCS (as is the case with all but data blob writes).

closes #72018
2021-04-22 12:40:20 +02:00
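For reference, client-side MD5 validation with the google-cloud-storage SDK looks roughly like the following hedged sketch (bucket and blob names are made up, and this is not the Elasticsearch repository code): the MD5 is attached to the BlobInfo and md5Match() makes the upload fail if the content does not match.

```java
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.BlobInfo;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.Storage.BlobTargetOption;
import com.google.cloud.storage.StorageOptions;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

// Hedged sketch: upload a blob with its MD5 attached so GCS rejects the write
// if the content was corrupted in transit.
public class GcsMd5ValidatedWrite {
    public static void main(String[] args) throws Exception {
        Storage storage = StorageOptions.getDefaultInstance().getService();
        byte[] metadataBytes = "serialized repository metadata".getBytes(StandardCharsets.UTF_8);

        // GCS expects the MD5 as a base64-encoded string on the blob metadata
        String md5 = Base64.getEncoder()
            .encodeToString(MessageDigest.getInstance("MD5").digest(metadataBytes));

        BlobInfo blobInfo = BlobInfo.newBuilder(BlobId.of("my-bucket", "index-0"))
            .setMd5(md5)
            .build();

        // md5Match() enforces validation of the upload against the hash above
        storage.create(blobInfo, metadataBytes, BlobTargetOption.md5Match());
    }
}
```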
Przemko Robakowski
f14438757b
[7.x] Add GeoIP CLI integration test (#71381) (#71465)
* Add GeoIP CLI integration test (#71381)

This change adds an additional test to GeoIpDownloaderIT which checks that artifacts produced by the GeoIP CLI tool can be consumed by the cluster the same way as those from our original service.
It does so by running the tool from a fixture which then simply serves the generated files (this is exactly the way users are supposed to use the tool as well).

Relates to #68920
# Conflicts:
#	test/fixtures/geoip-fixture/src/main/java/fixture/geoip/GeoIpHttpFixture.java

* fix compilation
2021-04-08 18:05:59 +02:00
Yannick Welsch
9d30ca419f Use default application credentials for GCS repositories (#71239)
Adds support for "Default Application Credentials" for GCS repositories, making it easier to set up a repository on GCP,
as all relevant information to connect to the repository is retrieved from the environment, not necessitating complicated
keystore setups.
2021-04-06 15:21:31 +02:00
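With Application Default Credentials no key file has to be placed in the keystore; the SDK resolves credentials from the environment (the GOOGLE_APPLICATION_CREDENTIALS variable, a gcloud login, or the GCE/GKE metadata server). A minimal sketch using the GCS SDK directly (not the plugin's own client wiring):

```java
import com.google.auth.oauth2.GoogleCredentials;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

// Minimal sketch: build a GCS client from Application Default Credentials,
// i.e. whatever the runtime environment provides, with no explicit key material.
public class GcsAdcExample {
    public static void main(String[] args) throws Exception {
        GoogleCredentials credentials = GoogleCredentials.getApplicationDefault();
        Storage storage = StorageOptions.newBuilder()
            .setCredentials(credentials)
            .build()
            .getService();
        System.out.println("Connected to project: " + storage.getOptions().getProjectId());
    }
}
```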
Przemko Robakowski
f814526110
[7.x] Add tool for preparing local GeoIp database service (#71018) (#71106)
* Add tool for preparing local GeoIp database service (#71018)

Air-gapped environments can't simply use the GeoIP database service provided by Infra, so they have to either use a proxy or recreate a similar service themselves.
This PR adds a tool to make this process easier. The basic workflow is:

- download databases from the MaxMind site to a single directory (either .mmdb files or gzipped tarballs with a .tgz suffix)
- run the tool with $ES_PATH/bin/elasticsearch-geoip -s directory/to/use [-t target/directory]
- serve static files from that directory (for example with docker run -v directory/to/use:/usr/share/nginx/html:ro nginx)
- use the server above as the endpoint for GeoIpDownloader (the geoip.downloader.endpoint setting)
- to update to new databases, simply put new files in the directory and run the tool again

This change also adds support for relative paths in the overview JSON because the CLI tool doesn't know about the address it would be served under.

Relates to #68920

* compilation fix

* spotless
2021-03-31 14:35:48 +02:00
Przemko Robakowski
d10b156a3d
[7.x] Add support for .tgz files in GeoIpDownloader (#70725) (#70976)
* Add support for .tgz files in GeoIpDownloader (#70725)

We have to ship COPYRIGHT.txt and LICENSE.txt files alongside the .mmdb files for legal compliance. Infra will pack these in a single .tgz (gzipped tar) archive provided by the GeoIP databases service.
This change adds support for that format to GeoIpDownloader and DatabaseRegistry.
2021-03-30 01:17:57 +02:00
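Reading such an archive is straightforward with Apache Commons Compress; the following is an illustrative sketch (not the actual GeoIpDownloader/DatabaseRegistry code) of pulling the .mmdb entry out of a gzipped tarball while skipping the bundled license files:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;

// Illustrative sketch: extract the .mmdb entry from a gzipped tar (.tgz) archive,
// skipping the COPYRIGHT.txt / LICENSE.txt entries that ship alongside it.
final class TgzDatabaseExtractor {

    static void extractMmdb(Path tgzFile, Path target) throws IOException {
        try (InputStream gz = new GZIPInputStream(Files.newInputStream(tgzFile));
             TarArchiveInputStream tar = new TarArchiveInputStream(gz)) {
            TarArchiveEntry entry;
            while ((entry = tar.getNextTarEntry()) != null) {
                if (entry.isFile() && entry.getName().endsWith(".mmdb")) {
                    // the tar stream is positioned at this entry's data
                    Files.copy(tar, target);
                    return;
                }
            }
        }
    }
}
```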
Francisco Fernández Castaño
308cdad059
Add searchable snapshots integration tests for URL repositories (#70917)
Relates #69521
Backport of #70709
2021-03-26 17:24:00 +01:00
Przemko Robakowski
827a70c4a4
Update GeoIP database service URL (#69862) (#69867)
This change updates the GeoIP database service URL to the new https://geoip.elastic.co/v1/database and removes the (now optional) key/UUID parameter.
It also fixes the geoip-fixture to provide 3 different test databases (City, Country and ASN).
It also unmutes GeoIpDownloaderIT.testGeoIpDatabasesDownload with additional logging and increased timeouts, which tries to address #69594
2021-03-03 15:09:57 +01:00
Francisco Fernández Castaño
1db660aa56
Add integration tests for repository analyser test kit (#69780)
Relates #67247
Backport of #69316
2021-03-02 12:38:37 +01:00
Mark Vieira
b3a6ae1e4c Update Docker image used by minio test fixture to support Arm (#69743)
(cherry picked from commit 3144354826)
2021-03-01 15:17:35 -08:00
Mark Vieira
c0fc69dbdc Update test fixture to avoid writing to /etc/hosts file (#69583) 2021-02-25 09:59:21 -08:00
Przemko Robakowski
044085d23b
Add ToS query parameter to GeoIP downloader (#69495) (#69520)
This change adds a query parameter confirming that we accept the ToS of the GeoIP database service provided by Infra.
It also changes the integration test to use a lower timeout when using the local fixture.

Relates to #68920
2021-02-24 12:01:58 +01:00
Przemko Robakowski
048d67e867
[7.x] GeoIP database downloader (#68424) (#69481)
This change adds a component that will download new GeoIP databases from the infra service.
New databases are downloaded in chunks and stored in the .geoip_databases index.
Downloads are verified against the MD5 checksum provided by the server.
The current state of all stored databases is kept in the cluster state as persistent task state.

Relates to #68920
2021-02-24 09:11:39 +01:00
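A hedged sketch of the checksum step described above, with illustrative names rather than the downloader's internals: digest the chunks in order and compare the result against the MD5 advertised by the server.

```java
import java.security.MessageDigest;
import java.util.List;

// Illustrative sketch: verify re-assembled download chunks against the
// server-provided MD5 checksum before the database is put to use.
final class ChunkChecksum {

    static void verify(List<byte[]> downloadedChunks, String expectedMd5Hex) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        for (byte[] chunk : downloadedChunks) {
            md5.update(chunk);
        }
        StringBuilder actual = new StringBuilder();
        for (byte b : md5.digest()) {
            actual.append(String.format("%02x", b));
        }
        if (actual.toString().equals(expectedMd5Hex) == false) {
            throw new IllegalStateException(
                "GeoIP database download corrupted: expected " + expectedMd5Hex + " but was " + actual);
        }
    }
}
```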
David Turner
e7c0836f7e
Adjust encoding of Azure block IDs (#68980)
Today we represent block IDs sent to Azure using the URL-safe base-64
encoding. This makes sense: these IDs appear in URLs. It turns out that
Azure rejects this encoding for block IDs and instead demands that they
are represented using the regular, URL-unsafe, base-64 encoding instead,
then further wrapped in %-encoding to deal with the URL-unsafe
characters that inevitably result.

Relates #66489
Backport of #68957
2021-02-15 12:12:53 +00:00
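The distinction is easy to demonstrate with the JDK's Base64 encoders; this small sketch shows the encoding the commit describes (regular base64, then %-encoded for use in the URL) next to the URL-safe variant that Azure rejects:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of the block ID encoding discussed above: Azure wants the regular
// base64 alphabet ('+' and '/'), which then has to be %-encoded in the URL,
// rather than the URL-safe alphabet ('-' and '_').
public class AzureBlockIdEncoding {
    public static void main(String[] args) {
        byte[] rawBlockId = new byte[] { (byte) 0xfb, (byte) 0xff, 0x01, 0x02, 0x03 };

        String urlSafe = Base64.getUrlEncoder().encodeToString(rawBlockId); // "-_8BAgM=" -> rejected
        String regular = Base64.getEncoder().encodeToString(rawBlockId);    // "+/8BAgM=" -> accepted
        String inUrl = URLEncoder.encode(regular, StandardCharsets.UTF_8);  // "%2B%2F8BAgM%3D"

        System.out.println(urlSafe + " / " + regular + " / " + inUrl);
    }
}
```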
Mark Vieira
2d1e8b3abd Update sources with new SSPL+Elastic-2.0 license headers
As per the new licensing change for Elasticsearch and Kibana this commit
moves existing Apache 2.0 licensed source code to the new dual license
SSPL+Elastic license 2.0. In addition, existing x-pack code now uses
the new version 2.0 of the Elastic license. Full changes include:

- Updating LICENSE and NOTICE files throughout the code base, as well
  as those packaged in our published artifacts
- Update IDE integration to now use the new license header on newly
  created source files
- Remove references to the "OSS" distribution from our documentation
- Update build time verification checks to no longer allow Apache 2.0
  license header in Elasticsearch source code
- Replace all existing Apache 2.0 license headers for non-xpack code
  with updated header (vendored code with Apache 2.0 headers obviously
  remains the same).
- Replace all Elastic license 1.0 headers with new 2.0 header in xpack.
2021-02-02 18:07:23 -08:00
Francisco Fernández Castaño
2050c1e4a5
Avoid early task cancellation during azure parallel blob deletions (#66989)
Closes #66633
Backport of #66929
2021-01-05 13:43:23 +01:00
Rene Groeschke
1b37d40984
Port all task definitions to task avoidance api (#66738) (#66927)
This finishes porting all tasks created in gradle build scripts and plugins to use
the task avoidance api (see #56610)

* Port all task definitions to task avoidance api
* Fix last task created during configuration
* Fix test setup in  :modules:reindex
* Declare proper task inputs
2021-01-04 15:54:16 +01:00
Ioannis Kakavas
c0b24df307
Ensure CI is run in FIPS 140 approved only mode (#66804)
We were depending on BouncyCastle FIPS's own mechanics to set
itself in approved-only mode since we run with the Security
Manager enabled. The check during startup seems to happen before we
set our restrictive SecurityManager in
org.elasticsearch.bootstrap.Elasticsearch, though, and this means that
BCFIPS would not be in approved-only mode unless explicitly
configured to be.

This commit sets the appropriate JVM property to explicitly set
BCFIPS in approved only mode in CI and adds tests to ensure that we
will be running with BCFIPS in approved only mode when we expect to.
It also sets xpack.security.fips_mode.enabled to true for all test clusters
used in fips mode and sets the distribution to the default one. It adds a
password to the elasticsearch keystore for all test clusters that run in fips
mode.
Moreover, it changes a few unit tests where we would use bcrypt even in
FIPS 140 mode. These would still pass since we are bundling our own
bcrypt implementation, but are now changed to use FIPS 140 approved
algorithms instead for better coverage.

It also addresses a number of tests that would fail in approved only mode
Mainly:

    Tests that use PBKDF2 with a password less than 112 bits (14 chars). We
    elected to change the passwords used everywhere to be at least 14
    characters long instead of mandating
    the use of pbkdf2_stretch because both pbkdf2 and
    pbkdf2_stretch are supported and allowed in fips mode and it makes sense
    to test with both. We could possibly figure out the password algorithm used
    for each test and adjust password length accordingly only for pbkdf2 but
    there is little value in that. It's good practice to use strong passwords so if
    our docs and tests use longer passwords, then it's for the best. The approach
    is brittle as there is no guarantee that the next test that will be added won't
    use a short password, so we add some testing documentation too.
    This leaves us with a possible coverage gap since we do support passwords
    as short as 6 characters but we only test with > 14 chars; the
    validation itself was not tested even before. Tests can be added in a follow-up,
    outside of the FIPS-related context.

    Tests that use a PKCS12 keystore and were not already muted.

    Tests that depend on running test clusters with a basic license or
    using the OSS distribution, as FIPS 140 support is not available in
    either of these.

Finally, it adds some information around FIPS 140 testing in our testing
documentation reference so that developers can hopefully keep in
mind FIPS 140-related intricacies when writing/changing docs.
2020-12-24 15:35:28 +02:00
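The JVM property in question is BouncyCastle's documented org.bouncycastle.fips.approved_only flag; a minimal sketch (assuming the bc-fips jar is on the classpath) of asserting at runtime that approved-only mode really is active:

```java
import org.bouncycastle.crypto.CryptoServicesRegistrar;

// Minimal sketch: run with -Dorg.bouncycastle.fips.approved_only=true and assert
// that BCFIPS is operating in approved-only mode, instead of relying on BCFIPS
// flipping the flag itself after the restrictive SecurityManager is installed.
public class FipsApprovedOnlyCheck {
    public static void main(String[] args) {
        if (CryptoServicesRegistrar.isInApprovedOnlyMode() == false) {
            throw new IllegalStateException("BCFIPS is not running in FIPS 140 approved-only mode");
        }
        System.out.println("BCFIPS approved-only mode is active");
    }
}
```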
Rene Groeschke
195907bf84
Allow dynamic port allocation for hdfs fixture (#66440) (#66445)
Running multiple HDFS fixtures in parallel for integration tests requires
dynamic port assignment in order to avoid port clashes. This introduces
the ability to assign port ranges to Gradle projects that can be used
to dynamically allocate the ports used by these projects.

We apply this dynamic port setup only for the HDFS fixtures used in
:x-pack:plugin:searchable-snapshots:qa at the moment, as the test
sources (REST tests) in :plugins:repository-hdfs still rely on
hard-coded ports.

This is a simplified version of fixtures I created before on the gradle codebase
to deal with similar issues.

Fixes #66377
2020-12-16 16:47:16 +01:00
Francisco Fernández Castaño
f04a74bb75
[7.x] Upgrade Azure repository SDK to v12 (#66333)
Upgrade the Azure repository to the latest non-blocking Azure SDK.

Closes https://github.com/elastic/elasticsearch/issues/43309

Backport of #65140

Co-authored-by: Ryan Ernst <ryan@iernst.net>
2020-12-15 16:21:54 +01:00
Rene Groeschke
68fce39562
Avoid tasks materialized during configuration phase (#65922) (#66218)
* Avoid tasks materialized during configuration phase
* Fix RestTestFromSnippet testRoot setup
2020-12-12 22:13:38 +01:00
Francisco Fernández Castaño
2bb5716b3d
Add repositories metering API (#62088)
This pull request adds a new set of APIs that allows tracking the number of requests performed
by the different registered repositories.

In order to avoid losing data, the repository statistics are archived after the repository is closed for
a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the
statistics for the active repositories as well as the modified/closed repositories.

Backport of #60371
2020-09-08 14:01:04 +02:00
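For reference, the live and archived statistics can be fetched over the repositories metering endpoint added by this change; a small sketch using the low-level REST client (host, port and the _all node filter are assumptions):

```java
import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

// Sketch: query the repositories metering API with the low-level REST client.
public class RepositoriesMeteringExample {
    public static void main(String[] args) throws Exception {
        try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
            Response response = client.performRequest(
                new Request("GET", "/_nodes/_all/_repositories_metering"));
            System.out.println(EntityUtils.toString(response.getEntity()));
        }
    }
}
```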
Ryan Ernst
6d3b691048
Add snapshot only test modules (#61954)
This commit adds external test modules. These are modules meant for
external systems to test edge cases in elasticsearch, but only within
snapshots. They are not meant to be used in production, so protections
are also added from their accidental inclusion in release builds.

Note that this commit does not actually add any new modules, it only
adds the infrastructure for the new modules, under
`test/external-modules`.
2020-09-04 16:35:18 -07:00
Jay Modi
f0128ae074
Canonicalize client name in krb5kdc-fixture (#61119)
This commit changes the value for client name canonicalization to true
in the krb5.conf template file. This is done as a means to workaround
JDK-8246193 which has made it into some builds of JDK8.

Closes #61050
2020-08-13 14:58:08 -06:00
Armin Braun
3e2dfc6eac
Remove GCS Bucket Exists Check (#60899) (#60914)
Same as https://github.com/elastic/elasticsearch/pull/43288 for GCS.
We don't need to do the bucket-exists check before using the repo; that just needlessly
increases the necessary permissions for using the GCS repository.
2020-08-11 09:54:27 +02:00
Mark Vieira
dc7d4c615c
Ensure fixture runtime dependencies are built before starting containers (#59474) 2020-07-13 15:58:01 -07:00
Armin Braun
9268b25789
Add Check for Metadata Existence in BlobStoreRepository (#59141) (#59216)
In order to ensure that we do not write a broken piece of `RepositoryData`
because the physical repository generation was moved ahead more than one step
by erroneous concurrent writing to a repository, we must check whether or not
the currently assumed repository generation exists in the repository physically.
Without this check we run the risk of writing on top of stale cached repository data.

Relates #56911
2020-07-08 14:25:01 +02:00
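An illustrative sketch of the guard described above, using a hypothetical blob-store interface rather than BlobStoreRepository itself: before writing generation N+1, verify that generation N physically exists, otherwise the cached RepositoryData being built upon is stale.

```java
import java.io.IOException;

// Illustrative sketch only; names and interfaces are hypothetical, not the
// Elasticsearch BlobStoreRepository API.
final class RepositoryGenerationGuard {

    interface BlobStore {
        boolean blobExists(String name) throws IOException;
        void writeBlob(String name, byte[] bytes) throws IOException;
    }

    static void writeNextGeneration(BlobStore store, long currentGen, byte[] newRepositoryData) throws IOException {
        // refuse to write if the generation we think is current is not physically there
        if (store.blobExists("index-" + currentGen) == false) {
            throw new IllegalStateException("expected repository generation [" + currentGen
                + "] does not exist; refusing to write on top of stale cached repository data");
        }
        store.writeBlob("index-" + (currentGen + 1), newRepositoryData);
    }
}
```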
Rene Groeschke
d952b101e6
Replace compile configuration usage with api (7.x backport) (#58721)
* Replace compile configuration usage with api (#58451)

- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make the test runtime classpath an input for the testing convention
  - required as java-library will by default not have built the jar file
  - the jar file is now an explicit input of the task and Gradle will ensure it is properly built

* Fix compile usages in 7.x branch
2020-06-30 15:57:41 +02:00
Armin Braun
be6fa72432
Fix GCS Mock Behavior for Missing Bucket (#57283) (#57310)
* Fix GCS Mock Behavior for Missing Bucket

We were throwing a 500 instead of a 404 for a missing bucket.
This would make yaml tests needlessly wait for multiple seconds, retrying
the 500 response with backoff, in the test checking behavior for missing buckets.
2020-05-29 10:01:20 +02:00
Armin Braun
a4eb3edf46
Fix GCS Repository YAML Test Build (#57073) (#57101)
A few relatively obvious issues here:

* We cannot run the different IT runs (large blob setting one and normal integ run) concurrently
* We need to set the dependency tasks up correctly for the large blob run so that it works in isolation
* We can't use the `localAddress` for the location header of the resumable upload
(this breaks in YAML tests because GCS is using a loopback port forward for the initial request and the
local address will be chosen as the actual Docker container host)

Closes #57026
2020-05-25 11:10:39 +02:00
Armin Braun
0a879b95d1
Save Bounds Checks in BytesReference (#56577) (#56621)
Two spots that allow for some optimization:

* We are often creating a composite reference of just a single item in
the transport layer => special cased via static constructor to make sure we never do that
   * Also removed the pointless case of an empty composite bytes ref
* `ByteBufferReference` is practically always created from a heap buffer these days, so there
is no point in dealing with all the bounds checks and extra references to sliced buffers from that,
and we can just use the underlying array directly
2020-05-12 20:33:45 +02:00
Tanguy Leroux
35622747fd
Add Minio tests for searchable snapshots (#56112) (#56179)
This commit adds QA tests for searchable snapshot on MinIO,
similarly to what already exist for S3, GCS and Azure.
2020-05-05 11:40:06 +02:00
Yannick Welsch
ba39c261e8 Use streaming reads for GCS (#55506)
To read from GCS repositories we're currently using Google SDK's official BlobReadChannel,
which issues a new request every 2MB (default chunk size for BlobReadChannel) using range
requests, and fully downloads the chunk before exposing it to the returned InputStream. This
means that the SDK issues an awfully high number of requests to download large blobs.
Increasing the chunk size is not an option, as that will mean that an awfully high amount of
heap memory will be consumed by the download process.

The Google SDK does not provide the right abstractions for a streaming download. This PR
uses the lower-level primitives of the SDK to implement a streaming download, similar to what
S3's SDK does.

Also closes #55505
2020-04-21 13:22:26 +02:00
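The lower-level primitive the SDK does offer is the ReadChannel returned by Storage.reader(), which can be positioned with seek() and wrapped in an InputStream; a minimal sketch (bucket/blob names and the offset are made up, and this is not the plugin's implementation):

```java
import com.google.cloud.ReadChannel;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;
import java.io.InputStream;
import java.nio.channels.Channels;

// Minimal sketch: stream a GCS blob through a ReadChannel instead of downloading
// fixed 2MB chunks up front; seek() gives ranged reads.
public class GcsStreamingRead {
    public static void main(String[] args) throws Exception {
        Storage storage = StorageOptions.getDefaultInstance().getService();
        try (ReadChannel reader = storage.reader(BlobId.of("my-bucket", "my-blob"))) {
            reader.seek(1024); // start the read at an arbitrary byte offset
            try (InputStream in = Channels.newInputStream(reader)) {
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    // consume the streamed bytes as they arrive
                }
            }
        }
    }
}
```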
Yannick Welsch
b9da307cd1 Add GCS support for searchable snapshots (#55403)
Adds ranged read support for GCS repositories in order to enable searchable snapshot support
for GCS.

As part of this PR, I've extracted some of the test infrastructure to make sure that
GoogleCloudStorageBlobContainerRetriesTests and S3BlobContainerRetriesTests are covering
similar test (as I saw those diverging in what they cover)
2020-04-20 13:02:59 +02:00
Rory Hunter
a5b545b2a0
Use LTS version of Ubuntu in Dockerfiles (#55370)
We have some Dockerfiles that reference Ubuntu 19.04, which is not an LTS
version and now appears to have been retired from the Ubuntu repositories.
Switch to 18.04, which is the current long-term support version. This
also requires a switch from OpenJDK 12 to 11.

Also change a usage of 16.04 to 18.04, for consistency.
2020-04-17 16:14:14 -04:00