61144 commits

Author SHA1 Message Date
Brian Seeders
945a13df29
[buildkite] Do collapsing annotations for Terrazzo pipelines as well (#101241)
(cherry picked from commit 24ef517355)
2023-10-23 16:21:21 -04:00
Brian Seeders
5e7ca59572
[buildkite] Increase release-tests timeout (#101238)
(cherry picked from commit 4d10ea1849)
2023-10-23 16:06:36 -04:00
Brian Seeders
53c09f49bf
Disable packaging tests in Jenkins (#101088) (#101223)
(cherry picked from commit 5d14bca37d)
2023-10-23 11:36:27 -04:00
Iraklis Psaroudakis
cadd3d3a0d
[7.17] Remove translog from bwc testRecovery (#101068) (#101104)
* Remove translog from bwc testRecovery (#101068)

When the test was trying to test recovering translog ops,
since we flush on close/shutdown, it failed because it never
recovered any translog ops.

The code for translog recovery is irrelevant due to that and
this PR proposes to remove it.

Alternatively, we could simulate killing nodes forcibly before
upgrading, but (a) that seems out of the ordinary for upgrades,
and (b) in trying that, it did not consistently pass the test
because sometimes the flush on close still happened.

Fixes #52031

* Fix
2023-10-19 05:47:09 -04:00
Brian Seeders
92682277ba
[buildkite] Remove idp-fixture docker-compose wait and bump check task agent memory (#101059) (#101075) 2023-10-18 12:04:18 -04:00
Mary Gouseti
b249462795
[7.17] WaitForSnapshotStep verifies if the index belongs to the latest snapshot of that SLM policy (#100911) (#101030)
* `WaitForSnapshotStep` verifies if the index belongs to the latest snapshot of that SLM policy (#100911)

The `WaitForSnapshotStep` used to check if the SLM policy has been
executed after the index has entered the delete phase, but it did not
check if the SLM policy included this index.

The result of this is that if the user used an SLM policy that did not
include this index, when the index would enter the
`WaitForSnapshotStep`, it would wait for a snapshot to be taken, a
snapshot that would not include the index, and then ILM would delete the
index.

See the exact reproduction path:
https://github.com/elastic/elasticsearch/issues/57809

**Solution** After it finds a successful SLM run, this PR verifies
whether the snapshot taken by SLM contains this index. If not, it throws
an error; otherwise it proceeds.

ILM explain will report:

```
"step_info": {
        "type": "illegal_state_exception",
        "reason": "the last successful snapshot of policy 'hourly-snapshots' does not include index '.ds-my-other-stream-2023.10.16-000001'"
      }
```
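
The check described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual `WaitForSnapshotStep` code: `SnapshotInfo` and `check_index_in_snapshot` are hypothetical names, and only the error message wording is taken from the report shown above.

```python
from dataclasses import dataclass


@dataclass
class SnapshotInfo:
    """Hypothetical stand-in for the last successful snapshot of an SLM policy."""
    policy: str
    indices: frozenset


def check_index_in_snapshot(snapshot: SnapshotInfo, index: str) -> None:
    """Raise if the policy's last successful snapshot does not contain the index."""
    if index not in snapshot.indices:
        # Mirrors the "illegal_state_exception" reason reported by ILM explain.
        raise RuntimeError(
            f"the last successful snapshot of policy '{snapshot.policy}' "
            f"does not include index '{index}'"
        )
```

Under this sketch, the step would only proceed to deletion when the call returns without raising.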

**Backwards compatibility concerns** In this PR, the
`WaitForSnapshotStep` changed from `ClusterStateWaitStep` to
`AsyncWaitStep`. We do not expect this to cause an issue. This was
tested manually with the following steps:
- Run a master node with the old version.
- While ILM is executing `wait-for-snapshot`, shut down the node.
- Start the node again with the new version of ES.
- ES was able to pick up the step and continue with the new code.

We believe that this covers bwc concerns.

Fixes: https://github.com/elastic/elasticsearch/issues/57809
(cherry picked from commit 5697fcf594)
2023-10-18 12:33:22 +03:00
Mark Vieira
28c0147a06
Update task name on ARM platform support CI job 2023-10-17 15:21:12 -07:00
Brian Seeders
23db6a84f0
[CI] Disable jenkins platform-support jobs, and re-enable all Buildkite periodic pipelines (#100630) (#101004)
(cherry picked from commit b280a63eb7)
2023-10-17 13:17:00 -04:00
Rene Groeschke
468bef1b9e
[7.17] Update gradle wrapper to 8.4 (#99856) (#100926)
* Remove deprecated forConfigurationTime usage
2023-10-17 13:44:48 +02:00
Ignacio Vera
56f8e477a7
Add tolerance to ExtendedStatsAggregatorTests#testSummationAccuracy (#100917) (#100939) 2023-10-17 02:47:25 -04:00
Dianna Hohensee
5a8b6fc972
[7.17] Stabilize testRerouteRecovery throttle testing (#100788) (#100858)
Refactor testRerouteRecovery, pulling out testing of shard recovery
throttling into separate targeted tests. Now there are two additional
tests, one testing source node throttling, and another testing target
node throttling. Throttling both nodes at once leads to primarily the
source node registering throttling, while the target node mostly has
no cause to instigate throttling.

(cherry picked from commit 323d9366df)
2023-10-16 09:03:15 -04:00
Rene Groeschke
a080bb2bbe
[7.17] Update gradle wrapper to 8.3 (#97838) (#100715)
* Update gradle wrapper to 8.3 (#97838)

Gradle now fully supports compiling, testing and running on Java 20.
Among other general performance improvements, this release introduces the --test-dry-run command line option, which allows checking whether tests are filtered by Gradle.
This required updating the nebula ospackage plugin, as setuid was broken in Gradle 8.3.

(cherry picked from commit b23e000c30)

# Conflicts:
#	build-tools-internal/src/integTest/groovy/org/elasticsearch/gradle/internal/test/rest/LegacyYamlRestCompatTestPluginFuncTest.groovy
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/ElasticsearchJavaModulePathPlugin.java
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/test/rest/compat/compat/AbstractYamlRestCompatTestPlugin.java
#	build-tools-internal/src/main/resources/minimumGradleVersion
#	gradle/verification-metadata.xml
#	gradle/wrapper/gradle-wrapper.jar
#	gradlew
#	x-pack/plugin/watcher/qa/with-monitoring/src/javaRestTest/java/org/elasticsearch/smoketest/MonitoringWithWatcherRestIT.java

* [7.17] Use patched nebula os package gradle plugin

* Update testingconvention precommit integ test
2023-10-16 06:18:08 -04:00
Mark Vieira
9f8f12d395
Disable BWC tests in encryption at rest CI job (#100784) (#100787) 2023-10-12 20:02:06 -04:00
Yang Wang
9e7713a866
[7.17] Log a debug level message for deleting non-existing snapshot (#100479) (#100509)
* Log a debug level message for deleting non-existing snapshot (#100479)

The new message makes it easier to pair with the "deleting snapshots"
log message at info level.

(cherry picked from commit 2cfdb7a92d)

# Conflicts:
#	server/src/main/java/org/elasticsearch/snapshots/SnapshotsService.java

* spotless

* compilation

---------

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2023-10-11 19:43:42 -04:00
Brian Seeders
0c431cecfb
[buildkite] Pin version of bun being used due to bug in latest (1.0.5) (#100720) (#100723)
(cherry picked from commit 00c54aea90)
2023-10-11 15:09:09 -04:00
Brian Seeders
1500e06da0
[buildkite] Add more logging and debug information to PR pipeline generation (#100709) (#100712)
(cherry picked from commit 81a441e636)
2023-10-11 14:15:17 -04:00
Mark Vieira
69e8625dcf
Exclude BWC tests in platform support testing matrix (#100643) (#100702)
# Conflicts:
#	.buildkite/pipelines/periodic-platform-support.yml
2023-10-11 12:56:47 -04:00
Mark Vieira
ac4022c729
Capture JVM crash dump logs in uploaded artifact bundle (#100627) (#100632) 2023-10-10 15:24:26 -04:00
Jason Bryan
c92d12d374
Prune changelogs after 7.17.14 release 2023-10-10 12:52:48 -04:00
Jason Bryan
d92ea26308
Bump versions after 7.17.14 release 2023-10-10 12:52:02 -04:00
Brandon Morelli
202296a70f
Update 7.17.14.asciidoc (#100591) 2023-10-10 06:07:51 -07:00
Brian Seeders
7ef9572438
Add healthcheck for shibboleth-idp in idp-fixture (again) (#100461) (#100525)
(cherry picked from commit d0c263bfa6)

# Conflicts:
#	x-pack/test/idp-fixture/build.gradle
2023-10-09 16:53:43 -04:00
Jason Bryan
add28eb04c
Update docs for v7.17.14 release (#100462) 2023-10-09 14:42:41 -04:00
Ed Savage
ccb5c4d0da
Mute failing NodeConnectionsServiceTests/testEventuallyConnectsOnlyToAppliedNodes (#100495)
Test
`NodeConnectionsServiceTests/testEventuallyConnectsOnlyToAppliedNodes`
fails with

```
java.lang.AssertionError: not connected to {node_21}{21}{Smg5SSzlSAWdhBrP63KjTQ}{0.0.0.0}{0.0.0.0:7}{dmsw}

  at __randomizedtesting.SeedInfo.seed([963D3EFA5F943C6E:6D1F74C5990D6DDD]:0)
  at org.junit.Assert.fail(Assert.java:88)
  at org.junit.Assert.assertTrue(Assert.java:41)
  at org.elasticsearch.cluster.NodeConnectionsServiceTests.assertConnected(NodeConnectionsServiceTests.java:507)
  at org.elasticsearch.cluster.NodeConnectionsServiceTests.assertConnectedExactlyToNodes(NodeConnectionsServiceTests.java:501)
  at org.elasticsearch.cluster.NodeConnectionsServiceTests.assertConnectedExactlyToNodes(NodeConnectionsServiceTests.java:497)
  at org.elasticsearch.cluster.NodeConnectionsServiceTests.lambda$testEventuallyConnectsOnlyToAppliedNodes$6(NodeConnectionsServiceTests.java:152)
  at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1143)
  at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1116)
```

Mute it.

Relates #100493
2023-10-09 06:49:23 -04:00
Mark Vieira
80f31138bc
Add pull request check for validating changelogs (#100449) 2023-10-06 16:11:49 -07:00
Brian Seeders
71f5f2daa8
[buildkite] Fix backport PR pipeline generation (#100427) (#100430)
(cherry picked from commit f4d53bcc61)

# Conflicts:
#	.buildkite/scripts/pull-request/pipeline.ts
2023-10-06 11:56:25 -04:00
Mark Vieira
774e3bfa4d
Ensure all docker compose tasks are disabled when appropriate (#100355)
We mistakenly weren't skipping `DockerBuild` tasks when docker is
unavailable on the host machine.
2023-10-05 15:28:57 -04:00
Ed Savage
baad6f6e6a
[7.17][ML] defend against negative datafeed start times (#100332)
* [ML] Defend against negative datafeed start times (#100284)

A negative start time in the datafeed can cause significant disruption
to an entire cluster. This PR checks that the start time is greater
than or equal to 0 and throws an exception otherwise.
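
As a rough illustration of the guard described above (a Python sketch, not the actual ML datafeed code; the function name and error text are hypothetical):

```python
def validate_datafeed_start_time(start_time_ms: int) -> int:
    """Return the start time unchanged, rejecting negative values up front."""
    if start_time_ms < 0:
        # Hypothetical message; the real exception type and wording may differ.
        raise ValueError(
            f"datafeed start time must be >= 0, got {start_time_ms}"
        )
    return start_time_ms
```

Failing fast at validation time keeps a bad value from ever reaching the code that runs the datafeed.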

* Adjust backported test for 7.17
2023-10-05 08:03:00 -04:00
Mark Vieira
1e8696f2b2
Never support Docker build features on Windows (#100218) (#100225) 2023-10-03 15:40:00 -04:00
Joe Gallo
2f8fa89fe3
Refactor WriteableIngestDocument (#99324) (#100224) 2023-10-03 15:32:07 -04:00
Mark Vieira
f1f2e8d289
Revert back to using docker compose v1 CLI for test fixtures (#100206) (#100216)
# Conflicts:
#	build-tools-internal/src/main/java/org/elasticsearch/gradle/internal/testfixtures/TestFixturesPlugin.java
2023-10-03 14:20:49 -04:00
Mark Vieira
3dbba882f4
Mute IndexRecoveryIT.testRerouteRecovery (#100209) (#100211)
Mute failing test
2023-10-03 13:46:59 -04:00
Brian Seeders
4affa8d0a3
Enable Buildkite DRA workflows and delete Jenkins ones (#100213) (#100215)
(cherry picked from commit 1a969c3e67)

# Conflicts:
#	.ci/jobs.t/elastic+elasticsearch+dra-staging-update.yml
2023-10-03 13:31:28 -04:00
James Baiera
eaef3a9d1d
Validate enrich index before completing policy execution (#100106) (#100160)
This PR adds a validation step to the end of an enrich policy run to ensure the integrity of the
enrich index that is about to be promoted.

(cherry picked from commit 225db3190a)

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2023-10-03 11:25:43 -04:00
Mark Vieira
304781b1ec
Add Suse 15.5 to docker ignore list (#100156) (#100166)
# Conflicts:
#	.ci/dockerOnLinuxExclusions
2023-10-02 19:09:43 -04:00
James Baiera
364c340a73
Show concrete error when enrich index not exist rather than NPE (#99604) (#100155)
There should be a check for the NullPointerException case that throws an index not found exception
in the response, so the user can understand what happened to the enrich index

---------

Co-authored-by: James Baiera <james.baiera@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
(cherry picked from commit ccc896d128)

# Conflicts:
#	x-pack/plugin/enrich/src/main/java/org/elasticsearch/xpack/enrich/EnrichCache.java
#	x-pack/plugin/enrich/src/test/java/org/elasticsearch/xpack/enrich/EnrichCacheTests.java

Co-authored-by: puppylpg <shininglhb@163.com>
2023-10-02 16:54:07 -04:00
Brian Seeders
7288c24334
[buildkite] Use local SSDs for platform-support tests (#100098) (#100100)
(cherry picked from commit ab33eb0bdc)
2023-09-29 16:23:17 -04:00
Brian Seeders
21dfdaf083
[7.17] [buildkite] Add third-party tests to periodic pipeline (#99376) (#100095) 2023-09-29 16:11:46 -04:00
Brian Seeders
adb59f59d3
[7.17] [buildkite] Migrate pull-request pipelines from Jenkins (#99449) (#100091) 2023-09-29 15:27:18 -04:00
Albert Zaharovits
8857198c9a
[7.17] Update docker compose gradle plugin to 0.17.5 (#100079)
The previous docker compose plugin version trips on
the following docker-compose cli version

docker-compose version --short
2.22.0-desktop.2

(because of the trailing "desktop.2" part),
which is what's installed with the latest Docker Desktop on macOS.
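
How the plugin parses the version is not shown here. As a sketch of the failure mode only, a parser that reads just the leading numeric components tolerates a suffix such as "-desktop.2" (Python, with a hypothetical function name):

```python
import re

# Leading "major.minor.patch", optionally prefixed with "v"; any suffix is ignored.
_VERSION_PREFIX = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)")


def parse_compose_version(raw: str) -> tuple:
    """Extract (major, minor, patch) from `docker-compose version --short` output."""
    match = _VERSION_PREFIX.match(raw.strip())
    if match is None:
        raise ValueError(f"unrecognised docker-compose version: {raw!r}")
    return tuple(int(part) for part in match.groups())
```

A parser that instead splits on "." and converts every piece to an integer would fail on the "0-desktop" component of the version string above.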

Backport of #100059
2023-09-29 19:37:16 +03:00
James Rodewig
37ba70a17f
[7.17] [DOCS] Add security update to 7.17.13 release notes (#99949) (#100036) 2023-09-28 17:03:29 -04:00
Rene Groeschke
2a99b1d27c
Remove debug gradle output (#99959) 2023-09-27 18:34:52 +02:00
Rene Groeschke
5afd06ae57
[7.17] Update Gradle Wrapper to 8.2 (#96686) (#97484)
* Update Gradle Wrapper to 8.2 (#96686)

- Convention usage has been deprecated and was fixed in our build files
- Fix test dependencies and deprecation
2023-09-27 08:46:44 +02:00
Joe Gallo
4bec04037b
Provide better error messages from kv processor (#99493) (#99919) 2023-09-26 11:30:09 -04:00
Ryan Ernst
054e8219ca
Upgrade bundled JDK to Java 21 (#99724) (#99756)
Java 21 is now GA making Java 20 EOL.
2023-09-21 10:32:51 -04:00
Mark Vieira
0ae698b203
Update Gradle Enterprise plugin to 3.14.1 (#98551) (#99254) 2023-09-20 09:32:42 +02:00
Brian Seeders
5990f466c0
[buildkite] Add elastic-agent for monitoring buildkite agents (#99637) (#99675)
(cherry picked from commit b99702237f)
2023-09-19 11:35:53 -04:00
David Turner
84f632f254
Close expired search contexts on SEARCH thread (#99660)
In a production cluster, I observed the `[scheduler]` thread stuck for a
while trying to delete index files that became unreferenced while
closing a search context. We shouldn't be doing I/O on the scheduler
thread. This commit moves it to a `SEARCH` thread instead.
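
The general pattern, keeping a scheduler thread free of I/O by handing the work to a worker pool, can be sketched as follows. This is a Python illustration only: `search_pool`, `close_context` and `reap_expired` are hypothetical names and do not correspond to the Elasticsearch implementation.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical worker pool standing in for the SEARCH thread pool.
search_pool = ThreadPoolExecutor(max_workers=2, thread_name_prefix="search")


def close_context(context_id, closed_on):
    """Stand-in for closing a search context, which may delete files (slow I/O)."""
    closed_on[context_id] = threading.current_thread().name


def reap_expired(expired_ids, closed_on):
    """Runs on the scheduler thread: it only submits work and does no I/O itself."""
    return [
        search_pool.submit(close_context, context_id, closed_on)
        for context_id in expired_ids
    ]
```

The scheduler returns as soon as the tasks are queued, so a slow file deletion delays only a worker thread.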
2023-09-19 15:26:26 +01:00
Brian Seeders
25474c3503
[buildkite] Add more memory to platform-support job agents (#99594) (#99596) 2023-09-14 15:33:37 -04:00
Simon Cooper
eb51d7b890
Fix deadlock between Cache.put and invalidateAll (#99480) (#99580)
The invalidateAll method is taking out the lru lock and segment locks in a different order to the put method, when the put is replacing an existing value. This results in a deadlock between the two methods as they try to swap locks. This fixes it by making sure invalidateAll takes out locks in the same order as put.

This is difficult to test because the put needs to be replacing an existing value, and invalidateAll clears the cache, resulting in subsequent puts not hitting the deadlock condition. A test that overrides some internal implementations to expose this particular deadlock will be coming later.
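
The fix relies on consistent lock ordering. A minimal Python sketch of the idea follows, using a toy cache with one segment lock and one LRU lock. The class, its names, and the particular order chosen (segment first, then LRU) are assumptions for illustration; the commit only states that both methods now use the same order.

```python
import threading


class TinyCache:
    """Toy cache with a single segment lock and a single LRU lock (hypothetical)."""

    def __init__(self):
        self._segment_lock = threading.Lock()
        self._lru_lock = threading.Lock()
        self._data = {}

    def put(self, key, value):
        # Assumed order: segment lock first, then LRU lock.
        with self._segment_lock:
            with self._lru_lock:
                self._data[key] = value

    def invalidate_all(self):
        # Same order as put(). Taking the LRU lock first here would let each
        # thread hold one lock while waiting for the other.
        with self._segment_lock:
            with self._lru_lock:
                self._data.clear()

    def size(self):
        with self._segment_lock:
            return len(self._data)
```

Because every code path acquires the locks in one global order, no thread can hold the second lock while waiting for the first, which rules out this kind of deadlock.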
2023-09-14 11:27:04 -04:00