As an improvement from #15668 / #15700, rather than having one
dedicated side-car scheduling job per pipeline, we move to a single
scheduling job. Various pipelines that need triggering with different
schedules now live under each schedule in the new pipeline.
This reduces the number of jobs we have to maintain in YAML.
This commit enhances the functionality introduced in #15668 and #15700
by allowing a single Buildkite scheduling job to trigger several
pipelines, in addition to multiple branches which it already does.
We rename the env var PIPELINE_TO_TRIGGER to PIPELINES_TO_TRIGGER,
which now supports comma-separated values.
This enhancement can be useful for pipelines like JDK matrix which
have variants (Linux and Windows) that we want to trigger with a single
scheduling job, thus reducing unnecessary entries in catalog-info.
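As a rough illustration of the comma-separated handling described above, here is a minimal Python sketch. Only the env var name `PIPELINES_TO_TRIGGER` comes from this commit; the function names, the label text, and the `main` branch default are assumptions made for the example and do not reproduce the actual trigger script.

```python
import os
from typing import Dict, List


def parse_pipelines(raw: str) -> List[str]:
    """Split a comma-separated value into pipeline slugs, dropping blanks."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def build_trigger_steps(pipelines: List[str], branches: List[str]) -> List[Dict]:
    """Return one Buildkite trigger step per (pipeline, branch) pair."""
    steps = []
    for pipeline in pipelines:
        for branch in branches:
            steps.append({
                "trigger": pipeline,
                "label": f"Trigger {pipeline} for branch {branch}",  # hypothetical label
                "build": {"branch": branch},
            })
    return steps


if __name__ == "__main__":
    # The branch list is hard-coded here only to keep the sketch short.
    pipelines = parse_pipelines(os.environ.get("PIPELINES_TO_TRIGGER", ""))
    for step in build_trigger_steps(pipelines, ["main"]):
        print(step)
```

With two pipelines and two branches this yields four trigger steps, which is the multiplication a single scheduling job is meant to cover.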
This commit fixes a few bugs introduced in #15668 related to paths for
the calling script. We also stop restricting execution to the
main branch (to facilitate e.g. tests from PRs) and, finally, remove
the async clause, which is not needed, since by default BK steps are
run in parallel.
This commit is the first to make use of #15627 to remove hard-coded
branches for the DRA Snapshot build schedule.
With this pattern, we only need to keep `ci/branches.json` up to date
as versions evolve, and no longer need to update or maintain hard-coded
branches in `catalog-info.yaml`.
Once this is verified working, we'll add a corresponding schedule
pipeline (in `catalog-info.yaml`) for the JDK matrix job.
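A minimal Python sketch of deriving the branch list from a JSON file instead of hard-coding it. The JSON layout used here (a top-level `branches` array of objects with a `branch` key) and the sample branch names are assumptions for illustration; check the real `ci/branches.json` before relying on them.

```python
import json
from typing import List


def branches_from_json(text: str) -> List[str]:
    """Extract branch names from a JSON document with the assumed layout."""
    data = json.loads(text)
    return [entry["branch"] for entry in data.get("branches", [])]


# Hypothetical sample content, not copied from the repository.
SAMPLE = '{"branches": [{"branch": "main"}, {"branch": "8.12"}]}'

if __name__ == "__main__":
    for branch in branches_from_json(SAMPLE):
        print(branch)
```

The point of the pattern is that adding or retiring a release branch becomes a one-line change in the JSON file rather than an edit to every schedule definition.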
Relates: https://github.com/elastic/ingest-dev/issues/2664
Now that CI VM images are pre-provisioned with various flavors of
Java 21, we add the option for the corresponding CI job.
Adoptium 17 remains the default pre-selected option.
Relates https://github.com/elastic/ci-agent-images/pull/463
Introduces a new interface named SchedulerService that abstracts the ScheduledExecutorService used to execute the DLQ segment flushes. Abstracting from time benefits testing: the test doesn't have to wait for things to happen, because they can be triggered synchronously.
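The actual interface is Java; the sketch below uses Python only to illustrate the idea of injecting a scheduler so that tests can drive flushes synchronously. Apart from the `SchedulerService` name, every class and method here (`repeated_action`, `tick`, `Writer`, and so on) is hypothetical and does not mirror the real API.

```python
import threading


class SchedulerService:
    """Hypothetical abstraction: run `action` repeatedly until shut down."""

    def repeated_action(self, action):
        raise NotImplementedError

    def shutdown(self):
        raise NotImplementedError


class TimerScheduler(SchedulerService):
    """Production-style variant backed by a real background thread."""

    def __init__(self, interval_seconds):
        self._interval = interval_seconds
        self._stop = threading.Event()
        self._thread = None

    def repeated_action(self, action):
        def loop():
            # Event.wait returns False on timeout and True once shutdown() sets it.
            while not self._stop.wait(self._interval):
                action()

        self._thread = threading.Thread(target=loop, daemon=True)
        self._thread.start()

    def shutdown(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()


class SynchronousScheduler(SchedulerService):
    """Test variant: nothing runs until the test calls tick()."""

    def __init__(self):
        self._action = None

    def repeated_action(self, action):
        self._action = action

    def tick(self):
        if self._action is not None:
            self._action()

    def shutdown(self):
        self._action = None


class Writer:
    """Stand-in for a component that flushes periodically."""

    def __init__(self, scheduler):
        self.flush_count = 0
        self._scheduler = scheduler
        scheduler.repeated_action(self._flush)

    def _flush(self):
        self.flush_count += 1

    def close(self):
        self._scheduler.shutdown()
```

A test would construct `Writer(SynchronousScheduler())` and call `tick()` exactly when it wants a flush, with no sleeps or timing assumptions.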
Update the test_plugins pipeline script to execute only the unit tests.
Use the vendored JRuby for every Ruby-related task, such as running `bundler` and `gem`.
Temporarily comments out plugins that need to be fixed and already fail on their Travis CI.
Runs the tests of tier1 input plugins in a VM instead of on the Buildkite agent.
In DLQ unit tests the DLQ writer is sometimes started explicitly without starting the segment flushers. In such cases the test's logs contain exceptions, which could lead one to think that the test fails silently.
Avoid invoking scheduledFlusher's shutdown when it has not been started (such behaviour is present only in tests).
This commit adds the compatibility tier for the Exhaustive tests suite.
Specifically, we introduce two new groups (running in parallel) for Linux and Windows compat tests.
Linux picks one OS per family from [^1], and likewise Windows picks one of the three available choices from the same file.
We also support a manual override, if the user chooses, by setting `LINUX_OS` or `WINDOWS_OS` as env vars in the Buildkite build prompt (in this case there is no randomization, and only one OS can be defined for Linux and Windows respectively).
For example:
```
LINUX_OS=rhel-9
WINDOWS_OS=windows-2016
```
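The selection logic described above (random pick per Linux family, random pick among Windows images, or a single manual override) could look roughly like the Python sketch below. The `VM_IMAGES` structure and the image names in it are hypothetical stand-ins for the referenced JSON file, whose real layout may differ; only the `LINUX_OS` and `WINDOWS_OS` env var names come from this commit.

```python
import os
import random

# Hypothetical stand-in for vm-images.json; layout and names are assumptions.
VM_IMAGES = {
    "linux": {
        "ubuntu": ["ubuntu-2204", "ubuntu-2004"],
        "debian": ["debian-12", "debian-11"],
        "rhel": ["rhel-9", "rhel-8"],
    },
    "windows": ["windows-2022", "windows-2019", "windows-2016"],
}


def pick_linux(images, override=None, rng=random):
    """Return the override alone, or one randomly chosen OS per Linux family."""
    if override:
        return [override]
    return [rng.choice(oses) for _family, oses in sorted(images["linux"].items())]


def pick_windows(images, override=None, rng=random):
    """Return the override alone, or one randomly chosen Windows image."""
    if override:
        return [override]
    return [rng.choice(images["windows"])]


if __name__ == "__main__":
    # An unset or empty env var falls through to the random pick.
    print(pick_linux(VM_IMAGES, os.environ.get("LINUX_OS")))
    print(pick_windows(VM_IMAGES, os.environ.get("WINDOWS_OS")))
```

Passing a seeded `random.Random` as `rng` makes the pick reproducible, which is convenient when testing the generator itself.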
Relates:
- https://github.com/elastic/ingest-dev/issues/1722
[^1]: 4d6bd955e6/.buildkite/scripts/common/vm-images.json
This commit added support to add and remove multiple keystore keys in a single operation. It also fixed the empty value validation for editing existing key values and added ASCII validation for values.
The last remaining Jenkins job prior to BK migration is for
exhaustive tests. The compatibility phase seems to have been failing
since 57dc14c92 with Java 17.0.2.
This commit switches from OpenJDK 17 (whose last release was 17.0.2)
to AdoptiumJDK 17 which actively receives updates and is bundled
in the custom images used by Jenkins.
* Use the Java installed on the BK agent and remove the unnecessary git clone operation, since the repo is already cloned.
* Switch back to a normal VM, since the Logstash BK agent doesn't support docker operations.
This commit adds a maximum default (global) timeout for every pipeline
definition (now that it's possible to define this programmatically in
an RRE).
The default values have been chosen arbitrarily based on intuition
about how much (in the worst case) we should wait for each job to
run until we consider them stuck/failed.
While at it, we update the yaml schema for RREs to point to the
latest commit (rather than a pinned commit that doesn't reflect the
latest changes, e.g. `maximum_default_timeout`).
Relates #15380
This removes the dependency on jackson's dataformat-yaml. Since this library is used in only one place in core (loading the plugin alias definition), that code can be replaced with the underlying snakeyaml.
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
(cherry picked from commit 93d8a9da32)
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
This commit fixes the issue with Jackson > 2.15 and `log.format=json`: "Can not write a field name, expecting a value", by adding default serializers for JRuby's objects.
This commit enables scheduled runs of the JDK matrix pipelines for
both Windows and Linux, once a week, Tuesdays at 1am UTC, using the
pipeline defaults for OS and JDK.
The last part of the Logstash JDK matrix CI migration from Jenkins to
Buildkite is AmazonLinux 2023.
While we have a working image[^1], this is the only step that requires
an agent that runs on AWS.
This commit refactors the builder to support GCP or AWS agents depending
on the OS.
[^1]: https://github.com/elastic/ci-agent-images/pull/441
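One way to picture a builder that picks the agent provider per OS is the small Python sketch below. The `AWS_ONLY_IMAGES` set, the `agent_for` function, and the keys of the returned dict are all hypothetical; the real builder and the agent attributes it emits are not shown in this commit message.

```python
# Hypothetical set of images that are only available on AWS.
AWS_ONLY_IMAGES = {"amazonlinux-2023"}


def agent_for(os_name):
    """Return a simplified agent definition for the given OS image name."""
    provider = "aws" if os_name in AWS_ONLY_IMAGES else "gcp"
    # Real agent definitions carry more provider-specific fields than this.
    return {"provider": provider, "image": os_name}


if __name__ == "__main__":
    for name in ("ubuntu-2204", "amazonlinux-2023"):
        print(agent_for(name))
```

Keeping the provider decision in one function means that adding another AWS-only image is a change to a single set rather than to every step definition.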
Add missing yaml-language-server definition to Buildkite pipeline files
(static and dynamically generated) for consistency and to make errors
easier to spot in editors.
This commit adds a global max timeout of 90min for the supported plugins
Buildkite pipeline. This prevents hanging builds (for 24hrs, which is
the default).
Relates: #15380
Build's maximum_timeout_in_minutes and default_timeout_in_minutes are available now through the catalog-info.yaml file.
As this change was made manually before we implemented this in RRE/Terrazzo, they got reverted to the default value (0); thus, I am raising this PR to get it as it was specified before the upgrade (120 mins.)