Commit graph

112 commits

Author SHA1 Message Date
Cas Donoghue
0df0e991da
Ensure observabilitySRE image is pushed on DRA staging (#17569)
The `artifactDockerObservabilitySRE` gradle task *always* produces a tag with a
`SNAPSHOT` postfix. In the staging pipeline we use the shared
`qualified-version` script for determining the LS version. That script correctly
handles conditionally adding a `SNAPSHOT` postfix which is important for the
tagging scheme for pushing to our container registry. Given the intermediate tag
produced by the gradle task is never pushed anywhere we can update the build
script to ensure the "local" artifact is always referenced with the `SNAPSHOT`
postfix.
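
The difference between the two tags can be sketched as follows. This is a minimal Python illustration, not the project's build script: the function names and the `8.19.0` version are hypothetical, and only the rule itself (the local tag always has `SNAPSHOT`, the pushed tag has it conditionally) comes from the message above.

```python
def local_image_tag(version: str) -> str:
    """Tag of the intermediate image built locally: always has the SNAPSHOT postfix."""
    return f"{version}-SNAPSHOT"


def registry_image_tag(version: str, is_snapshot: bool) -> str:
    """Tag used when pushing: SNAPSHOT is appended only for snapshot builds."""
    return f"{version}-SNAPSHOT" if is_snapshot else version


# Staging build (not a snapshot): the two tags differ, so the build script
# has to reference the local image by its SNAPSHOT tag before re-tagging it.
version = "8.19.0"  # hypothetical version
source = local_image_tag(version)
target = registry_image_tag(version, is_snapshot=False)
print(f"re-tag {source} as {target}")
```

For a snapshot build both functions return the same tag, which is presumably why the mismatch only surfaced on staging.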
2025-04-16 10:06:55 -07:00
Cas Donoghue
2b44f6b60e
Fix pull request pipeline definition for buildkite (#17552)
When the fedramp high feature branch was merged into 8.x the PR pipeline
accidentally duplicated the top level `steps` key. This was a mistake and is
causing issues generating exhaustive test pipeline definition. This commit fixes
the bug by ensuring there is a single `steps` key that defines all the steps in
the pipeline.
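
A duplicated top-level key like this is easy to miss because some YAML loaders accept the repeated key and keep only one of the values. A rough check, assuming a simple line-based scan is good enough for a pipeline file (the function name and sample pipeline are hypothetical, and this is not a full YAML parser):

```python
import re
from collections import Counter
from typing import List

# A top-level key starts in column 0 and is followed by a colon.
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*):")


def duplicate_top_level_keys(yaml_text: str) -> List[str]:
    """Return the top-level keys that appear more than once."""
    counts = Counter()
    for line in yaml_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            counts[match.group(1)] += 1
    return sorted(key for key, seen in counts.items() if seen > 1)


pipeline = """\
env:
  FOO: bar
steps:
  - label: unit tests
steps:
  - label: integration tests
"""
print(duplicate_top_level_keys(pipeline))  # ['steps']
```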
2025-04-14 10:15:12 -07:00
Cas Donoghue
0b1d29912a
Merge feature branch for observability SRE image creation into 8.x (#17541)
* Provision automatic test runs for ruby/java unit tests and integration tests with fips mode (#17029)

* Run ruby unit tests under FIPS mode

This commit shows a proposed pattern for running automated tests for logstash in
FIPS mode. It uses a new identifier in gradle for conditionally setting
properties to configure fips mode. The tests are run in a container
representative of the base image the final artifacts will be built from.

* Move everything from qa/fips -> x-pack

This commit moves test setup/config under x-pack dir.

* Extend test pipelines for fips mode to java unit tests and integration

* Add git to container for gradle

* move fips-mode gradle hooks to x-pack

* Skip license check for now

---------

Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>

* Split fips integration tests into two steps (#17038)

* Split fips integration tests into two steps

The integration tests suite takes about 40 minutes. This is far too slow for
reasonable feedback on a PR. This commit follows the pattern for the non-fips
integration tests whereby the tests are split into two sections that can run in
parallel across two steps. This should halve the feedback time.

The logic for getting a list of specs files to run has been extracted to a
shared shell script for use here and in the integration tests shell script.

* Use shared function for splitting integration tests

The logic for getting a list of specs to run has been extracted so that it can
be shared across fips and non fips integration test modes. This commit updates
the non fips integration tests to use the shared function.

* fix typo in helper name (kebab case, not snake)

* Escape $ so buildkite upload does not try to interpolate

* Wrap integration tests in shell script to avoid BK interpolation

* Move entrypoint for running integration tests inside docker

* Skip offline pack manager tests when running in fips mode (#17160)

This commit introduces a pattern for skipping tests we do not want to run in
fips mode. In this case the plugin manager tests rely on using
bundler/net-http/openssl which is not configured to be run with bouncycastle
fips providers.

* Get tests running in FIPS environment (#17096)

* Modify FIPS test runner environment for integration tests

This commit makes two small changes to the dockerfile used to define the fips
test environment. Specifically it adds curl (which is required by integration
tests), make (which is required by test setup), adds a c compiler (gcc and glibc
for integration tests which compile a small c program) and turns off debug ssl
logging, as it is extremely noisy in logs and breaks some assumptions in
tests about logfile content.

Closes https://github.com/elastic/ingest-dev/issues/5074

* Do not run test env as root

The elastic stack is not meant to be run as root. This commit updates the test
environment to provision a non root user and have the container context execute
under that provisioned user.

Closes https://github.com/elastic/ingest-dev/issues/5088

* Skip unit tests that reach out to rubygems for fips mode

The `update` test setup reaches out to rubygems with net/http which is
incompatible with our use of openssl in fips mode. This commit skips those tests
when running under fips.

See https://github.com/elastic/ingest-dev/issues/5071

* Work around random data request limits in BCFIPS

This commit changes test setup to make chunked calls to random data generation
in order to work around a limit in fips mode.

See https://github.com/elastic/ingest-dev/issues/5072 for details.
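
The chunking idea can be sketched like this. It is a Python illustration only: the real test setup is not shown in this log, and `MAX_REQUEST_BYTES` is a hypothetical value, since the actual BCFIPS limit is not stated here.

```python
import os

# Hypothetical per-call limit; the real limit is described in the linked issue.
MAX_REQUEST_BYTES = 4096


def random_bytes_chunked(total: int, max_request: int = MAX_REQUEST_BYTES) -> bytes:
    """Collect `total` random bytes without asking for more than `max_request` per call."""
    if total < 0 or max_request < 1:
        raise ValueError("total must be >= 0 and max_request must be >= 1")
    chunks = []
    remaining = total
    while remaining > 0:
        size = min(remaining, max_request)
        chunks.append(os.urandom(size))
        remaining -= size
    return b"".join(chunks)


data = random_bytes_chunked(10_000)
print(len(data))  # 10000
```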

* Skip tests validating openssl defaults

Openssl will not be used when running under FIPS mode. The test setup and tests
themselves were failing when running in FIPS mode. This commit skips the tests
that are covering behavior that will be disabled.

See https://github.com/elastic/ingest-dev/issues/5069

* Skip tests that require pluginmanager to install plugins

This commit skips tests that rely on using the pluginmanager to install plugins
during tests which require reaching out to rubygems.

See https://github.com/elastic/ingest-dev/issues/5108

* Skip prepare offline pack integration tests in fips mode

The offline pack tests rely on pluginmanager using the net-http library for
resolving deps. This will not operate under fips mode. Skip when running in fips
mode.

See https://github.com/elastic/ingest-dev/issues/5109

* Ensure a gem executable is on path for test setup

This commit modifies the generate-gems script to ensure that a `gem` executable
is on the path. If there is not one on the test runner, then use the one bundled
with vendored jruby.
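
The fallback logic amounts to the following. This is a Python sketch rather than the generate-gems script itself; the function name and the `vendor/jruby/bin/gem` location are assumptions made for illustration.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def resolve_gem_executable(vendored_gem: Path, search_path: Optional[str] = None) -> Optional[str]:
    """Prefer a `gem` found on the search path; otherwise fall back to the vendored one."""
    on_path = shutil.which("gem", path=search_path)  # search_path=None means "use $PATH"
    if on_path is not None:
        return on_path
    if vendored_gem.is_file() and os.access(vendored_gem, os.X_OK):
        return str(vendored_gem)
    return None


# Hypothetical location of the gem launcher inside the vendored JRuby.
vendored = Path("vendor/jruby/bin/gem")
print(resolve_gem_executable(vendored) or "no gem executable available")
```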

* Skip webserver specs when running in FIPS mode

This commit skips the existing webserver tests. We have some options and need to
understand some requirements for the webserver functionality for fips mode. The
 https://github.com/elastic/ingest-dev/issues/5110 issue has a ton of details.

* Skip cli `remove` integration tests for FIPS

This commit skips tests that are running `remove` action for the pluginmanager.
These require reaching out to rubygems which is not available in FIPS mode.
These tests were added post initial integration tests scoping work but are
clearly requiring skips for FIPS mode.

* Add openssl package to FIPS testing env container

The setup script for filebeats requires an openssl executable. This commit
updates the testing container with this tool.

See https://github.com/elastic/ingest-dev/issues/5107

* Re-introduce retries for FIPS tests now that we are in a passing state

* Backport 17203 and 17267 fedramp8x (#17271)

* Pluginmanager clean after mutate (#17203)

* pluginmanager: always clean after mutate

* pluginmanager: don't skip updating plugins installed with --version

* pr feedback

(cherry picked from commit 8c96913807)

* Pluginmanager install preserve (#17267)

* tests: integration tests for pluginmanager install --preserve

* fix regression where pluginmanager's install --preserve flag didn't

* Add :skip_fips to update_spec.rb

* Run x-pack tests under FIPS mode (#17254)

This commit adds two new CI cells to cover x-pack tests running in FIPS mode.
This ensures we have coverage of these features when running existing x-pack
tests.

* observabilitySRE: docker rake tasks (#17272)

* observabilitySRE: docker rake tasks

* Apply suggestions from code review

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>

* Update rakelib/plugin.rake

* Update rakelib/plugin.rake

* Update docker/Makefile

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>

---------

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>

* Ensure env2yaml dep is properly expressed in observabilitySRE task (#17305)

The `build-from-local-observability-sre-artifacts` task depends on the `env2yaml`
task. This was easy to miss in local development if other images had been built.
This commit updates the makefile to properly define that dependency.

* Add a smoke test for observability SRE container (#17298)

* Add a smoke test for observability SRE container

Add a CI cell to ensure the observability container is building successfully. In
order to show success run a quick smoke test to point out any glaring issues.

This adds some general, low risk plugins for doing quick testing. This will help
developers in debugging as we work on this image.

* Show what is happening when rake fails

* Debug deeper in the stack

Show the stdout/stderr when shelling out fails.

* Debug layers of build tooling

Open3 is not capturing stdout for some reason. Capture it and print to see what is wrong in CI.

* Actually run ls command in docker container 🤦

* Update safe_system based on code review suggestion

* Dynamically generate version for container invocation

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* Refactor smoke test setup to script

Avoid interpolation backflips with buildkite by extracting to a script.

* Split out message surfacing improvement to separate PR.

Moved to: https://github.com/elastic/logstash/pull/17310

* Extract version qualifier into standalone script

* Wait for version-qualifier.sh script to land upstream

Use  https://github.com/elastic/logstash/pull/17311 once it lands and gets
backported to 8.x. For now just hard code version.

---------

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* Configure observability SRE container for FIPS (#17297)

This commit establishes a pattern for configuring the container to run in fips mode.

- Use chainguard-fips
- Copy over java properties from ls tar archive
- Convert default jks to BC keystore
- Configure logstash to use java properties and FIPS config

NOTE: this assumes bouncycastle jars are in the tarball. The
https://github.com/elastic/ingest-dev/issues/5049 ticket will address that.

* Exclude plugin manager and keystore cli from observabilitySRE artifact (#17375)

* Conditionally install bcfips jars when building/testing observabilitySRE (#17359)

* Conditionally install bcfips jars when building for observabilitySRE

This commit implements a pattern for performing specific gradle tasks based on a
newly named "fedrampHighMode" option. This option is used to configure tests to
run with additional configuration specific to the observabilitySRE use case.
Similarly the additional jar dependencies for bouncycastle fips providers are
conditionally installed gated on the "fedrampHighMode" option.

In order to ensure the "fedrampHighMode" option persists through the layers
of sub-processes spawned between gradle and rake we store and respect an
environment variable FEDRAMP_HIGH_MODE. This may be useful generally in building
the docker image.

Try codereview suggestion

* Use gradle pattern for setting properties with env vars

Gradle has a mechanism for setting properties with environment variables
prefixed with `ORG_GRADLE_PROJECT`. This commit updates the gradle tasks to use
that pattern.

See
https://docs.gradle.org/current/userguide/build_environment.html#setting_a_project_property
for details.
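
As the Gradle documentation linked above describes, an environment variable named `ORG_GRADLE_PROJECT_foo` sets the project property `foo`. A small Python sketch of preparing such an environment for a child process (the helper name is hypothetical; the property name `fedrampHighMode` is the one used in this change):

```python
import os
from typing import Dict, Mapping, Optional

GRADLE_PROPERTY_PREFIX = "ORG_GRADLE_PROJECT_"


def gradle_env(properties: Mapping[str, str],
               base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return an environment in which each property is exposed to Gradle."""
    env = dict(os.environ if base is None else base)
    for name, value in properties.items():
        env[f"{GRADLE_PROPERTY_PREFIX}{name}"] = value
    return env


env = gradle_env({"fedrampHighMode": "true"}, base={})
print(env)  # {'ORG_GRADLE_PROJECT_fedrampHighMode': 'true'}
```

Passing the resulting mapping as `env=` to `subprocess.run` would let the property survive nested sub-processes, which is the concern raised above for the gradle-to-rake layers.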

* Pull in latest commits from 8.x and update based on new patterns (#17385)

* Fix empty node stats pipelines (#17185) (#17197)

Fixed an issue where the `/_node/stats` API displayed empty pipeline metrics
when X-Pack monitoring was enabled

(cherry picked from commit 86785815bd)

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

* Update z_rubycheck.rake to no longer inject Xmx1g (#17211)

This allows the environment variable JRUBY_OPTS to be used for setting properties like Xmx
original pr: #16420

(cherry picked from commit f562f37df2)

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

* Improve warning for insufficient file resources for PQ max_bytes (#16656) (#17222)

This commit refactors the `PersistedQueueConfigValidator` class to provide a
more detailed, accurate and actionable warning when pipeline's PQ configs are at
risk of running out of disk space. See
https://github.com/elastic/logstash/issues/14839 for design considerations. The
highlights of the changes include accurately determining the free resources on a
filesystem disk and then providing a breakdown of the usage for each of the
paths configured for a queue.

(cherry picked from commit 062154494a)

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>

* gradle task migrate to the new artifacts-api (#17232) (#17236)

This commit migrates gradle task to the new artifacts-api

- remove dependency on staging artifacts
- all builds use snapshot artifacts
- resolve version from current branch, major.x, previous minor,
   with priority given in that order.

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
(cherry picked from commit 0a745686f6)

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

* tests: ls2ls delay checking until events have been processed (#17167) (#17252)

* tests: ls2ls delay checking until events have been processed

* Make sure upstream sends the expected number of events before checking the expectation with downstream. Remove unnecessary or duplicated logic from the spec.

* Add exception handling in `wait_for_rest_api` to make wait for LS REST API retriable.

---------

Co-authored-by: Mashhur <mashhur.sattorov@elastic.co>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
(cherry picked from commit 73ffa243bf)

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* Additional cleanify changes to ls2ls integ tests (#17246) (#17255)

* Additional cleanify changes to ls2ls integ tests: replace heartbeat-input with reload option, set queue drain to get consistent result.

(cherry picked from commit 1e06eea86e)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* [8.x] Reimplement LogStash::Numeric setting in Java (backport #17127) (#17273)

This is an automatic backport of pull request #17127 done by [Mergify](https://mergify.com).

----

* Reimplement LogStash::Numeric setting in Java (#17127)

Reimplements `LogStash::Setting::Numeric` Ruby setting class into the `org.logstash.settings.NumericSetting` and exposes it through `java_import` as `LogStash::Setting::NumericSetting`.
Updates the rspec tests:
- verifies that `java.lang.IllegalArgumentException` is thrown instead of `ArgumentError`, because that is the kind of exception the Java code throws during verification.

(cherry picked from commit 07a3c8e73b)

* Fixed reference of SettingNumeric class (on main modules were removed)

---------

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* [CI] Health report integration tests use the new artifacts-api (#17274) (#17277)

migrate to the new artifacts-api

(cherry picked from commit feb2b92ba2)

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

* Backport 17203 and 17267 8.x (#17270)

* Pluginmanager clean after mutate (#17203)

* pluginmanager: always clean after mutate

* pluginmanager: don't skip updating plugins installed with --version

* pr feedback

(cherry picked from commit 8c96913807)

* Pluginmanager install preserve (#17267)

* tests: integration tests for pluginmanager install --preserve

* fix regression where pluginmanager's install --preserve flag didn't

* [Backport 8.x] benchmark script (#17283)

This commit cherry-picked the missing benchmark script PRs
The deprecated artifacts-api is removed

[CI] benchmark uses the new artifacts-api (#17224)
[CI] benchmark readme (#16783)
Introduce a new flag to explicitly permit legacy monitoring (#16586) (Only take the benchmark script)
[ci] fix wrong queue type in benchmark marathon (#16465)
[CI] fix benchmark marathon (#16447)
[CI] benchmark dashboard and pipeline for testing against multiple versions (#16421)

* Fix pqcheck and pqrepair on Windows (#17210) (#17259)

A recent change to pqcheck attempted to address an issue where
pqcheck would not run on Windows machines when located in a folder containing
a space, such as "C:\program files\elastic\logstash". While this fixed an
issue with spaces in folders, it introduced a new issue related to Java options,
and the pqcheck was still unable to run on Windows.

This PR attempts to address the issue, by removing the quotes around the Java options,
which caused the option parsing to fail, and instead removes the explicit setting of
the classpath - the use of `set CLASSPATH=` in the `:concat` function is sufficient
to set the classpath, and should also fix the spaces issue

Fixes: #17209
(cherry picked from commit ba5f21576c)

Co-authored-by: Rob Bavey <rob.bavey@elastic.co>

* Shareable function for partitioning integration tests (#17223) (#17303)

For the fedramp high work https://github.com/elastic/logstash/pull/17038/files a
use case for multiple scripts consuming the partitioning functionality emerged.
As we look to more advanced partitioning we want to ensure that the
functionality will be consumable from multiple scripts.

See https://github.com/elastic/logstash/pull/17219#issuecomment-2698650296

(cherry picked from commit d916972877)

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>

* [8.x] Surface failures from nested rake/shell tasks (backport #17310) (#17317)

* Surface failures from nested rake/shell tasks (#17310)

Previously when rake would shell out the output would be lost. This
made debugging CI logs difficult. This commit updates the stack with
improved message surfacing on error.

(cherry picked from commit 0d931a502a)

# Conflicts:
#	rubyUtils.gradle

* Extend ruby linting tasks to handle file inputs (#16660)

This commit extends the gradle and rake tasks to pass through a list of files
for rubocop to lint. This allows more specificity and fine grained control for
linting when the consumer of the tasks only wishes to lint a select few files.

* Ensure shellwords library is loaded

Without this depending on task load order `Shellwords` may not be available.

---------

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>

* Forward Port of Release notes for `8.16.5` and `8.17.3` (#17187), (#17188) (#17266) (#17321)

* Forward Port of Release notes for 8.17.3 (#17187)

* Update release notes for 8.17.3

---------

Co-authored-by: logstashmachine <43502315+logstashmachine@users.noreply.github.com>
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>

* Forward Port of Release notes for 8.16.5 (#17188)

* Update release notes for 8.16.5

---------

Co-authored-by: logstashmachine <43502315+logstashmachine@users.noreply.github.com>
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: logstashmachine <43502315+logstashmachine@users.noreply.github.com>
(cherry picked from commit 63e8fd1d21)

Co-authored-by: Rob Bavey <rob.bavey@elastic.co>

* Add Deprecation tag to arcsight module (#17331)

* [8.x] Upgrade elasticsearch-ruby client. (backport #17161) (#17306)

* Upgrade elasticsearch-ruby client. (#17161)

* Fix Faraday removed basic auth option and apply the ES client module name change.

(cherry picked from commit e748488e4a)

* Apply the required changes in elasticsearch_client.rb after upgrading the elasticsearch-ruby client to 8.x

* Swallow the exception and make a non-connectable client when the ES client raises a connection refused exception.

---------

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: Mashhur <mashhur.sattorov@elastic.co>

* Removed unused configHash computation that can be replaced by PipelineConfig.configHash() (#17336) (#17345)

Removed the unused configHash computation in AbstractPipeline, which was used only in tests, and replaced it with a PipelineConfig.configHash() invocation.

(cherry picked from commit 787fd2c62f)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* Use org.logstash.common.Util to hashing by default to SHA256 (#17346) (#17352)

Removes the usage of Apache Commons Codec MessageDigest in favor of the internal Util class, which embodies hashing methods.

(cherry picked from commit 9c0e50faac)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* Added test to verify the int overflow happen (#17353) (#17354)

Use long instead of int type to keep the length of the first token.

The size limit validation requires to sum two integers, one with the length of the accumulated chars till now plus the next fragment head part. If any of the two sizes is close to the max integer it generates an overflow and could successfully fail the test 9c0e50faac/logstash-core/src/main/java/org/logstash/common/BufferedTokenizerExt.java (L123).

To hit this case, sizeLimit must be close to 2^31 bytes (2GB), and data fragments without any line delimiter must be pushed to the tokenizer with a total size close to 2^31 bytes.
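
The overflow can be reproduced outside Java with a small simulation. This Python sketch emulates 32-bit signed addition with `ctypes.c_int32`; the function names and the exact shape of the comparison are assumptions, not the code in `BufferedTokenizerExt`.

```python
import ctypes

INT_MAX = 2**31 - 1  # largest value of a Java int


def add_int32(a: int, b: int) -> int:
    """Add two values the way 32-bit signed arithmetic does (wraps on overflow)."""
    return ctypes.c_int32(a + b).value


def exceeds_limit_int32(accumulated: int, fragment: int, size_limit: int) -> bool:
    """Size check done with 32-bit arithmetic: the sum can wrap to a negative number."""
    return add_int32(accumulated, fragment) > size_limit


def exceeds_limit_wide(accumulated: int, fragment: int, size_limit: int) -> bool:
    """Same check with a wider type, as when using long instead of int."""
    return accumulated + fragment > size_limit


accumulated = INT_MAX - 10
fragment = 100
size_limit = INT_MAX - 1

print(add_int32(accumulated, fragment))                        # -2147483559
print(exceeds_limit_int32(accumulated, fragment, size_limit))  # False
print(exceeds_limit_wide(accumulated, fragment, size_limit))   # True
```

With 32-bit arithmetic the oversized input slips past the limit because the wrapped sum is negative; the wider type reports it correctly.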

(cherry picked from commit afde43f918)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* [8.x] add ci shared qualified-version script (backport #17311) (#17348)

* add ci shared qualified-version script (#17311)

* ci: add shareable script for generating qualified version

* ci: use shared script to generate qualified version

(cherry picked from commit 10b5a84f84)

# Conflicts:
#	.buildkite/scripts/dra/build_docker.sh

* resolve merge conflict

---------

Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com>

* tests: make integration split quantity configurable (#17219) (#17367)

* tests: make integration split quantity configurable

Refactors shared splitter bash function to take a list of files on stdin
and split into a configurable number of partitions, emitting only those from
the currently-selected partition to stdout.

Also refactors the only caller in the integration_tests launcher script to
accept an optional partition_count parameter (defaulting to `2` for backward-
compatibility), to provide the list of specs to the function's stdin, and to
output relevant information about the quantity of partition splits and which
was selected.
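
The partitioning behaviour can be sketched as below. This is a Python illustration, not the shared bash function: the round-robin assignment, the 0-based partition index, and the function name are assumptions, while the default of `2` partitions follows the description above.

```python
from typing import Iterable, List


def select_partition(specs: Iterable[str], index: int, count: int = 2) -> List[str]:
    """Return the specs that belong to partition `index` out of `count` partitions."""
    if count < 1 or not 0 <= index < count:
        raise ValueError("index must satisfy 0 <= index < count")
    ordered = sorted(specs)  # stable order so every runner agrees on the split
    return ordered[index::count]


specs = ["a_spec.rb", "b_spec.rb", "c_spec.rb", "d_spec.rb", "e_spec.rb"]
for part in range(3):
    print(part, select_partition(specs, part, count=3))
```

Every spec lands in exactly one partition, so running all partition indexes in parallel covers the full suite once.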

* ci: run integration tests in 3 parts

(cherry picked from commit 3e0f488df2)

Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com>

* Update buildkite with new patterns from 8.x

This commit updates the buildkite definitions to be compatible with the
upstream 8.x branch. Specifically:
 - Split integration tests for fips into 3 runners.
 - Use the new shared bash helper for computing QUALIFIED_VERSION

It also continues standardization of using a "fedrampHighMode" for indicating
the tests should be running in the context of our custom image for the SRE team.

* Bug fix: Actually use shared integration_tests.sh file

After refactoring to use the same script, I forgot to actually use it
in the buildkite definition...

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
Co-authored-by: Mashhur <mashhur.sattorov@elastic.co>

* Pin rubocop-ast development gem due to new dep on prism (#17407) (#17433)

The rubocop-ast gem just introduced a new dependency on prism.
 - https://rubygems.org/gems/rubocop-ast/versions/1.43.0

In our install default gem rake task we are seeing issues trying to build native
extensions. I see that in upstream jruby they are seeing a similar problem (at
least it is the same failure mode: https://github.com/jruby/jruby/pull/8415).

This commit pins rubocop-ast to 1.42.0 which is the last version that did not
have an explicit prism dependency.

(cherry picked from commit 6de59f2c02)

Co-authored-by: Cas Donoghue <cas.donoghue@gmail.com>

* Add age filter fedramp (#17434)

* net-zero-change refactor

* add logstash-filter-age to observabilitySRE artifact

* Add licenses for bouncycastle fips jars (#17406)

This commit adds licenses for bouncycastle jars that are added for the
observability SRE container artifact. It re-enables the previously disabled
license check and adds a new one running in fips mode.

* Publish Observability SRE images to internal container registry (#17401)

* POC for publishing observability SRE images

This commit adds a step to the pull_request_pipeline buildkite definition to
push a docker image to the elastic container registry. It is added here to show
that we have the proper creds etc in CI to push the container where it needs to
go. We will likely move this into the DRA pipeline once we are confident it is
pushing to the correct place with a naming convention that works for all
consumers/producers.

The general idea is to build the container with our gradle task, then once we
have that image we can tag it with the git sha and a "latest" identifier. This
would allow consumers to choose between an exact sha for a stream like 8.19.0 or
the "latest". I will also need to factor in the case where we have the tag
*without* the sha postfix. Obviously we will want to fold this in to the existing DRA
pipeline for building/staging images but for now it seems reasonable to handle
this separately.

* check variable resolution

* Move POC code into DRA pipeline

This commit takes the POC from the pull_request_pipeline and adds it to the DRA
pipeline. Notably, we take care to not disrupt anything about the existing DRA
pipeline by making this wait until after the artifacts are published and we set
a soft_fail. While this is being introduced and stabilized we want to ensure the
existing DRA pipeline continues to work without interruption. As we get more
stability we can look at a tighter integration.

* Disambiguate architectures

Eventually we will want to do proper annotations with manifests but for now
just add arch to the tag.

* Use docker manifest for multi-architecture builds

This commit refactors the POC pipeline for pushing observability SRE containers
to handle conflicts for tags based on target architectures. Cells with
respective architectures build containers and push to the container registry
with a unique identifier. Once those exist we introduce a separate step to use
the docker manifest command to annotate those images such that a container
client can download the correct image based on architecture. As a result for
every artifact there will be 2 images pushed (one for each arch) and N manifests
pushed. The manifests will handle the final naming that the consumer would
expect.
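
The two-stage flow (push one image per architecture, then publish a manifest list under the final name) might look like this. The sketch only builds the `docker manifest` command lines and does not run them; the repository name, the architecture suffixes, and the helper name are hypothetical.

```python
from typing import List, Sequence


def manifest_commands(repository: str, tag: str, arches: Sequence[str]) -> List[List[str]]:
    """Build the commands that group per-architecture images under one tag."""
    per_arch_images = [f"{repository}:{tag}-{arch}" for arch in arches]
    manifest = f"{repository}:{tag}"
    return [
        ["docker", "manifest", "create", manifest, *per_arch_images],
        ["docker", "manifest", "push", manifest],
    ]


commands = manifest_commands(
    "registry.example.com/logstash-observability-sre",  # hypothetical repository
    "8.19.0-SNAPSHOT",
    ["x86_64", "aarch64"],
)
for command in commands:
    print(" ".join(command))
```

Calling the helper once per final tag would yield the "2 images, N manifests" shape described above.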

* Refactor docker naming scheme

In order to follow more closely the existing tagging scheme this commit
refactors the naming for images to include the build sha BEFORE the SNAPSHOT
identifier. While this does not exactly follow the whole system that exists
today for container images in DRA it follows a pattern that is more similar.
Ideally we can iterate to fold handling of this container into DRA and in that
case consumers would not need to update their patterns for identifying images.
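
As an illustration of the ordering only, the tag could be assembled like this. The exact separator and format are assumptions (the real logic lives in the qualified-version script), and the function name is hypothetical.

```python
from typing import Optional


def qualified_tag(version: str, commit_id: Optional[str] = None, snapshot: bool = True) -> str:
    """Join version, optional commit id, and optional SNAPSHOT marker, in that order."""
    parts = [version]
    if commit_id:
        parts.append(commit_id)  # the build sha goes BEFORE the SNAPSHOT identifier
    if snapshot:
        parts.append("SNAPSHOT")
    return "-".join(parts)


print(qualified_tag("8.19.0", commit_id="0b1d29912a"))  # 8.19.0-0b1d29912a-SNAPSHOT
print(qualified_tag("8.19.0"))                          # 8.19.0-SNAPSHOT
```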

* Code review refactor

Rename INCLUDE_SHA to INCLUDE_COMMIT_ID in qualified-version script.
Confine use of this argument to individual invocations instead of at the top level in scripts.

* Build observabilitySRE containers after DRA is published

This gates build/push for observability SRE containers on success of DRA pipeline.

* x-pack: add fips validation plugin from x-pack (#16940)

* x-pack: add fips_validation plugin to be included in fips builds

The `logstash-integration-fips_validation` plugin provides no runtime
pipeline plugins, but instead provides hooks to ensure that the logstash
process is correctly configured for compliance with FIPS 140-3.

It is installed while building the observabilitySRE artifacts.

* fips validation: ensure BCFIPS,BCJSSE,SUN are first 3 security providers

* remove re-injection of BCFIPS jars

* Update lib/bootstrap/rubygems.rb

* add integration spec for fips_validation plugin

* add missing logstash_plugin helper

* fixup

* skip non-fips spec on fips-configured artifact, add spec details
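
The provider-ordering requirement from the list above (BCFIPS, BCJSSE, SUN as the first three security providers) reduces to a prefix check. This Python sketch works on a plain list of names; how the real plugin reads the JVM's providers is not shown in this log, and the function name is hypothetical.

```python
from typing import Sequence

REQUIRED_LEADING_PROVIDERS = ("BCFIPS", "BCJSSE", "SUN")


def has_required_provider_order(provider_names: Sequence[str]) -> bool:
    """True when the first three providers are exactly BCFIPS, BCJSSE, SUN."""
    leading = tuple(provider_names[: len(REQUIRED_LEADING_PROVIDERS)])
    return leading == REQUIRED_LEADING_PROVIDERS


print(has_required_provider_order(["BCFIPS", "BCJSSE", "SUN", "SunJCE"]))  # True
print(has_required_provider_order(["SUN", "BCFIPS", "BCJSSE"]))            # False
```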

* Improve smoke tests for observability SRE image (#17486)

* Improve smoke tests for observability SRE image

This commit adds a new rspec test to run the observability SRE container in a
docker compose network with filebeat and elasticsearch. It uses some simple test
data through a pipeline with plugins we expect to be used in production. The
rspec tests will ensure the test data is flowing from filebeat to logstash to
elasticsearch by querying elasticsearch for expected transformed data.

* REVERT ME: debug what's going on in CI :(

* Run filebeat container as root

* Work around strict file ownership perms for filebeat

We add the filebeat config in a volume, and the permission checks fail because the test
runner is not a root user. This commit disables that check in filebeat, which
seems to be the consensus solution online, for example: https://event-driven.io/en/tricks_on_how_to_set_up_related_docker_images/

* Dynamically generate PKI instead of checking it in

Instead of checking in PKI, dynamically generate it with gradle task for
starting containers and running the tests. This improvement avoids GitHub
warnings about checked-in keys and avoids expiration headaches. Generation is very
fast and does not add any significant overhead to test setup.

* Remove use of "should" in rspec docstrings

see https://github.com/rubocop/rspec-style-guide?tab=readme-ov-file#should-in-example-docstrings

* Ensure permissions readable for volume

Now that certs are dynamically generated, ensure they are able to be read in container

* Use elasticsearch-fips image for smoke testing

* Add git ignore for temp certs

* Fix naming convention for integration tests

Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com>

* Use parameter expansion for FEDRAMP_HIGH_MODE

Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com>

* Use parameter expansion for FEDRAMP_HIGH_MODE

Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com>

* Use parameter expansion for FEDRAMP_HIGH_MODE

Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com>

---------

Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
Co-authored-by: Mashhur <mashhur.sattorov@elastic.co>

NOTE: we decided to squash these commits as the feature branch had cherry-picks (and squashed change sets 182f15ebde) from 8.x which would potentially make the commit history confusing. We determined that the benefit of having individual commits from the feature branch was outweighed by the potentially confusing git history. This will also make porting this bit of work to other streams simpler.
2025-04-10 14:50:47 -07:00
mergify[bot]
d136e5066c
Fix JDK matrix pipeline after configurable it split (#17461) (#17510)
PR #17219 introduced configurable split quantities for IT tests, which
resulted in broken JDK matrix pipelines (e.g. as seen via the elastic
internal link:
https://buildkite.com/elastic/logstash-linux-jdk-matrix-pipeline/builds/444

reporting the following error

```
  File "/buildkite/builds/bk-agent-prod-k8s-1743469287077752648/elastic/logstash-linux-jdk-matrix-pipeline/.buildkite/scripts/jdk-matrix-tests/generate-steps.py", line 263
    def integration_tests(self, part: int, parts: int) -> JobRetValues:
    ^^^
SyntaxError: invalid syntax
There was a problem rendering the pipeline steps.
Exiting now.
```
)

This commit fixes the above problem, which was already fixed in #17462, in a more
idiomatic way.

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
(cherry picked from commit b9469e0726)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
2025-04-08 17:02:37 +03:00
mergify[bot]
ba488e9e0f
[Backport 8.x] Fix syntax in BK CI script (#17462) (#17463)
(cherry picked from commit 422cd4e06b)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
2025-04-01 13:22:27 +02:00
mergify[bot]
18bd4e290b
tests: make integration split quantity configurable (#17219) (#17367)
* tests: make integration split quantity configurable

Refactors shared splitter bash function to take a list of files on stdin
and split into a configurable number of partitions, emitting only those from
the currently-selected partition to stdout.

Also refactors the only caller in the integration_tests launcher script to
accept an optional partition_count parameter (defaulting to `2` for backward-
compatibility), to provide the list of specs to the function's stdin, and to
output relevant information about the quantity of partition splits and which
was selected.

* ci: run integration tests in 3 parts
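A minimal Python sketch of the stdin-style partitioning described in the first change, for illustration only. The real helper is a bash function; the name `select_partition`, the 1-based index, and the round-robin assignment are assumptions here, not the script's confirmed behavior.

```python
from typing import Iterable, List


def select_partition(specs: Iterable[str], index: int, count: int = 2) -> List[str]:
    """Return the specs that fall in partition `index` (1-based) out of `count`.

    Hypothetical stand-in for the shared splitter: specs are sorted so every
    CI job sees the same order, then dealt out round-robin.
    """
    if count < 1:
        raise ValueError("partition count must be at least 1")
    if not 1 <= index <= count:
        raise ValueError(f"partition index must be in 1..{count}")
    ordered = sorted(specs)
    return [spec for pos, spec in enumerate(ordered) if pos % count == index - 1]


# Example: five spec files split into 3 parts (file names are made up).
specs = ["a_spec.rb", "b_spec.rb", "c_spec.rb", "d_spec.rb", "e_spec.rb"]
for part in (1, 2, 3):
    print(f"part {part}/3:", select_partition(specs, part, 3))
```

Every spec lands in exactly one partition, so running all parts covers the full suite; the default of `2` mirrors the backward-compatible default mentioned above.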

(cherry picked from commit 3e0f488df2)

Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com>
2025-03-20 05:34:56 -07:00
mergify[bot]
520c205d78
[8.x] add ci shared qualified-version script (backport #17311) (#17348)
* add ci shared qualified-version script (#17311)

* ci: add shareable script for generating qualified version

* ci: use shared script to generate qualified version

(cherry picked from commit 10b5a84f84)

# Conflicts:
#	.buildkite/scripts/dra/build_docker.sh

* resolve merge conflict

---------

Co-authored-by: Rye Biesemeyer <yaauie@users.noreply.github.com>
2025-03-19 13:46:38 -07:00
kaisecheng
4fd13730ee
[Backport 8.x] benchmark script (#17283)
This commit cherry-picks the missing benchmark script PRs.
The deprecated artifacts-api is removed.

[CI] benchmark uses the new artifacts-api (#17224)
[CI] benchmark readme (#16783)
Introduce a new flag to explicitly permit legacy monitoring (#16586) (Only take the benchmark script)
[ci] fix wrong queue type in benchmark marathon (#16465)
[CI] fix benchmark marathon (#16447)
[CI] benchmark dashboard and pipeline for testing against multiple versions (#16421)
2025-03-07 00:16:29 +00:00
mergify[bot]
fcdda83f79
[CI] Health report integration tests use the new artifacts-api (#17274) (#17277)
migrate to the new artifacts-api

(cherry picked from commit feb2b92ba2)

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
2025-03-06 16:45:46 +00:00
github-actions[bot]
7fec532924
Add Windows 2025 to CI (#17133) (#17143)
This commit adds Windows 2025 to the Windows JDK matrix and exhaustive tests pipelines.

(cherry picked from commit 4d52b7258d)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
2025-02-24 16:49:19 +02:00
github-actions[bot]
e0aa026773
Allow capturing heap dumps in DRA BK jobs (#17081) (#17085)
This commit allows Buildkite to capture any heap dumps produced
during DRA builds.

(cherry picked from commit 78c34465dc)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
2025-02-14 09:37:13 +02:00
Dimitrios Liappis
c079236666
Fix conflicts (#17073) 2025-02-12 10:46:03 -08:00
Dimitrios Liappis
bc90250582
Backport 16907 to 8.x: Use --qualifier in release manager (#16907) (#16913)
This commit uses the new --qualifier parameter in the release manager
for publishing dra artifacts. Additionally, simplifies the expected
variables to rely on a simple `VERSION_QUALIFIER`.

Snapshot builds are skipped when VERSION_QUALIFIER is set.
Finally, to help test DRA PRs, we also allow passing the `DRA_BRANCH` option/env var
to override BUILDKITE_BRANCH.
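The variable handling described above can be sketched roughly as follows. This is an illustration under assumptions, not the DRA scripts themselves: `DraSettings` and `resolve_dra_settings` are hypothetical names, and only the three environment variables come from the commit message.

```python
from typing import Mapping, NamedTuple


class DraSettings(NamedTuple):
    branch: str           # branch the DRA build runs against
    qualifier: str        # empty string when no qualifier is set
    build_snapshot: bool  # snapshot builds are skipped for qualified versions


def resolve_dra_settings(env: Mapping[str, str]) -> DraSettings:
    """Derive DRA settings from the environment (hypothetical helper)."""
    # DRA_BRANCH, when set and non-empty, overrides BUILDKITE_BRANCH.
    branch = env.get("DRA_BRANCH") or env["BUILDKITE_BRANCH"]
    qualifier = env.get("VERSION_QUALIFIER", "")
    return DraSettings(branch, qualifier, build_snapshot=not qualifier)


# Example values are made up for illustration.
print(resolve_dra_settings({"BUILDKITE_BRANCH": "8.x"}))
print(resolve_dra_settings({"BUILDKITE_BRANCH": "my-pr", "DRA_BRANCH": "8.x",
                            "VERSION_QUALIFIER": "rc1"}))
```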

Closes https://github.com/elastic/ingest-dev/issues/4856

Backported from #16907 cherry picked from 9385cfac5a
2025-01-20 19:41:04 +02:00
github-actions[bot]
094724cd72
Increase Xmx used by JRuby during Rake execution to 4Gb (#16911) (#16914)
(cherry picked from commit 58e6dac94b)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
2025-01-20 17:51:07 +01:00
github-actions[bot]
b242715f76
[CI] Change agent for JDK availability check and add schedule also for 8.x (#16614) (#16617)
Switch execution agent of JDK availability check pipeline from vm-agent to container-agent.
Moves the schedule definition from the `Logstash Pipeline Scheduler` pipeline into the pipeline definition, adding a schedule also for `8.x` branch.

(cherry picked from commit c602b851bf)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
2024-10-30 12:52:03 +01:00
github-actions[bot]
51851a99d3
Use jvm catalog for reproducible builds and expose new pipeline to check JDK availability (#16602) (#16609)
Updates the existing `createElasticCatalogDownloadUrl` method to use the precise version retrieved from `versions.yml` to download the JDK, instead of the latest release of the major version. This makes the build reproducible again.
Defines a new Gradle `checkNewJdkVersion` task to check whether a new JDK version matching the current branch's major version is available from the JVM catalog.
Creates a new Buildkite pipeline that executes a `bash` script to run the Gradle task; it also updates `catalog-info.yaml` with the new pipeline and a trigger that runs every week.

(cherry picked from commit ed5874bc27)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
2024-10-30 12:15:08 +01:00
github-actions[bot]
ad7c61448f
Health api minor followups (#16533) (#16534)
* Utilize default agent for Health API CI. Call Python scripts directly from the CI step.

* Change BK agent to support both Java and python. Install pip manually and send env vars to subprocess.

(cherry picked from commit 4037adfc4a)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
2024-10-10 15:25:53 -07:00
Ry Biesemeyer
7eb5185b4e
Feature: health report api (#16520)
* [health] bootstrap HealthObserver from agent to API (#16141)

* [health] bootstrap HealthObserver from agent to API

* specs: mocked agent needs health observer

* add license headers

* Merge `main` into `feature/health-report-api` (#16397)

* Add GH vault plugin bot to allowed list (#16301)

* regenerate webserver test certificates (#16331)

* correctly handle stack overflow errors during pipeline compilation (#16323)

This commit improves error handling when pipelines that are too big hit the Xss limit and throw a StackOverflowError. Currently the exception is printed outside of the logger, and doesn’t even show if log.format is json, leaving the user to wonder what happened.

A couple of thoughts on the way this is implemented:

* There should be a first barrier to handle pipelines that are too large based on the PipelineIR compilation. The barrier would use the detection of Xss to determine how big a pipeline could be. This however doesn't reduce the need to still handle a StackOverflow if it happens.
* The catching of StackOverflowError could also be done on the WorkerLoop. However I'd suggest that this is unrelated to the Worker initialization itself, it just so happens that compiledPipeline.buildExecution is computed inside the WorkerLoop class for performance reasons. So I'd prefer logging to not come from the existing catch, but from a dedicated catch clause.

Solves #16320

* Doc: Reposition worker-utilization in doc (#16335)

* settings: add support for observing settings after post-process hooks (#16339)

Because logging configuration occurs after loading the `logstash.yml`
settings, deprecation logs from `LogStash::Settings::DeprecatedAlias#set` are
effectively emitted to a null logger and lost.

By re-emitting after the post-process hooks, we can ensure that they make
their way to the deprecation log. This change adds support for any setting
that responds to `Object#observe_post_process` to receive it after all
post-processing hooks have been executed.

Resolves: elastic/logstash#16332

* fix line used to determine ES is up (#16349)

* add retries to snyk buildkite job (#16343)

* Fix 8.13.1 release notes (#16363)

make a note of the fix that went to 8.13.1: #16026

Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>

* Update logstash_releases.json (#16347)

* [Bugfix] Resolve the array and char (single | double quote) escaped values of ${ENV} (#16365)

* Properly resolve the values from ENV vars if literal array string provided with ENV var.

* Docker acceptance test for persisting  keys and use actual values in docker container.

* Review suggestion.

Simplify the code by stripping whitespace before `gsub`, no need to check comma and split.

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

---------

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* Doc: Add SNMP integration to breaking changes (#16374)

* deprecate java less-than 17 (#16370)

* Exclude substitution refinement on pipelines.yml (#16375)

* Exclude substitution refinement on pipelines.yml (applies on ENV vars and logstash.yml where env2yaml saves vars)

* Safety integration test for pipeline config.string contains ENV .

* Doc: Forwardport 8.15.0 release notes to main (#16388)

* Removing 8.14 from ci/branches.json as we have 8.15. (#16390)

---------

Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Squashed merge from 8.x

* Failure injector plugin implementation. (#16466)

* Test purpose only failure injector integration (filter and output) plugins implementation. Add unit tests and include license notes.

* Fix the degrate method name typo.

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* Add explanation to the config params and rebuild plugin gem.

---------

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* Health report integration tests bootstrapper and initial tests implementation (#16467)

* Health Report integration tests bootstrapper and initial slow start scenario implementation.

* Apply suggestions from code review

Renaming expectation check method name.

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

* Changed to branch concept, YAML structure simplified as changed to Dict.

* Apply suggestions from code review

Reflect `help_url` to the integration test.

---------

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

* health api: expose `GET /_health_report` with pipelines/*/status probe (#16398)

Adds a `GET /_health_report` endpoint with per-pipeline status probes, and wires the
resulting report status into the other API responses, replacing their hard-coded `green`
with a meaningful status indication.

---------

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* docs: health report API, and diagnosis links (feature-targeted) (#16518)

* docs: health report API, and diagnosis links

* Remove plus-for-passthrough markers

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

---------

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* merge 8.x into feature branch... (#16519)

* Add GH vault plugin bot to allowed list (#16301)

* regenerate webserver test certificates (#16331)

* correctly handle stack overflow errors during pipeline compilation (#16323)

This commit improves error handling when pipelines that are too big hit the Xss limit and throw a StackOverflowError. Currently the exception is printed outside of the logger, and doesn’t even show if log.format is json, leaving the user to wonder what happened.

A couple of thoughts on the way this is implemented:

* There should be a first barrier to handle pipelines that are too large based on the PipelineIR compilation. The barrier would use the detection of Xss to determine how big a pipeline could be. This however doesn't reduce the need to still handle a StackOverflow if it happens.
* The catching of StackOverflowError could also be done on the WorkerLoop. However I'd suggest that this is unrelated to the Worker initialization itself, it just so happens that compiledPipeline.buildExecution is computed inside the WorkerLoop class for performance reasons. So I'd prefer logging to not come from the existing catch, but from a dedicated catch clause.

Solves #16320

* Doc: Reposition worker-utilization in doc (#16335)

* settings: add support for observing settings after post-process hooks (#16339)

Because logging configuration occurs after loading the `logstash.yml`
settings, deprecation logs from `LogStash::Settings::DeprecatedAlias#set` are
effectively emitted to a null logger and lost.

By re-emitting after the post-process hooks, we can ensure that they make
their way to the deprecation log. This change adds support for any setting
that responds to `Object#observe_post_process` to receive it after all
post-processing hooks have been executed.

Resolves: elastic/logstash#16332

* fix line used to determine ES is up (#16349)

* add retries to snyk buildkite job (#16343)

* Fix 8.13.1 release notes (#16363)

make a note of the fix that went to 8.13.1: #16026

Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>

* Update logstash_releases.json (#16347)

* [Bugfix] Resolve the array and char (single | double quote) escaped values of ${ENV} (#16365)

* Properly resolve the values from ENV vars if literal array string provided with ENV var.

* Docker acceptance test for persisting  keys and use actual values in docker container.

* Review suggestion.

Simplify the code by stripping whitespace before `gsub`, no need to check comma and split.

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

---------

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* Doc: Add SNMP integration to breaking changes (#16374)

* deprecate java less-than 17 (#16370)

* Exclude substitution refinement on pipelines.yml (#16375)

* Exclude substitution refinement on pipelines.yml (applies on ENV vars and logstash.yml where env2yaml saves vars)

* Safety integration test for pipeline config.string contains ENV .

* Doc: Forwardport 8.15.0 release notes to main (#16388)

* Removing 8.14 from ci/branches.json as we have 8.15. (#16390)

* Increase Jruby -Xmx to avoid OOM during zip task in DRA (#16408)

Fix: #16406

* Generate Dataset code with meaningful fields names (#16386)

This PR is intended to help Logstash developers or users that want to better understand the code that's autogenerated to model a pipeline, assigning more meaningful names to the Datasets subclasses' fields.

Updates `FieldDefinition` to receive the name of the field from construction methods, so that it can be used during the code generation phase, instead of the existing incremental `field%n`.
Updates `ClassFields` to propagate the explicit field name down to the `FieldDefinitions`.
Updates the `DatasetCompiler`, which adds fields to `ClassFields`, to assign a proper name to the generated Dataset's fields.

* Implements safe evaluation of conditional expressions, logging the error without killing the pipeline (#16322)

This PR protects `if` statements against expression evaluation errors: the event being processed is cancelled and the error is logged.
This avoids crashing a pipeline that encounters a runtime error during event condition evaluation, and makes it possible to debug the root cause by reporting the offending event and removing it from the current processing batch.

Translates the `org.jruby.exceptions.TypeError`, `IllegalArgumentException`, and `org.jruby.exceptions.ArgumentError` that could happen during `EventCondition` evaluation into a custom `ConditionalEvaluationError`, which bubbles up through the AST nodes. It is caught in the `SplitDataset` node.
Updates the generation of the `SplitDataset` so that the execution of the `filterEvents` method inside the compute body is guarded by a try-catch and defers to an instance of `AbstractPipelineExt.ConditionalEvaluationListener` to handle such errors. In this particular case the error handling consists of just logging the offending event.


---------

Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>

* Update logstash_releases.json (#16426)

* Release notes for 8.15.1 (#16405) (#16427)

* Update release notes for 8.15.1

* update release note

---------

Co-authored-by: logstashmachine <43502315+logstashmachine@users.noreply.github.com>
Co-authored-by: Kaise Cheng <kaise.cheng@elastic.co>
(cherry picked from commit 2fca7e39e8)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fix ConditionalEvaluationError to not include the event that errored in its serialized form, because it's not expected that this class is ever serialized. (#16429) (#16430)

Make the inner field of ConditionalEvaluationError transient so that it is skipped during serialization.

(cherry picked from commit bb7ecc203f)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* use gnu tar compatible minitar to generate tar artifact (#16432) (#16434)

Using VERSION_QUALIFIER when building the tarball distribution will fail since Ruby's TarWriter implements the older POSIX88 version of tar and paths will be longer than 100 characters.

For the long paths being used in Logstash's plugins, mainly due to nested folders from jar-dependencies, we need the tarball to follow either the 2001 ustar format or gnu tar, which is implemented by the minitar gem.

(cherry picked from commit 69f0fa54ca)

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* account for the 8.x in DRA publishing task (#16436) (#16440)

the current DRA publishing task computes the branch from the version
contained in the version.yml

This is done by taking the major.minor and confirming that a branch
exists with that name.

However this pattern won't be applicable for 8.x, as that branch
currently points to 8.16.0 and there is no 8.16 branch.

This commit falls back to reading the buildkite injected
BUILDKITE_BRANCH variable.

(cherry picked from commit 17dba9f829)

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* Fixes the issue where LS wipes out all quotes from docker env variables. (#16456) (#16459)

* Fixes the issue where LS wipes out all quotes from docker env variables. This is an issue when running LS on docker with CONFIG_STRING, needs to keep quotes with env variable.

* Add a docker acceptance integration test.

(cherry picked from commit 7c64c7394b)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Known issue for 8.15.1 related to env vars references (#16455) (#16469)

(cherry picked from commit b54caf3fd8)

Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>

* bump .ruby_version to jruby-9.4.8.0 (#16477) (#16480)

(cherry picked from commit 51cca7320e)

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* Release notes for 8.15.2 (#16471) (#16478)

Co-authored-by: andsel <selva.andre@gmail.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
(cherry picked from commit 01dc76f3b5)

* Change LogStash::Util::SubstitutionVariables#replace_placeholders refine argument to optional (#16485) (#16488)

(cherry picked from commit 8368c00367)

Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com>

* Use jruby-9.4.8.0 in exhaustive CIs. (#16489) (#16491)

(cherry picked from commit fd1de39005)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Don't use an older JRuby with oraclelinux-7 (#16499) (#16501)

A recent PR (elastic/ci-agent-images/pull/932) modernized the VM images
and removed JRuby 9.4.5.0 and some older versions.

This ended up breaking exhaustive test on Oracle Linux 7 that hard coded
JRuby 9.4.5.0.

PR https://github.com/elastic/logstash/pull/16489 worked around the
problem by pinning to the new JRuby, but actually we don't
need the conditional anymore since the original issue
https://github.com/jruby/jruby/issues/7579#issuecomment-1425885324 has
been resolved and none of our releasable branches (apart from 7.17 which
uses `9.2.20.1`) specify `9.3.x.y` in `/.ruby-version`.

Therefore, this commit removes conditional setting of JRuby for
OracleLinux 7 agents in exhaustive tests (and relies on whatever
`/.ruby-version` defines).

(cherry picked from commit 07c01f8231)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>

* Improve pipeline bootstrap error logs (#16495) (#16504)

This PR adds the details of the causing errors to the pipeline converge state error logs.

(cherry picked from commit e84fb458ce)

Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com>

* Logstash Health Report Tests Buildkite pipeline setup. (#16416) (#16511)

(cherry picked from commit 5195332bc6)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Make health report test runner script executable. (#16446) (#16512)

(cherry picked from commit 2ebf2658ff)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Backport PR #16423 to 8.x: DLQ-ing events that trigger a conditional evaluation error. (#16493)

* DLQ-ing events that trigger a conditional evaluation error. (#16423)

When a conditional evaluation encounters an error in the expression, the event that triggered the issue is sent to the pipeline's DLQ, if enabled for the executing pipeline.

This PR builds on the work done in #16322: the `ConditionalEvaluationListener`, which receives notifications about if-statement evaluation failures, is improved to also send the event to the DLQ (if enabled in the pipeline) and not just log it.

(cherry picked from commit b69d993d71)

* Fixed warning about non serializable field DeadLetterQueueWriter in serializable AbstractPipelineExt

---------

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* add deprecation log for `--event_api.tags.illegal` (#16507) (#16515)

- move `--event_api.tags.illegal` from option to deprecated_option
- add deprecation log when the flag is explicitly used
relates: #16356

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
(cherry picked from commit a4eddb8a2a)

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

---------

Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>
Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com>
Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>

---------

Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>
Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com>
Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
2024-10-09 09:48:12 -07:00
github-actions[bot]
3b751d9794
Make health report test runner script executable. (#16446) (#16512)
(cherry picked from commit 2ebf2658ff)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
2024-10-05 07:32:15 -07:00
github-actions[bot]
2f3f6a9651
Logstash Health Report Tests Buildkite pipeline setup. (#16416) (#16511)
(cherry picked from commit 5195332bc6)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
2024-10-05 07:30:51 -07:00
github-actions[bot]
476b9216f2
Don't use an older JRuby with oraclelinux-7 (#16499) (#16501)
A recent PR (elastic/ci-agent-images/pull/932) modernized the VM images
and removed JRuby 9.4.5.0 and some older versions.

This ended up breaking exhaustive test on Oracle Linux 7 that hard coded
JRuby 9.4.5.0.

PR https://github.com/elastic/logstash/pull/16489 worked around the
problem by pinning to the new JRuby, but actually we don't
need the conditional anymore since the original issue
https://github.com/jruby/jruby/issues/7579#issuecomment-1425885324 has
been resolved and none of our releasable branches (apart from 7.17 which
uses `9.2.20.1`) specify `9.3.x.y` in `/.ruby-version`.

Therefore, this commit removes conditional setting of JRuby for
OracleLinux 7 agents in exhaustive tests (and relies on whatever
`/.ruby-version` defines).

(cherry picked from commit 07c01f8231)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
2024-10-02 19:57:58 +03:00
github-actions[bot]
2c024daecd
Use jruby-9.4.8.0 in exhaustive CIs. (#16489) (#16491)
(cherry picked from commit fd1de39005)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
2024-10-02 09:33:31 +01:00
github-actions[bot]
d2b19001de
account for the 8.x in DRA publishing task (#16436) (#16440)
the current DRA publishing task computes the branch from the version
contained in the version.yml

This is done by taking the major.minor and confirming that a branch
exists with that name.

However this pattern won't be applicable for 8.x, as that branch
currently points to 8.16.0 and there is no 8.16 branch.

This commit falls back to reading the buildkite injected
BUILDKITE_BRANCH variable.
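A rough sketch of the branch resolution with the fallback described above. The function name `resolve_release_branch` and the injected `branch_exists` callback are hypothetical; the real task may check for the branch differently.

```python
from typing import Callable, Mapping


def resolve_release_branch(version: str,
                           branch_exists: Callable[[str], bool],
                           env: Mapping[str, str]) -> str:
    """Pick the DRA branch for `version` (hypothetical helper).

    First try the `major.minor` branch; if it does not exist (as with 8.x,
    which carries 8.16.0 without an 8.16 branch), fall back to the
    Buildkite-injected BUILDKITE_BRANCH.
    """
    major, minor = version.split(".")[:2]
    candidate = f"{major}.{minor}"
    if branch_exists(candidate):
        return candidate
    return env["BUILDKITE_BRANCH"]


# Example: the set of existing branches is made up for illustration.
existing = {"8.14", "8.15"}
print(resolve_release_branch("8.15.2", existing.__contains__, {"BUILDKITE_BRANCH": "8.15"}))
print(resolve_release_branch("8.16.0", existing.__contains__, {"BUILDKITE_BRANCH": "8.x"}))
```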

(cherry picked from commit 17dba9f829)

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
2024-09-10 10:56:27 +01:00
kaisecheng
6e93b30c7f
Increase Jruby -Xmx to avoid OOM during zip task in DRA (#16408)
Fix: #16406
2024-08-28 11:10:21 +01:00
João Duarte
629d8fe5a8
add retries to snyk buildkite job (#16343) 2024-07-29 12:00:43 +01:00
ev1yehor
e065088cd8
Add GH vault plugin bot to allowed list (#16301) 2024-07-16 14:38:56 +03:00
Dimitrios Liappis
f728c44a0a
Remove Debian 10 from CI (#16300)
This commit removes Debian 10 (Buster) which is EOL
since July 1 2024[^1] from CI.

Relates https://github.com/elastic/ingest-dev/issues/2872
2024-07-10 15:17:10 +03:00
kaisecheng
2404bad9a9
[CI] fix benchmark to pull snapshot version (#16308)
- fixes the CI benchmark script to always run against the latest snapshot version
- uses `/v1/versions/$VERSION/builds/latest` to get the latest build id

Fixes: #16307

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
2024-07-08 22:20:59 +01:00
Dimitrios Liappis
ea0c16870f
Add Ubuntu 24.04 to CI (#16299)
Now that we have custom VM images for Ubuntu 24.04, this commit adds
CI for Ubuntu 24.04.

This is a revert of #16279
2024-07-08 14:43:55 +03:00
Dimitrios Liappis
db06ec415a
Remove CentOS 7 from CI (#16293)
CentOS 7 is EOL since June 30 2024[^1]. All repositories and mirrors are
now unreachable.

This commit removes CentOS 7 from CI jobs using it.

Relates https://github.com/elastic/ingest-dev/issues/3520

[^1]: https://www.redhat.com/en/topics/linux/centos-linux-eol
2024-07-04 14:13:16 +03:00
João Duarte
a046d3f273
Revert "add ubuntu 24.04 to CI (#16263)" (#16279)
This reverts commit a0bcd61ad3.
2024-07-02 17:45:50 +01:00
João Duarte
a0bcd61ad3
add ubuntu 24.04 to CI (#16263) 2024-07-02 14:34:58 +01:00
Dimitrios Liappis
7080ec5427
Add retries to aarch64 CI pipeline (#16271)
Add retries in the aarch64 CI pipeline to reduce noise from transient
network failures.

Closes https://github.com/elastic/ingest-dev/issues/3510
2024-07-01 12:49:26 +03:00
João Duarte
0e1d67eda9
produce wolfi docker image in ci (#16252) 2024-06-26 13:50:47 +01:00
kaisecheng
440aa98e48
[CI] Benchmark pipeline (#16191)
Add a Buildkite pipeline to run benchmarks.
The script benchmarks by running Filebeats (docker) -> Logstash (docker) -> ES Cloud.
Logstash metrics and benchmark results are sent to the same ES Cloud.
- Secrets are stored in vault at `secret/ci/elastic-logstash/benchmark`
- Use flog (docker) to generate ~2GB logs
- Pull the snapshot docker image of the main branch every day
- Logstash runs two pipelines, main and node_stats
  - The main pipeline handles beats ingestion, sending data to the data stream `logs-generic-default`
    - It runs for all combinations. (pq + mq) x worker x batch size
    - Each test runs for ~7 minutes
  - The node_stats pipeline retrieves /_node/stats API every 30s and sends it to the data stream `metrics-nodestats-logstash`
- The script sends a summary of EPS and resource usage to index `benchmark_summary`

The buildkite pipeline accepts ENV variables to customize the test
| Variable Name   | Default Value       | Comment                                            |
|-----------------|---------------------|----------------------------------------------------|
| FB_VERSION      | 8.13.4              | docker tag                                         |
| LS_VERSION      |                     | docker tag                                         |
| LS_JAVA_OPTS    | -Xmx2g              | by default, Xmx is set to half of memory           |
| MULTIPLIERS     | 2,4,6               | determine the number of workers (cpu * multiplier) |
| BATCH_SIZES     | 125,1000            |                                                    |
| CPU             | 4                   | number of cpu for Logstash container               |
| MEM             | 4                   | number of GB for Logstash container                |
| QTYPE           | memory              | queue type to test -- persisted; memory; all       |
| FB_CNT          | 4                   | number of filebeats to use in benchmark            |
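To make the size of the test matrix concrete, here is a small sketch that enumerates the combinations from the variables in the table. It is an illustration only: `benchmark_matrix` is a hypothetical name, and the mapping of `QTYPE=all` to both queue types is inferred from the table comment.

```python
from itertools import product
from typing import List, Sequence, Tuple


def benchmark_matrix(cpu: int = 4,
                     multipliers: Sequence[int] = (2, 4, 6),
                     batch_sizes: Sequence[int] = (125, 1000),
                     qtype: str = "memory") -> List[Tuple[str, int, int]]:
    """List (queue type, workers, batch size) combinations (hypothetical helper)."""
    # QTYPE "all" is assumed to mean both persisted and memory queues.
    queue_types = ["persisted", "memory"] if qtype == "all" else [qtype]
    # Worker count is cpu * multiplier, per the MULTIPLIERS comment.
    workers = [cpu * m for m in multipliers]
    return list(product(queue_types, workers, batch_sizes))


default_runs = benchmark_matrix()
print(len(default_runs))                   # 6
print(default_runs[0])                     # ('memory', 8, 125)
print(len(benchmark_matrix(qtype="all")))  # 12
```

With the defaults this gives 6 runs; at roughly 7 minutes per test, that is about 42 minutes, or about 84 minutes with `QTYPE=all`.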

To check the result
- `vault read secret/ci/elastic-logstash/benchmark` to get the host and credentials
- `curl -u "$ES_USER:$ES_PW" "$ES_HOST/benchmark_summary/_search"`

Fixes: https://github.com/elastic/ingest-dev/issues/3377
2024-06-21 22:48:34 +01:00
ev1yehor
0d385a9611
Update pull-requests.json (#16220) 2024-06-20 13:52:35 +03:00
João Duarte
1484614405
Wolfi-based image flavor (#16189)
* Add wolfi as an option to the build process
* Add docker acceptance tests for the wolfi image
* Change how tests are done on the java process, due to "ps -C" not being available on wolfi

replaces and closes https://github.com/elastic/logstash/pull/16116

Co-authored-by: Andres Rodriguez <andreserl@gmail.com>
2024-06-17 15:48:02 +01:00
kaisecheng
1d4038b27f
Add initial buildkite pipeline for Benchmark (#16190)
skeleton pipeline for benchmark
2024-05-31 15:17:50 +01:00
Mashhur
4a379be6d5
Fix the git branch check for snyk bk jobs (#16062)
* Replace 'git show-ref' with 'git rev-parse' to fix the issue where show-ref is not working as expected.
* Use git checkout instead of 'git rev-parse'.
* Apply prune dependencies recommended for big projects (like we have multi gradle projects) by Snyk.
* Apply prune repeated dependency option directly to snyk monitor.
* Avoid the exit, continue scanning to the end.
* Remove the debugging.
2024-04-08 11:34:26 +01:00
Dimitrios Liappis
c0c213d17e
Split java/ruby unit test steps on Windows (#15888)
As a follow up to #15861 this commit splits the current unit tests step
for the Windows JDK matrix pipeline to two that run
Java and Ruby unit tests separately.

Closes https://github.com/elastic/logstash/issues/15566
2024-03-11 09:27:11 +02:00
Andres Rodriguez
8eb08e1382
Add Alma 8, Alma 9, and Rocky Linux 9 to the JDK matrix (#15941)
2024-02-13 11:01:21 -05:00
Dimitrios Liappis
2fc3f4c21f
Add retries to acceptance/docker steps in BK (#15901)
Similarly to #15874, this commit adds retries
to another group, acceptance/docker, to reduce
build noise from transient issues.
2024-02-06 15:10:13 +02:00
Dimitrios Liappis
fedcf58c48
Add Debian 12 to CI (#15895)
This commit adds Debian 12 (Bookworm) to the
Linux JDK matrix pipeline and to the Compat Phase of the
exhaustive pipeline.

Relates https://github.com/elastic/ingest-dev/issues/2871
2024-02-05 18:49:30 +02:00
Dimitrios Liappis
8ac55184b8
Allow running Java+Ruby tests on Windows separately (#15861)
This commit allows running Java and Ruby tests separately on Windows via a CLI argument, the same way as we currently do on Unix (unit_tests.sh).
If no argument is supplied, both test suites are run (the current behavior).
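The dispatch rule described above can be sketched as follows. This is an illustration of the behavior only, not the PowerShell wrapper itself; the function name and the argument values `java` and `ruby` are assumptions.

```python
def select_suites(argv):
    """Decide which unit test suites to run from the CLI arguments.

    Assumed convention (hypothetical): the first argument names a single
    suite; no argument means both suites are run.
    """
    if not argv:
        # No argument supplied: run both suites.
        return ["java", "ruby"]
    choice = argv[0].lower()
    if choice in ("java", "ruby"):
        return [choice]
    raise ValueError(f"unknown test suite: {argv[0]!r}")
```

Rejecting unknown values rather than silently running both suites keeps a mistyped argument from hiding which tests actually ran.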

The wrapper script is also rewritten from an old batch-style script to PowerShell.

This work allows us to split the existing Windows CI job in a subsequent PR to separate steps, as we currently do on Linux.

Relates: https://github.com/elastic/logstash/issues/15566
2024-02-01 10:04:25 +02:00
Dimitrios Liappis
3b747d86b8
Add retries to JDK matrix pipeline steps (#15877)
This commit adds retries to the steps of the Linux and Windows JDK matrix
pipelines to avoid notification noise due to transient network
errors.
2024-01-30 18:02:57 +02:00
Dimitrios Liappis
88a32cca81
Add BK retries to exhaustive/compat steps (#15874)
As a follow up to #15787 we also add Buildkite retries for the
exhaustive pipeline / compatibility group steps to prevent
failures due to flakiness.
2024-01-30 14:33:33 +02:00
Dimitrios Liappis
5ee75803f8
Scheduled runs of exhaustive and aarch64 pipelines (#15850)
This commit adds a schedule to run the exhaustive pipeline
(biweekly, every other Wednesday at 2 AM UTC) and the aarch64 pipeline
(weekly, every Monday at 2 AM UTC).
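To make the weekly cadence concrete, here is a small sketch that computes the next Monday 02:00 UTC run from a given moment. It only illustrates the timing; how Buildkite evaluates the schedule is not covered here, and the function name is hypothetical. The biweekly case is left out because it depends on which Wednesday the schedule is anchored to.

```python
from datetime import datetime, timedelta, timezone


def next_weekly_run(now, weekday=0, hour=2):
    """Return the next run time strictly after `now`.

    `weekday` follows datetime.weekday() (Monday is 0); `hour` is in UTC.
    `now` is expected to be a timezone-aware UTC datetime.
    """
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    # Move forward to the requested weekday (0 days if today already matches).
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # Today's slot has already passed, so take next week's.
        candidate += timedelta(days=7)
    return candidate
```

For example, calling it with `datetime(2024, 1, 28, 19, 30, tzinfo=timezone.utc)`, a Sunday evening, yields the run on Monday 2024-01-29 at 02:00 UTC.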

Closes https://github.com/elastic/ingest-dev/issues/2852
2024-01-28 19:30:02 +02:00
Dimitrios Liappis
3f5b44a1ad
Remove Ubuntu 18.04 from CI jobs (#15855)
Relates https://github.com/elastic/ingest-dev/issues/2849
2024-01-26 17:14:41 +02:00
Dimitrios Liappis
0d808ed708
Retries for serverless-integration-testing pipeline (#15851)
This commit adds up to 3 retries for all steps of the `serverless-integration-testing`
pipeline as a stop-gap measure to prevent network-related transient failures.
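The effect of an "up to 3 retries" policy can be sketched in plain Python. This is an illustration of the semantics only, not the pipeline configuration; the function name is hypothetical, and it assumes a retry is triggered by any failure of the step.

```python
def run_with_retries(step, max_retries=3):
    """Run `step` once, retrying up to `max_retries` more times on failure.

    Returns a (result, attempts) tuple. If the step still fails after the
    last retry, the final exception is re-raised.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return step(), attempts
        except Exception:
            # attempts counts the initial run plus retries so far.
            if attempts > max_retries:
                raise
```

With `max_retries=3` a step runs at most four times in total, so a failure that clears within the first three retries does not fail the build.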
2024-01-25 17:24:00 +02:00
Dimitrios Liappis
c5cb1fe2ed
Annotate successful DRA builds with summary URL (#15820)
This commit makes the generated DRA URL easily accessible via
a Buildkite annotation.

Closes https://github.com/elastic/ingest-dev/issues/2608
2024-01-22 16:37:18 +02:00