Commit graph

1625 commits

Author SHA1 Message Date
João Duarte
e36cacedc8
ensure inputSize state value is reset during buftok.flush (#16760) 2024-12-09 09:10:54 -08:00
Ry Biesemeyer
202d07cbbf
ensure jackson overrides are available to static initializers (#16719)
Moves the application of jackson defaults overrides into pure java, and
applies them statically _before_ the `org.logstash.ObjectMappers` has a chance
to start initializing object mappers that rely on the defaults.

We replace the runner's invocation (which was too late to be fully applied) with
a _verification_ that the configured defaults have been applied.
2024-12-04 14:27:26 -08:00
Rob Bavey
e3265d93e8
Pin jar-dependencies to 0.4.1 (#16747)
Pin jar-dependencies to `0.4.1`, until https://github.com/jruby/jruby/issues/7262
is resolved.
2024-12-03 17:35:00 -05:00
Cas Donoghue
eb7e1253e0
Revert "Bugfix for BufferedTokenizer to completely consume lines in case of lines bigger then sizeLimit (#16482)" (#16715)
This reverts commit 85493ce864.
2024-11-21 09:18:59 -08:00
github-actions[bot]
0dd64a9d63
PipelineBusV2 deadlock proofing (#16671) (#16682)
* pipeline bus: add deadlock test for unlisten/unregisterSender

* pipeline bus: eliminate deadlock

Moves the sync-to-notify out of the `AddressStateMapping#mutate`'s effective
synchronous block to eliminate a race condition where unlistening to an address
and unregistering a sender could deadlock.

It is safe to notify an AddressState's attached input without exclusive access
to the AddressState, because notifying an input that has since been disconnected
is net zero harm.

(cherry picked from commit 8af6343a26)

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
2024-11-20 13:26:13 -08:00
Andrea Selva
d4fb06e498
Introduce a new flag to explicitly permit legacy monitoring (#16586)
Introduce a new flag setting `xpack.monitoring.allow_legacy_collection` which eventually enable the legacy monitoring collector.

Update the method to test if monitoring is enabled so that consider also `xpack.monitoring.allow_legacy_collection` to determine if `monitoring.*` settings are valid or not.
By default it's false, the user has to intentionally enable it to continue to use the legacy monitoring settings.


---------

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
2024-11-19 08:52:28 +01:00
Andrea Selva
a94659cf82
Default buffer type to 'heap' for 9.0 (#16500)
Switch the default value of `pipeline.buffer.type` to use the heap memory instead of direct one.

Change the default value of the setting `pipeline.buffer.type` from direct to heap and update consequently the documentation.

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
2024-11-13 16:58:56 +01:00
kaisecheng
5847d77331
skip allow_superuser in Windows OS (#16629)
As user id is always zero in Windows,
this commit excluded the checking of running as root in Windows.
2024-11-05 15:37:19 +00:00
kaisecheng
db59cd0fbd
set allow_superuser to false as default (#16558)
- set allow_superuser as false by default for v9
- change the buildkite image of ruby unit test to non-root
2024-10-31 13:33:00 +00:00
João Duarte
ca19f0029e
make max inflight warning global to all pipelines (#16597)
The current max inflight error message focuses on a single pipeline and on a maximum amount of 10k events regardless of the heap size.

The new warning will take into account all loaded pipelines and also consider the heap size, giving a warning if the total number of events consumes 10% or more of the total heap.

For the purpose of the warning events are assumed to be 2KB as it a normal size for a small log entry.
2024-10-25 14:44:33 +01:00
kaisecheng
566bdf66fc
remove http.* settings (#16552)
The commit removes the deprecated settings http.port, http.host, http.enabled, and http.environment
2024-10-18 12:15:38 +01:00
kaisecheng
467ab3f44b
Enable log.format.json.fix_duplicate_message_fields by default (#16578)
Set `log.format.json.fix_duplicate_message_fields` to `true` as default
to avoid collision of the message field in log lines when log.format is JSON
2024-10-17 16:22:30 +01:00
kaisecheng
3f0ad12d06
add http.* deprecation log (#16538)
- refactor deprecated alias to support obsoleted version
- add deprecation log for http.* config
2024-10-17 14:07:51 +01:00
Andrea Selva
b6f16c8b81
Adds a JMH benchmark to test BufferedTokenizerExt class (#16564)
Adds a JMH benchmark to measure the peformances of BufferedTokenizerExt.
Update also Gradle build script to remove CMS GC flags and fix deprecations for Gradle 9.0.

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
2024-10-16 16:55:52 +02:00
Andrea Selva
85493ce864
Bugfix for BufferedTokenizer to completely consume lines in case of lines bigger then sizeLimit (#16482)
Fixes the behaviour of the tokenizer to be able to work properly when buffer full conditions are met.

Updates BufferedTokenizerExt so that can accumulate token fragments coming from different data segments. When a "buffer full" condition is matched, it record this state in a local field so that on next data segment it can consume all the token fragments till the next token delimiter.
Updated the accumulation variable from RubyArray containing strings to a StringBuilder which contains the head token, plus the remaining token fragments are stored in the input array.
Furthermore it translates the `buftok_spec` tests into JUnit tests.
2024-10-16 16:48:25 +02:00
João Duarte
ab77d36daa
ensure minitar 1.x is used instead of 0.x (#16565) 2024-10-16 12:37:35 +01:00
kaisecheng
cb3b7c01dc
Remove event_api.tags.illegal for v9 (#16461)
This commit removed `--event_api.tags.illegal` option
Fix: #16356
2024-10-15 22:27:36 +01:00
kaisecheng
8cd0fa8767
refactor log for event_api.tags.illegal (#16545)
- add `obsoleted_version` and remove `deprecated_msg` from `deprecated_option` for consistent warning message
2024-10-11 21:22:29 +01:00
Ry Biesemeyer
a931b2cde6
Flow worker utilization probe (#16532)
* flow: refactor pipeline refs to keep worker flows separate

* health: add worker_utilization probe

pipeline is:
  - RED "completely blocked" when last_5_minutes >= 99.999
  - YELLOW "nearly blocked" when last_5_minutes > 95
    - and inludes "recovering" info when last_1_minute < 80
  - YELLOW "completely blocked" when last_1_minute >= 99.999
  - YELLOW "nearly blocked" when last_1_minute > 95

* tests: improve coverage of PipelineIndicator probes

* Apply suggestions from code review
2024-10-10 17:56:22 -07:00
Ry Biesemeyer
065769636b
health: add logstash.forceApiStatus: green escape hatch (#16535) 2024-10-10 16:35:06 -07:00
github-actions[bot]
7f7af057f0
Feature: health report api (#16520) (#16523)
* [health] bootstrap HealthObserver from agent to API (#16141)

* [health] bootstrap HealthObserver from agent to API

* specs: mocked agent needs health observer

* add license headers

* Merge `main` into `feature/health-report-api` (#16397)

* Add GH vault plugin bot to allowed list (#16301)

* regenerate webserver test certificates (#16331)

* correctly handle stack overflow errors during pipeline compilation (#16323)

This commit improves error handling when pipelines that are too big hit the Xss limit and throw a StackOverflowError. Currently the exception is printed outside of the logger, and doesn’t even show if log.format is json, leaving the user to wonder what happened.

A couple of thoughts on the way this is implemented:

* There should be a first barrier to handle pipelines that are too large based on the PipelineIR compilation. The barrier would use the detection of Xss to determine how big a pipeline could be. This however doesn't reduce the need to still handle a StackOverflow if it happens.
* The catching of StackOverflowError could also be done on the WorkerLoop. However I'd suggest that this is unrelated to the Worker initialization itself, it just so happens that compiledPipeline.buildExecution is computed inside the WorkerLoop class for performance reasons. So I'd prefer logging to not come from the existing catch, but from a dedicated catch clause.

Solves #16320

* Doc: Reposition worker-utilization in doc (#16335)

* settings: add support for observing settings after post-process hooks (#16339)

Because logging configuration occurs after loading the `logstash.yml`
settings, deprecation logs from `LogStash::Settings::DeprecatedAlias#set` are
effectively emitted to a null logger and lost.

By re-emitting after the post-process hooks, we can ensure that they make
their way to the deprecation log. This change adds support for any setting
that responds to `Object#observe_post_process` to receive it after all
post-processing hooks have been executed.

Resolves: elastic/logstash#16332

* fix line used to determine ES is up (#16349)

* add retries to snyk buildkite job (#16343)

* Fix 8.13.1 release notes (#16363)

make a note of the fix that went to 8.13.1: #16026

Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>

* Update logstash_releases.json (#16347)

* [Bugfix] Resolve the array and char (single | double quote) escaped values of ${ENV} (#16365)

* Properly resolve the values from ENV vars if literal array string provided with ENV var.

* Docker acceptance test for persisting  keys and use actual values in docker container.

* Review suggestion.

Simplify the code by stripping whitespace before `gsub`, no need to check comma and split.

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

---------

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* Doc: Add SNMP integration to breaking changes (#16374)

* deprecate java less-than 17 (#16370)

* Exclude substitution refinement on pipelines.yml (#16375)

* Exclude substitution refinement on pipelines.yml (applies on ENV vars and logstash.yml where env2yaml saves vars)

* Safety integration test for pipeline config.string contains ENV .

* Doc: Forwardport 8.15.0 release notes to main (#16388)

* Removing 8.14 from ci/branches.json as we have 8.15. (#16390)

---------

Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Squashed merge from 8.x

* Failure injector plugin implementation. (#16466)

* Test purpose only failure injector integration (filter and output) plugins implementation. Add unit tests and include license notes.

* Fix the degrate method name typo.

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* Add explanation to the config params and rebuild plugin gem.

---------

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* Health report integration tests bootstrapper and initial tests implementation (#16467)

* Health Report integration tests bootstrapper and initial slow start scenario implementation.

* Apply suggestions from code review

Renaming expectation check method name.

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

* Changed to branch concept, YAML structure simplified as changed to Dict.

* Apply suggestions from code review

Reflect `help_url` to the integration test.

---------

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

* health api: expose `GET /_health_report` with pipelines/*/status probe (#16398)

Adds a `GET /_health_report` endpoint with per-pipeline status probes, and wires the
resulting report status into the other API responses, replacing their hard-coded `green`
with a meaningful status indication.

---------

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* docs: health report API, and diagnosis links (feature-targeted) (#16518)

* docs: health report API, and diagnosis links

* Remove plus-for-passthrough markers

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

---------

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* merge 8.x into feature branch... (#16519)

* Add GH vault plugin bot to allowed list (#16301)

* regenerate webserver test certificates (#16331)

* correctly handle stack overflow errors during pipeline compilation (#16323)

This commit improves error handling when pipelines that are too big hit the Xss limit and throw a StackOverflowError. Currently the exception is printed outside of the logger, and doesn’t even show if log.format is json, leaving the user to wonder what happened.

A couple of thoughts on the way this is implemented:

* There should be a first barrier to handle pipelines that are too large based on the PipelineIR compilation. The barrier would use the detection of Xss to determine how big a pipeline could be. This however doesn't reduce the need to still handle a StackOverflow if it happens.
* The catching of StackOverflowError could also be done on the WorkerLoop. However I'd suggest that this is unrelated to the Worker initialization itself, it just so happens that compiledPipeline.buildExecution is computed inside the WorkerLoop class for performance reasons. So I'd prefer logging to not come from the existing catch, but from a dedicated catch clause.

Solves #16320

* Doc: Reposition worker-utilization in doc (#16335)

* settings: add support for observing settings after post-process hooks (#16339)

Because logging configuration occurs after loading the `logstash.yml`
settings, deprecation logs from `LogStash::Settings::DeprecatedAlias#set` are
effectively emitted to a null logger and lost.

By re-emitting after the post-process hooks, we can ensure that they make
their way to the deprecation log. This change adds support for any setting
that responds to `Object#observe_post_process` to receive it after all
post-processing hooks have been executed.

Resolves: elastic/logstash#16332

* fix line used to determine ES is up (#16349)

* add retries to snyk buildkite job (#16343)

* Fix 8.13.1 release notes (#16363)

make a note of the fix that went to 8.13.1: #16026

Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>

* Update logstash_releases.json (#16347)

* [Bugfix] Resolve the array and char (single | double quote) escaped values of ${ENV} (#16365)

* Properly resolve the values from ENV vars if literal array string provided with ENV var.

* Docker acceptance test for persisting  keys and use actual values in docker container.

* Review suggestion.

Simplify the code by stripping whitespace before `gsub`, no need to check comma and split.

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

---------

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* Doc: Add SNMP integration to breaking changes (#16374)

* deprecate java less-than 17 (#16370)

* Exclude substitution refinement on pipelines.yml (#16375)

* Exclude substitution refinement on pipelines.yml (applies on ENV vars and logstash.yml where env2yaml saves vars)

* Safety integration test for pipeline config.string contains ENV .

* Doc: Forwardport 8.15.0 release notes to main (#16388)

* Removing 8.14 from ci/branches.json as we have 8.15. (#16390)

* Increase Jruby -Xmx to avoid OOM during zip task in DRA (#16408)

Fix: #16406

* Generate Dataset code with meaningful fields names (#16386)

This PR is intended to help Logstash developers or users that want to better understand the code that's autogenerated to model a pipeline, assigning more meaningful names to the Datasets subclasses' fields.

Updates `FieldDefinition` to receive the name of the field from construction methods, so that it can be used during the code generation phase, instead of the existing incremental `field%n`.
Updates `ClassFields` to propagate the explicit field name down to the `FieldDefinitions`.
Update the `DatasetCompiler` that add fields to `ClassFields` to assign a proper name to generated Dataset's fields.

* Implements safe evaluation of conditional expressions, logging the error without killing the pipeline (#16322)

This PR protects the if statements against expression evaluation errors, cancel the event under processing and log it.
This avoids to crash the pipeline which encounter a runtime error during event condition evaluation, permitting to debug the root cause reporting the offending event and removing from the current processing batch.

Translates the `org.jruby.exceptions.TypeError`, `IllegalArgumentException`, `org.jruby.exceptions.ArgumentError` that could happen during `EventCodition` evaluation into a custom `ConditionalEvaluationError` which bubbles up on AST tree nodes. It's catched in the `SplitDataset` node.
Updates the generation of the `SplitDataset `so that the execution of `filterEvents` method inside the compute body is try-catch guarded and defer the execution to an instance of `AbstractPipelineExt.ConditionalEvaluationListener` to handle such error. In this particular case the error management consist in just logging the offending Event.

---------

Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>

* Update logstash_releases.json (#16426)

* Release notes for 8.15.1 (#16405) (#16427)

* Update release notes for 8.15.1

* update release note

---------

Co-authored-by: logstashmachine <43502315+logstashmachine@users.noreply.github.com>
Co-authored-by: Kaise Cheng <kaise.cheng@elastic.co>
(cherry picked from commit 2fca7e39e8)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fix ConditionalEvaluationError to do not include the event that errored in its serialiaxed form, because it's not expected that this class is ever serialized. (#16429) (#16430)

Make inner field of ConditionalEvaluationError transient to be avoided during serialization.

(cherry picked from commit bb7ecc203f)

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* use gnu tar compatible minitar to generate tar artifact (#16432) (#16434)

Using VERSION_QUALIFIER when building the tarball distribution will fail since Ruby's TarWriter implements the older POSIX88 version of tar and paths will be longer than 100 characters.

For the long paths being used in Logstash's plugins, mainly due to nested folders from jar-dependencies, we need the tarball to follow either the 2001 ustar format or gnu tar, which is implemented by the minitar gem.

(cherry picked from commit 69f0fa54ca)

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* account for the 8.x in DRA publishing task (#16436) (#16440)

the current DRA publishing task computes the branch from the version
contained in the version.yml

This is done by taking the major.minor and confirming that a branch
exists with that name.

However this pattern won't be applicable for 8.x, as that branch
currently points to 8.16.0 and there is no 8.16 branch.

This commit falls back to reading the buildkite injected
BUILDKITE_BRANCH variable.

(cherry picked from commit 17dba9f829)

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* Fixes the issue where LS wipes out all quotes from docker env variables. (#16456) (#16459)

* Fixes the issue where LS wipes out all quotes from docker env variables. This is an issue when running LS on docker with CONFIG_STRING, needs to keep quotes with env variable.

* Add a docker acceptance integration test.

(cherry picked from commit 7c64c7394b)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Known issue for 8.15.1 related to env vars references (#16455) (#16469)

(cherry picked from commit b54caf3fd8)

Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>

* bump .ruby_version to jruby-9.4.8.0 (#16477) (#16480)

(cherry picked from commit 51cca7320e)

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

* Release notes for 8.15.2 (#16471) (#16478)

Co-authored-by: andsel <selva.andre@gmail.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
(cherry picked from commit 01dc76f3b5)

* Change LogStash::Util::SubstitutionVariables#replace_placeholders refine argument to optional (#16485) (#16488)

(cherry picked from commit 8368c00367)

Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com>

* Use jruby-9.4.8.0 in exhaustive CIs. (#16489) (#16491)

(cherry picked from commit fd1de39005)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Don't use an older JRuby with oraclelinux-7 (#16499) (#16501)

A recent PR (elastic/ci-agent-images/pull/932) modernized the VM images
and removed JRuby 9.4.5.0 and some older versions.

This ended up breaking exhaustive test on Oracle Linux 7 that hard coded
JRuby 9.4.5.0.

PR https://github.com/elastic/logstash/pull/16489 worked around the
problem by pinning to the new JRuby, but actually we don't
need the conditional anymore since the original issue
https://github.com/jruby/jruby/issues/7579#issuecomment-1425885324 has
been resolved and none of our releasable branches (apart from 7.17 which
uses `9.2.20.1`) specify `9.3.x.y` in `/.ruby-version`.

Therefore, this commit removes conditional setting of JRuby for
OracleLinux 7 agents in exhaustive tests (and relies on whatever
`/.ruby-version` defines).

(cherry picked from commit 07c01f8231)

Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>

* Improve pipeline bootstrap error logs (#16495) (#16504)

This PR adds the cause errors details on the pipeline converge state error logs

(cherry picked from commit e84fb458ce)

Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com>

* Logstash Health Report Tests Buildkite pipeline setup. (#16416) (#16511)

(cherry picked from commit 5195332bc6)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Make health report test runner script executable. (#16446) (#16512)

(cherry picked from commit 2ebf2658ff)

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* Backport PR #16423 to 8.x: DLQ-ing events that trigger an conditional evaluation error. (#16493)

* DLQ-ing events that trigger an conditional evaluation error. (#16423)

When a conditional evaluation encounter an error in the expression the event that triggered the issue is sent to pipeline's DLQ, if enabled for the executing pipeline.

This PR engage with the work done in #16322, the `ConditionalEvaluationListener` that is receives notifications about if-statements evaluation failure, is improved to also send the event to DLQ (if enabled in the pipeline) and not just logging it.

(cherry picked from commit b69d993d71)

* Fixed warning about non serializable field DeadLetterQueueWriter in serializable AbstractPipelineExt

---------

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

* add deprecation log for `--event_api.tags.illegal` (#16507) (#16515)

- move `--event_api.tags.illegal` from option to deprecated_option
- add deprecation log when the flag is explicitly used
relates: #16356

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
(cherry picked from commit a4eddb8a2a)

Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>

---------

Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>
Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com>
Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>

---------

Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co>
Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com>
Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>
(cherry picked from commit 7eb5185b4e)

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
2024-10-10 11:17:14 -07:00
kaisecheng
a4eddb8a2a
add deprecation log for --event_api.tags.illegal (#16507)
- move `--event_api.tags.illegal` from option to deprecated_option
- add deprecation log when the flag is explicitly used
relates: #16356

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
2024-10-08 13:49:59 +01:00
Andrea Selva
5d4825f000
Avoid to access Java DeprecatedAlias value other than Ruby's one (#16506)
Update Settings to_hash method to also skip Java DeprecatedAlias and not just the Ruby one.
With PR #15679 was introduced org.logstash.settings.DeprecatedAlias which mirrors the behaviour of Ruby class Setting::DeprecatedAlias. The equality check at Logstash::Settings, as descibed in #16505 (comment), is implemented comparing the maps.
The conversion of Settings to the corresponding maps filtered out the Ruby implementation of DeprecatedAlias but not the Java one.
This PR adds also the Java one to the list of filter.
2024-10-04 16:46:00 +02:00
Edmo Vamerlatti Costa
e84fb458ce
Improve pipeline bootstrap error logs (#16495)
This PR adds the cause errors details on the pipeline converge state error logs
2024-10-03 11:08:42 +02:00
Andrea Selva
4e49adc6f3
Fix jdk21 warnings (#16496)
Suppress some warnings compared with JDK 21

- this-escape uses this before it is completely initialised.
- avoid a non serialisable DeadLetterQueueWriter field from serialisable instance.
2024-10-02 15:28:37 +02:00
Andrea Selva
b69d993d71
DLQ-ing events that trigger an conditional evaluation error. (#16423)
When a conditional evaluation encounter an error in the expression the event that triggered the issue is sent to pipeline's DLQ, if enabled for the executing pipeline.

This PR engage with the work done in #16322, the `ConditionalEvaluationListener` that is receives notifications about if-statements evaluation failure, is improved to also send the event to DLQ (if enabled in the pipeline) and not just logging it.
2024-10-02 12:23:54 +02:00
Andrea Selva
61de60fe26
[Spacetime] Reimplement config Setting classe in java (#15679)
Reimplement the root Ruby Setting class in Java and use it from the Ruby one moving the original Ruby class to a shell wrapping the Java instance.
In particular create a new symmetric hierarchy (at the time just for `Setting`, `Coercible` and `Boolean` classes) to the Ruby one, moving also the feature for setting deprecation. In this way the new `org.logstash.settings.Boolean` is syntactically and semantically equivalent to the old Ruby Boolean class, which replaces.
2024-10-02 09:09:47 +02:00
Edmo Vamerlatti Costa
8368c00367
Change LogStash::Util::SubstitutionVariables#replace_placeholders refine argument to optional (#16485) 2024-10-01 11:33:35 -07:00
Mashhur
7c64c7394b
Fixes the issue where LS wipes out all quotes from docker env variables. (#16456)
* Fixes the issue where LS wipes out all quotes from docker env variables. This is an issue when running LS on docker with CONFIG_STRING, needs to keep quotes with env variable.

* Add a docker acceptance integration test.
2024-09-17 06:46:19 -07:00
Andrea Selva
1ec37b7c41
Drop JDK 11 support (#16443)
If a user runs Logstash with a hosted JDK and not the one bundled with Logstash distribution, like setting a specific LS_JAVA_HOME, which is minor than JDK 17 then Logstash refuses to start. Has to provide at least a JDK 17 or unset the LS_JAVA_HOME and let Logstash uses the bundled JDK.

Updates the jvm.options and JvmOptionsParser to remove support for JDK 11. If the options parser identifies that the running JVM is less than 17, it refuses to start.

---------

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
2024-09-13 17:33:16 +02:00
Andrea Selva
bb7ecc203f
Fix ConditionalEvaluationError to do not include the event that errored in its serialiaxed form, because it's not expected that this class is ever serialized. (#16429)
Make inner field of ConditionalEvaluationError transient to be avoided during serialization.
2024-09-06 12:09:58 +02:00
Andrea Selva
b88e23702c
Implements safe evaluation of conditional expressions, logging the error without killing the pipeline (#16322)
This PR protects the if statements against expression evaluation errors, cancel the event under processing and log it.
This avoids to crash the pipeline which encounter a runtime error during event condition evaluation, permitting to debug the root cause reporting the offending event and removing from the current processing batch.

Translates the `org.jruby.exceptions.TypeError`, `IllegalArgumentException`, `org.jruby.exceptions.ArgumentError` that could happen during `EventCodition` evaluation into a custom `ConditionalEvaluationError` which bubbles up on AST tree nodes. It's catched in the `SplitDataset` node.
Updates the generation of the `SplitDataset `so that the execution of `filterEvents` method inside the compute body is try-catch guarded and defer the execution to an instance of `AbstractPipelineExt.ConditionalEvaluationListener` to handle such error. In this particular case the error management consist in just logging the offending Event.


---------

Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
2024-09-05 10:57:10 +02:00
Andrea Selva
ac034a14ee
Generate Dataset code with meaningful fields names (#16386)
This PR is intended to help Logstash developers or users that want to better understand the code that's autogenerated to model a pipeline, assigning more meaningful names to the Datasets subclasses' fields.

Updates `FieldDefinition` to receive the name of the field from construction methods, so that it can be used during the code generation phase, instead of the existing incremental `field%n`.
Updates `ClassFields` to propagate the explicit field name down to the `FieldDefinitions`.
Update the `DatasetCompiler` that add fields to `ClassFields` to assign a proper name to generated Dataset's fields.
2024-09-04 11:10:29 +02:00
Mashhur
e104704830
Exclude substitution refinement on pipelines.yml (#16375)
* Exclude substitution refinement on pipelines.yml (applies on ENV vars and logstash.yml where env2yaml saves vars)

* Safety integration test for pipeline config.string contains ENV .
2024-08-09 09:33:01 -07:00
Ry Biesemeyer
3d13ebe33e
deprecate java less-than 17 (#16370) 2024-08-09 08:58:11 +01:00
Mashhur
62ef8a0847
[Bugfix] Resolve the array and char (single | double quote) escaped values of ${ENV} (#16365)
* Properly resolve the values from ENV vars if literal array string provided with ENV var.

* Docker acceptance test for persisting  keys and use actual values in docker container.

* Review suggestion.

Simplify the code by stripping whitespace before `gsub`, no need to check comma and split.

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>

---------

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
2024-08-06 11:09:26 -07:00
Ry Biesemeyer
c633ad2568
settings: add support for observing settings after post-process hooks (#16339)
Because logging configuration occurs after loading the `logstash.yml`
settings, deprecation logs from `LogStash::Settings::DeprecatedAlias#set` are
effectively emitted to a null logger and lost.

By re-emitting after the post-process hooks, we can ensure that they make
their way to the deprecation log. This change adds support for any setting
that responds to `Object#observe_post_process` to receive it after all
post-processing hooks have been executed.

Resolves: elastic/logstash#16332
2024-07-24 10:22:34 +01:00
João Duarte
8f2dae618c
correctly handle stack overflow errors during pipeline compilation (#16323)
This commit improves error handling when pipelines that are too big hit the Xss limit and throw a StackOverflowError. Currently the exception is printed outside of the logger, and doesn’t even show if log.format is json, leaving the user to wonder what happened.

A couple of thoughts on the way this is implemented:

* There should be a first barrier to handle pipelines that are too large based on the PipelineIR compilation. The barrier would use the detection of Xss to determine how big a pipeline could be. This however doesn't reduce the need to still handle a StackOverflow if it happens.
* The catching of StackOverflowError could also be done on the WorkerLoop. However I'd suggest that this is unrelated to the Worker initialization itself, it just so happens that compiledPipeline.buildExecution is computed inside the WorkerLoop class for performance reasons. So I'd prefer logging to not come from the existing catch, but from a dedicated catch clause.

Solves #16320
2024-07-18 10:08:38 +01:00
Ry Biesemeyer
66aeeeef83
Json normalization performance (#16313)
* licenses: allow elv2, standard abbreviation for Elastic License version 2

* json-dump: reduce unicode normalization cost

Since the underlying JrJackson now properly (and efficiently) encodes the
UTF-8 transcode of whichever strings it is given, we no longer need to
pre-normalize to UTF-8 in ruby _except_ when the string is flagged as BINARY
because we have alternate behaviour to preserve valid UTF-8 sequences.

By emitting a _copy_ of binary-flagged strings that have been re-flagged as
UTF-8, we allow the downstream (efficient) encoding operation in jrjackson
to produce equivalent behaviour at much lower cost.

* cleanup: remove orphan unicode normalizer
2024-07-09 14:12:21 -07:00
João Duarte
121b1c9632
update jruby to 9.4.8.0 (#16278)
https://www.jruby.org/2024/07/02/jruby-9-4-8-0.html

> Fixed a bug in the bytecode JIT causing patterns to execute incorrect branches. #8283, #8284
> jruby-openssl is updated to 0.15.0, with updated Bouncy Castle libraries to avoid CVEs in older versions.
> uri is updated to 0.12.2, mitigating CVE
> net-ftp is updated to 0.3.7 with restored functionality on JRuby.

Exhaustive test suite: https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/580
2024-07-02 19:57:55 +01:00
Edmo Vamerlatti Costa
784fa186c8
Ensure pipeline metrics are cleared on the pipeline shutdown (#16264)
This commit fixed the configuration reload process to clean up the pipeline's metric store, so it does not retain references to failed pipelines components.
2024-06-28 13:13:39 +02:00
Mashhur
0cfe6b0801
Add RubyEvent#dup support and unit test case to keep Json#dump(Event) safe. (#16255)
* Add RubyEvent#dup support and unit test case to keep Json#dump(Event) safe.


Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>

---------

Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
2024-06-27 13:08:56 -07:00
Ry Biesemeyer
0ec16ca398
Unicode pipeline and plugin ids (#15971)
* fix: restore support for unicode pipeline- and plugin-id's

JRuby's `Ruby#newSymbol(String)` throws an exception when provided a `String`
that contains characters outside of lower-ASCII because JRuby internals expect
"the incoming String to be one of our mangled ISO-8859-1 strings" as noted in
a comment on jruby/jruby#6217.

Instead, we use `Ruby#newString(String)` to create a new `RubyString` (which
works properly), and then rely on `RubyString#intern` to get our `RubySymbol`.

This fixes a regression introduced in the 8.7 series in which pipeline id's
are consistently represented as ruby symbols in the metrics store, and ensures
similar issue does not exist when specifying a plugin id that contains
characters above the lower-ASCII plane.

* fix: use properly-encoded RubySymbol in PipelineConfig

We cannot rely on `RubySymbol#toString` to produce a properly-encoded `String`
whe the string contains characters above the lower-ASCII plane because the
result is effectively a binary ruby-internal marshal of the bytes that only
holds when the symbol contains lower-ASCII.

Instead, we can use the internally-memoizing `RubySymbol#name` to get a
properly-encoded `RubyString`, and `RubyString#asJavaString()` to get a
properly-encoded java-`String`.

* fix: properly serialize unicode pipeline names in API output

Jackson's JSON serializer leaks the JRuby-internal byte structure of Symbols,
which only aligns with the byte-structure of the symbol's actual string when
that string is wholly-comprised of lower-ASCII characters.

By pre-converting Symbols to Strings, we ensure that the result is readable
and useful.

* spec: bypass monitoring specs for unicode pipeline ids when PQ enabled
2024-06-25 08:35:28 -07:00
Ry Biesemeyer
92909cb1c4
json: remove unnecessary dup/freeze in serialization (#16213) 2024-06-20 09:15:49 -07:00
Andrea Selva
321e407e53
Avoid to log file not found errors when DLQ segments are removed concurrently between writer and reader. (#16204)
* Rework the logic to delete DLQ eldest segments to be more resilient on file not found errors and avoid to log warn messages that there isn't any action the user can do to solve.

* Fixed test case, when path point to a file that doesn't exist, rely always on path name comparator. Reworked the code to simplify, not needing anymore the tri-state variable
2024-06-20 08:52:19 -07:00
Andrea Selva
ed930f820d
Avoid mocking the value returned in global SETTINGS constant. (#16245)
This a refactoring of test fixture.
Avoid mocking the value returned in global SETTINGS constant. Use instead the local setting map instance used in subject creation.
2024-06-20 14:25:53 +02:00
Ry Biesemeyer
0f6fa5c8fb
p2p: adds opt-in pipeline bus with less synchronization (#16194)
* p2p: extract interface from v1 pipeline bus

* p2p: extract pipeline push to abstract

* p2p: add opt-in unblocked "v2" implementation

Adds a v2 implementation that does not synchronize on the sender so that
multiple workers can send events through a common `pipeline` output instance
simultaneously.

In this implementation, an `AddressStateMapping` provides synchronized
mutation and cleanup of the underlying `AddressState`, and allows only
queryable mutable views (`AddressState.ReadOnly`) to escape encapsulation.

The implementation also holds indentity-keyed mapping from `PipelineOutput`s
to the set of `AddressState.ReadOnly`s it is regested as a sender for so
that they can be quickly resolved at runtime.

* p2p: more tests for pipeline restart behaviour

* p2p: make v2 pipeline bus the default
2024-06-17 07:35:54 -07:00
Andrea Selva
fab345881a
Introduce filesystem signalling from DLQ read to writer to update byte size metric accordingly when the reader uses clean_consumed (#16195)
Updates the DLQ reader to create a notification file (`.deleted_segment`) which signal when a segment is deleted in consequence of `clean_consumed` set. Updates the DLQ writer to have a filesystem watch so that can receive the reader's signal and update the exposed metric,  loading the size by listing FS segments occupation.
2024-06-17 14:27:39 +02:00
Andrea Selva
efa83787a5
Revert PR #16050
The PR was created to skip resolving environment variable references in comments present in the “config.string” pipelines defined in the pipelines.yml file.
However it introduced a bug that no longer resolves env var references in values of settings like pipeline.batch.size or queue.max_bytes.
For now we’ll revert this PR and create a fix that handles both problems.
2024-06-06 20:24:45 +01:00
Ry Biesemeyer
ea930861ef
PQ: avoid blocking writer when precisely full (#16176)
* pq: avoid blocking writer when queue is precisely full

A PQ is considered full (and therefore needs to block before releasing the
writer) when its persisted size on disk _exceeds_ its `queue.max_bytes`
capacity.

This removes an edge-case preemptive block when the persisted size after
writing an event _meets_ its `queue.max_bytes` precisely AND its current
head page has insufficient room to also accept a hypothetical future event.

Fixes: elastic/logstash#16172

* docs: PQ `queue.max_bytes` cannot be less than `queue.page_capacity`
2024-05-22 08:23:18 -07:00