Commit graph

1654 commits

Andrea Selva
05bfaff799
Avoid the wrapping of LogstashMessageFactory with log4j's MessageFactory2Adapter (#14727)
Starting with Log4j2 2.6, if the MessageFactory associated with a Logger instance
is not a subclass of MessageFactory2, it gets wrapped with a MessageFactory2Adapter.

This triggers a noisy log4j warning, reporting that a Logger is not associated with
the default message factory (LogstashMessageFactory), every time a subclass of LogStash::Plugin, for example, is instantiated.

This commit adapts LogstashMessageFactory to implement MessageFactory2 instead of the older MessageFactory, avoiding the wrapping with the adapter class.
2022-11-09 10:16:34 +01:00
Ry Biesemeyer
372a61219f
Fix pipelines yaml loading (#14713)
* source/multilocal: fix detection of empty pipelines.yml

Fixes a regression introduced in elastic/logstash#13883 in which the presence
of an empty `pipelines.yml` file produces an error message indicating that
the file cannot be read.

When either `YAML::load` or `YAML::safe_load` encounter an effectively-empty
payload (such as one that is entirely comments), they use a `fallback` param
to determine what value to emit, with the former emitting `false` and the
latter emitting `nil`.

This is problematic because a _separate_ blind-`rescue nil` causes `nil` to
be bound to the MultiLocal's `@detected_marker`, and we assume that a `nil`
value in the marker means that there was an exception reading the file (such
as a permissions issue or parse failure).

By providing a `fallback: false` directive when parsing the contents, we
ensure that an empty file is reported as such.
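The `fallback` behaviour described above can be reproduced directly with Ruby's Psych-backed YAML module (a minimal illustration, not the MultiLocal code itself):

```ruby
require 'yaml'

# an effectively-empty pipelines.yml: nothing but comments
content = "# pipelines are defined elsewhere\n"

default_result  = YAML.safe_load(content)                  # => nil
explicit_result = YAML.safe_load(content, fallback: false) # => false
```

With `fallback: false`, the empty-file case becomes distinguishable from the `nil` that is otherwise taken to mean a read or parse failure.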

* source/multilocal: avoid `rescue nil` that loses helpful context

When the pipelines yaml cannot be read, or can be read but fails to parse,
the MultiLocal#read_pipelines_from_yaml emits a helpful exception including
specifics about why it failed to load or parse, but a blind `rescue nil`
here causes that helpful information to be lost.

When pipeline detection is exceptional, hold onto the helpful exception
so that it can be reported along with the config conflicts.

* source/multilocal: differentiate between reading and parsing failure

* source/multilocal: use translations for conflict messages

* source/multilocal: specs for error conditions
2022-11-02 11:05:01 -07:00
Mashhur
f19e9cb647
Collect queue growth events and bytes metrics when PQ is enabled. (#14554)
* Collect growth events and bytes metrics if PQ is enabled: Java changes.

* Move queue flow under queue namespace.

* Pipeline level PQ flow metrics: add unit & integration tests.

* Include queue info in node stats sample.

* Apply suggestions from code review

Change uptime precision for PQ growth metrics to uptime seconds since PQ events are based on seconds.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* Add safeguard when using lazy delegating gauge type.

* flow metrics: simplify generics of lazy implementation

Enables interface `FlowMetrics::create` to take suppliers that _implement_
a `Metric<? extends Number>` instead of requiring them to be pre-cast, and
avoid unnecessary exposure of the metrics value-type into our lazy init.

* flow metrics: use lazy init for PQ gauge-based metrics

* noop: use enum equality

Avoids routing two enum values through `MetricType#toString()`
and `String#equals()` when they can be compared directly.

* Apply suggestions from code review

Optional.ofNullable used for safe return. Doc includes real tested expected metric values.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* flow metrics: make lazy-init wrapper inherit from AbstractMetric

this allows the Jackson serialization annotations to work

* flow metrics: move pipeline queue-based flows into pipeline flow namespace

* Follow up for moving PQ growth metrics under pipeline.*.flow.
- Unit and integration tests are added or fixed.
- Documentation added along with sample response data

* flow: pipeline pq flow rates docs

* Do not expect flow in the queue section of API. Metrics moved to flow section.

Update logstash-core/spec/logstash/api/commands/stats_spec.rb

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* Integration test failure fix.

Mistake: `flow_status` should be `pipeline_flow_stats`

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* Integration test failures fix.

Number should be Numeric in the ruby specs.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* Make CI happy.

* api specs: use PQ only where needed

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
2022-10-13 15:30:31 -07:00
João Duarte
00a7ae8a75
fix PipelineIR.getPostQueue by accounting for vertex copies (#13621)
During graph composition vertices may be copied. This caused
getPostQueue to malfunction as the QueueVertex object stored in the
PipelineIR isn't the one present in the graph once it's fully generated.

This object mismatch caused Graph.getSortedVerticesBetween to not find
the QueueVertex since it takes Objects instead of ids.

This commit waits for the graph to be built and then retrieves the
QueueVertex from the graph and sets it in PipelineIR.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
2022-10-12 16:19:05 +01:00
Ry Biesemeyer
bab2e1c03e
timestamp: respect locale's decimal-style when parsing (#14628)
Uses the locale-defined decimal style first.

When encountering a failure and the locale-defined decimal style is NOT
the "standard" decimal style, retry the parse operation with the "standard"
decimal style.
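The retry pattern can be sketched in Ruby (an assumed simplification; the actual change lives in the Java timestamp parser):

```ruby
# Attempt to parse using the locale's decimal separator first; on
# failure, retry interpreting the "standard" '.' separator.
def parse_decimal(text, locale_sep)
  Float(text.tr(locale_sep, "."))  # locale-defined decimal style
rescue ArgumentError
  raise if locale_sep == "."       # locale style IS the standard style
  Float(text)                      # retry with the standard style
end

parse_decimal("3,14", ",")  # => 3.14
parse_decimal("3.14", ",")  # => 3.14
```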
2022-10-11 15:29:56 -07:00
Ry Biesemeyer
de49eba22a
api: source pipelines that are fully-loaded (#14595)
* specs: detangle out-of-band pipeline initialization

Our API tests were initializing their pipelines-to-test in an out-of-band
manner that prevented the agent from having complete knowledge of the
pipelines that were running. By providing a ConfigSource to our Agent's
SourceLoader, we can rely on the normal pipeline reload behaviour to ensure
that the agent fully-manages the pipelines in question.

* api: do not emit pipeline that is not fully-initialized
2022-10-11 08:14:00 -07:00
Andrea Selva
d07eb01e23
Adds a new close method to Java's Filter API, used at shutdown to clean up resources allocated by the filter during the registration phase. (#14485)
- Adds a new method to the public API interface
- Pass the call through the JavaFilterDelegatorExt

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
2022-10-10 17:35:29 +02:00
kaisecheng
8a8a036896
Fix DLQ failing to start due to a 1-byte file (#14605)
This commit ignores DLQ files that contain only the version number. These files have no content and should be skipped.

Fixed: #14599
2022-10-10 11:07:13 +01:00
Ry Biesemeyer
46babd6041
Extended Flow Metrics (#14571)
* flow metrics: extract to interface, sharable-common base, and implementation

In preparation of landing an additional implementation of FlowMetric, we
shuffle the current parts net-unchanged to provide interfaces for `FlowMetric`
and `FlowCapture`, along with a sharable-common `BaseFlowMetric`, and move
our initial implementation to a new `SimpleFlowMetric`, accessible only
through a static factory method on our new `FlowMetric` interface.

* flow-rates: refactor LIFETIME up to sharable base

* util: add SetOnceReference

* flow metrics: tolerate unavailable captures

While the metrics we capture in the initial release of FlowMetrics
are all backed by `Metric<T extends Number>` whose values are non-null,
we will need to capture from nullable `Gauge<Number>` in order to
support persistent queue size and capacity metrics. This refactor uses
the newly-introduced `SetOnceReference` to defer our baseline lifetime
capture until one is available, and ensures `BaseFlowMetric#doCapture`
creates a capture if-and-only-if non-null values are available from
the provided metrics.

* flow rates: limit precision for readability

* flow metrics: introduce policy-driven extended windows implementation

The new ExtendedFlowMetric is an alternate implementation of the FlowMetric
introduced in Logstash 8.5.0 that is capable of producing windows for a set of
policies, which dictate the desired retention for the rate along with a
desired resolution.

 - `current`: 10s retention, 1s resolution [*]
 - `last_1_minute`: one minute retention, at 3s resolution [*]
 - `last_5_minutes`: five minutes retention, at 15s resolution
 - `last_15_minutes`: fifteen minutes retention, at 30s resolution
 - `last_1_hour`: one hour retention, at 60s resolution
 - `last_24_hours`: one day retention at 15 minute resolution

A given series may report a range for slightly longer than its configured
retention period, up to either the series' configured resolution or
our capture rate (currently ~5s), whichever is greater. This approach
allows us to retain sufficient data-points to present meaningful rolling
averages while ensuring that our memory footprint is bounded.

When recording these captures, we first stage the newest capture, and then
promote the previously-staged capture to the tail of a linked list IFF
the gap between our new capture and the newest promoted capture is larger
than our desired resolution.
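The staging/promotion rule can be sketched in Ruby (an assumed simplification, with an array standing in for the linked list; the real implementation is Java):

```ruby
# Each record stages the newest capture; the previously-staged capture is
# promoted only when the new capture is more than `resolution` seconds
# newer than the newest promoted capture.
class Series
  attr_reader :promoted

  def initialize(resolution:)
    @resolution = resolution
    @promoted   = []   # stand-in for the linked list of promoted captures
    @staged     = nil
  end

  def record(timestamp)
    if @staged && (@promoted.empty? || timestamp - @promoted.last > @resolution)
      @promoted << @staged
    end
    @staged = timestamp
  end
end

s = Series.new(resolution: 3)
[0, 5, 10, 15].each { |t| s.record(t) }  # ~5s capture cadence
s.promoted  # => [0, 5, 10]
```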

When _reading_ these rates, we compact the head of that linked list forward
in time as far as possible without crossing the desired retention barrier,
at which point the head points to the youngest record that is old enough
to satisfy the period for the series.

We also occasionally compact the head during writes, but only if the head
is significantly out-of-date relative to the allowed retention.

As implemented here, the extended flow rates are on by default, but can be
disabled by setting the JVM system property `-Dlogstash.flowMetric=simple`

* flow metrics: provide lazy-initialized implementation

* flow metrics: append lifetime baseline if available during init

* flow metric tests: continuously monitor combined capture count

* collection of unrelated minor code-review fixes

* collection of even more unrelated minor code-review fixes
2022-10-06 18:35:33 -07:00
Ry Biesemeyer
228030c494
Simplify Pipeline class Hierarchy (#14551)
* refactor: pull members up from JavaBasePipelineExt to AbstractPipelineExt

* refactor: make `LogStash::JavaPipeline` inherit directly from `AbstractPipeline`
2022-09-26 18:16:20 -07:00
Ry Biesemeyer
6e0b365c92
Feature: flow metrics integration (#14518)
* Flow metrics: initial implementation (#14509)

* metrics: eliminate race condition when registering metrics

Ensure our fast-lookup and store tables cannot diverge in a race condition
by wrapping mutation of both in a single mutex and appropriately handle
another thread winning the race to the lock by using the value that it
persisted instead of writing our own.

* metrics: guard against intermediate namespace conflicts

 - ensures our safeguard that prevents using an existing metric as a namespace
   is applied to _intermediate_ nodes, not just the tail-node, eliminating a
   potential crash when sending `fetch_or_store` to a metric object that is not
   expected to respond to `fetch_or_store`.
 - uses the atomic `Concurrent::Map#compute_if_absent` instead of the
   non-atomic `Concurrent::Map#fetch_or_store`, which is prone to
   last-write-wins during contention (as-written, this method is only
   executed under lock and not subject to contention)
 - uses `Enumerable#reduce` to eliminate the need for recursion
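The "use the value the winning thread persisted" pattern can be sketched in Ruby (a minimal illustration with a plain Mutex; the `MetricStore` name here is hypothetical):

```ruby
# Under the lock, prefer a value another thread already stored over the
# one we were about to create, so fast-lookup and store cannot diverge.
class MetricStore
  def initialize
    @lock  = Mutex.new
    @store = {}
  end

  def fetch_or_store(key)
    @lock.synchronize do
      @store[key] ||= yield   # if another thread won the race, reuse its value
    end
  end
end

store = MetricStore.new
a = store.fetch_or_store(:events) { Object.new }
b = store.fetch_or_store(:events) { Object.new }
a.equal?(b)  # => true: the second caller gets the persisted value
```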

* flow: introduce auto-advancing UptimeMetric

* flow: introduce FlowMetric with minimal current/lifetime rates

* flow: initialize pipeline metrics at pipeline start

* Controller and service layer implementation for flow metrics. (#14514)

* Controller and service layer implementation for flow metrics.

* Add flow metrics to unit test and benchmark cli definitions.

* flow: fix tests for metric types to accommodate new one

* Renaming concurrency and backpressure metrics.

Rename `concurrency` to `worker_concurrency` and `backpressure` to `queue_backpressure` to provide proper scope naming.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* metric: register flow metrics only when we have a collector (#14529)

the collector is absent when the pipeline is run in test with a
NullMetricExt, or when the pipeline is explicitly configured to
not collect metrics using `metric.collect: false`.

* Unit tests and integration tests added for flow metrics. (#14527)

* Unit tests and integration tests added for flow metrics.

* Node stat spec and pipeline spec metric updates.

* Metric keys statically imported, implicit error expectation added in metric spec.

* Fix node status API spec after renaming flow metrics.

* Removing flow metric from PipelinesInfo DS (used in periodic metric snapshot), integration QA updates.

* Rebasing with feature branch.

* metric: register flow metrics only when we have a collector

the collector is absent when the pipeline is run in test with a
NullMetricExt, or when the pipeline is explicitly configured to
not collect metrics using `metric.collect: false`.

* Apply suggestions from code review

Integration tests updated to test capturing the flow metrics.

* Flow metrics expectation updated in integration tests.

* flow: refine integration expectations for reloads/monitoring

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
Co-authored-by: Mashhur <mashhur.sattorov@gmail.com>

* metric: add ScaledView with sub-unit precision to UptimeMetric (#14525)

* metric: add ScaledView with sub-unit precision to UptimeMetric

By presenting a _view_ of our metric that maintains sub-unit precision,
we prevent jitter that can be caused by our periodic poller not running at
exactly our configured cadence.

This is especially important as the UptimeMetric is used as the _denominator_ of
several flow metrics, and a capture at 4.999s that truncates to 4s causes the
rate to be over-reported by ~25%.
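The over-reporting in question is plain arithmetic (illustrative numbers only):

```ruby
events = 5_000
uptime = 4.999                               # seconds, as captured by the poller

precise_rate   = events / uptime             # ~1000.2 events/sec
truncated_rate = events / uptime.to_i.to_f   # 1250.0 events/sec (denominator 4s)

over_report = truncated_rate / precise_rate - 1.0  # ~0.25, i.e. ~25% too high
```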

The `UptimeMetric.ScaledView` implements `Metric<Number>`, so its full
lossless `BigDecimal` value is accessible to our `FlowMetric` at query time.

* metrics: reduce window for too-frequent-captures bug and document it

* fixup: provide mocked clock to flow metric

* Flow metrics cleanup (#14535)

* flow metrics: code-style and readability pass

* remove unused imports

* cleanup: simplify usage of internal helpers

* flow: migrate internals to use OptionalDouble

* Flow metrics global (#14539)

* flow: add global top-level flows

* docs: add flow metrics

* Top level flow metrics unit tests added. (#14540)

* Top level flow metrics unit tests added.

* Add unit tests for config reloads, making sure top-level flow metrics don't get reset.

* Apply suggestions from code review

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* Validating against Hash test cases updated.

* For the safety check against exact type in unit tests.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>

* docs: section links and clarity in node stats API flow metrics

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: Mashhur <mashhur.sattorov@gmail.com>
2022-09-19 14:21:45 -07:00
João Duarte
4584b632fd
drop support for all DES ciphers in OpenSSL and simplify cipher list (#14499)
Remove support for DES-CBC3-SHA.

Also removes unnecessary exclusions for EDH-DSS-DES-CBC3-SHA, EDH-RSA-DES-CBC3-SHA and KRB5-DES-CBC3-SHA since there's already a "!DES" rule.
2022-09-07 13:12:27 +01:00
kaisecheng
3a78621109
fix SettingsImpl to remove hardcoded checkpointRetry (#14487)
SettingsImpl.checkpointRetry is hardcoded to false in the builder. Prior to this change, users were unable to set queue.checkpoint.retry to true to enable the Windows retry on PQ AccessDeniedException during checkpointing.

Fixed: #14486
2022-09-05 16:22:08 +01:00
João Duarte
9d16c3bce5
Stop pinning mustermann in logstash-core.gemspec (#14453) 2022-08-17 11:37:14 +01:00
João Duarte
24c675ccd8
remove sinatra constraint (#14446) 2022-08-17 10:04:16 +01:00
João Duarte
f377fd3e4f
update some java dependencies (#14377)
Update the following java dependencies:

* org.reflections:reflections
* commons-codec:commons-codec
* com.google.guava:guava
* com.google.googlejavaformat:google-java-format
* org.javassist:javassist

The goal of these updates is to not fall behind and avoid surprises when an upgrade is necessary due to a security issue.
2022-07-26 11:34:35 +01:00
Rob Bavey
cfbded232c
Clean up java plugin threadsafe/concurrency check (#14360)
Prior to this commit, the java filter delegator would return "java" to the
`#threadsafe?` call when determining whether a filter is considered thread safe.
This is correct for the _outputs_, where this prefix is used to construct the
appropriate delegator object, but not for filters, where a simple true/false is
required. This commit replaces the `getCurrency` call with an `isThreadsafe` call
to make it clearer that a boolean call is required.
2022-07-20 09:00:14 -04:00
Rob Bavey
88c3f95ffc
jruby-9.3 test fix for windows CI (#14331)
This test has been broken on Windows since the jruby 9.3 update, and is
similar to an issue we saw when referencing other classes in the
org.logstash.util namespace
2022-07-19 11:41:14 -04:00
João Duarte
90872fb6ff
ArcSight Module Broken (Invalid Type), Fixed (#13874)
The module is broken with the current version: the type needs to be changed from syslog to _doc to fix the issue.

* remove dangling setting and add arcsight index suffixes
* add tests for new suffix in arcsight module

Co-authored-by: Tobias Schröer <tobias@schroeer.ch>
2022-07-18 16:56:24 +01:00
Andrea Selva
39f39658a1
Create wrapper class to bridge the calls to a Ruby codec and present itself as a Java codec (#13523)
Opens the ability to use Ruby codecs inside Java plugins.
Java plugins need subclasses of the Java `co.elastic.logstash.api.Codec` class to work properly. This PR implements an adapter for Ruby codecs to be wrapped into a Java Codec subclass.

Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
2022-07-13 11:44:12 +02:00
Andrea Selva
394edbbd71
Expose DLQ deleted events count on reader side (#14336)
Increments the counters of consumed segments and consumed events every time a segment is completely read on the reader side.
Publishes the collected data to the input plugin through the newly added `SegmentListener` method `segmentsDeleted(numberOfSegments, numberOfEvents)`.
2022-07-13 09:13:51 +02:00
Andrea Selva
cfdd5d521d
Bugfix for DLQ when writer is pushing data and reader consuming. (#14333)
Bugfix for DLQ when writer and reader with clean_consumed feature are working on same queue.

The writer side keeps the DLQ size in an in-memory variable and, when the limit is exceeded, applies the storage policy, essentially dropping older or newer events.
If, on the opposite side of the DLQ, there is a reader with the clean_consumed feature enabled that deletes the segments it processes, then the writer side could have inconsistent information about the DLQ size.

This commit adds a filesystem check when the "DLQ full" condition is reached, to avoid considering false positives as valid data.

Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
2022-07-11 09:57:54 +02:00
Andrea Selva
34588791fa
Added metric to count removed events when age retention kicks in (#14324)
Adds the metric `expired_events` to node stats to track the number of events removed by the age retention pruning policy.
In the _node/stats API, the document under the path `pipelines.<pipeline upstream>.dead_letter_queue` is enriched with the new metric.
The number of expired events is incremented every time a segment file is deleted; this is done by opening the file and counting the `complete` and `start` records.
2022-07-08 14:39:01 +02:00
Andrea Selva
be87b0b878
Implement DLQ age retention policy (#14255)
Updates the DLQ writer's writeEvent method to clean up tail segments older than the duration period. This happens only if the setting dead_letter_queue.retain.age is configured.
To read the age of a segment it extracts the timestamp of the last (youngest) message in the segment.

The age is defined as a number followed by one of d, h, m, s, which stand for days, hours, minutes and seconds. If no suffix is given, seconds are assumed as the default unit.
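A hypothetical Ruby helper for that age notation might look like the following (the actual implementation is Java; the `age_to_seconds` name is invented for illustration):

```ruby
# Map each supported suffix to its length in seconds; a bare number is
# treated as seconds.
UNIT_SECONDS = { "d" => 86_400, "h" => 3_600, "m" => 60, "s" => 1 }

def age_to_seconds(spec)
  m = spec.match(/\A(\d+)([dhms])?\z/) or raise ArgumentError, "invalid age: #{spec}"
  Integer(m[1]) * UNIT_SECONDS.fetch(m[2] || "s")
end

age_to_seconds("5d")  # => 432_000
age_to_seconds("90")  # => 90
```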

Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
2022-06-30 18:09:15 +02:00
João Duarte
601c45f49d
allow any class in CBOR deserialization (#14312)
This should be followed by introducing a tighter validator
but the current one introduced in 8.3.0 was creating issues for users.
2022-06-29 10:16:45 +01:00
Andrea Selva
3b218a3ce7
Adds DLQ reader's lock to force single reader access to segments when clean_consumed feature is enabled (#14256)
Uses the FileLock mechanism, already used in the PQ, to force the single-consumer pattern when clean_consumed is enabled. This avoids the concurrency problem of multiple readers trying to delete each other's consumed tail segments.
2022-06-27 11:34:09 +02:00
Ry Biesemeyer
5e372fed91
Timestamp#toString(): ensure a minimum of 3 decimal places (#14299)
* Timestamp#toString(): ensure a minimum of 3 decimal places

Logstash 8 introduced internals for nano-precise timestamps, and began relying
on `java.time.format.DateTimeFormatter#ISO_INSTANT` to produce ISO8601-format
strings via `java.time.Instant#toString()`, resulting in a _variable length_
serialization that only includes sub-second digits when the Instant represents
a moment in time with fractional seconds.

By ensuring that a timestamp's serialization has a minimum of 3 decimal places,
we ensure that our output is backward-compatible with the equivalent timestamp
produced by Logstash 7.
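Ruby's `Time#iso8601` shows the same fixed-vs-variable-width distinction (for illustration only; the change itself is in the Java `Timestamp` class):

```ruby
require 'time'

t = Time.utc(2022, 6, 24, 15, 20, 7)  # a moment with no fractional seconds

t.iso8601     # => "2022-06-24T15:20:07Z"      (sub-second digits omitted)
t.iso8601(3)  # => "2022-06-24T15:20:07.000Z"  (always 3 decimal places)
```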

* timestamp serialization-related specs fixup
2022-06-24 15:20:07 -07:00
Andrea Selva
080c2f6253
Increase Gradle network timeouts to improve resiliency against network issues (#14283)
When an external repository has an IO error or generates network timeouts, the build fails to resolve external dependencies (plugins or libraries used by the project).
This commit increases those limits a little.
2022-06-21 15:01:08 +02:00
Karol Bucek
989f9e7937
Deps: un-pin (and avoid) rufus-scheduler (#14260)
+ Refactor: specific require + scope java_import
+ Refactor: redundant requires
+ Refactor: avoid rufus - hook up a timer task
2022-06-21 12:26:03 +02:00
Andrea Selva
7aa9d8e856
Fix/dlq avoid deleting non-segment files (#14274)
Restricts the listing of segment files to delete, during the reader's cleaning, to just completed segment files
2022-06-20 17:16:53 +02:00
Andrea Selva
2b88b5f29e
Print the pipeline id of the queue that's draining (#14272)
Log also the pipeline_id of the draining queue when the shutdown watcher is monitoring the shutdown of pipelines
2022-06-17 16:28:16 +02:00
Mashhur
d0c9aa8f48
File system mismatch when each pipeline uses separate file system. (#14212)
* Logstash checks a different file system if a pipeline has a symlink to another filesystem.

* Apply suggestions from code review

* FileAlreadyExistsException case handling when queue path is symlinked.

Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
2022-06-16 09:22:00 -07:00
Rob Bavey
64fb24fe4a
Pipeline->pipeline workaround for jruby-9.3.4.0 bug (#14266)
This commit replaces the use of a block with a lambda as an argument for Stream.forEach.
This is to work around the jruby issue identified in https://github.com/jruby/jruby/issues/7246.

This commit also updates the multiple_pipeline_spec to update the test case for pipeline->pipeline
communication to trigger the issue - it only occurs with Streams with more than one event in it.
2022-06-16 09:48:18 -04:00
kaisecheng
c725aabb49
Fix pq size checking to not stop the pipeline (#14258)
This commit changes the behavior of PQ size checking.
When checking the size usage, instead of throwing an exception that stops the pipeline,
it logs a warning message on every converge state if the check fails

Fixed: #14257

Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
2022-06-16 11:39:48 +01:00
Andrea Selva
fc13a4ce3e
Mark all not serializable fields as transient (#14240)
With JDK 18, Javac lint checking was expanded to raise an error on serializable subclasses that contain instance fields which are not themselves serializable.
This can be solved by suppressing the warning or marking the field as transient. The majority of classes with this problem inherit transitively from Serializable but are not intended to be serialized (through Java's serialization mechanism); they carry serialVersionUID = 1L just to keep the linter happy, and are never actually used in a serialization context.

This commit also removes a finalize method that could safely be removed.
2022-06-15 09:56:03 +02:00
Andrea Selva
de4f976527
Delete consumed DLQ segments and exposes API to acknowledge read events. (#14188)
Introduces a method (markForDelete) in the DeadLetterQueueReader class as a way to be notified when events read by the input plugin can be considered eligible for deletion. This mechanism is used to delete tail segments once they are completely consumed.
In addition, it exposes a listener interface that the client of this class registers to be notified when the deletion of a segment has happened. This is useful for the DLQ input plugin to know when it can flush the current position to the sinceDB.

This commit also deletes any segment files that are older than the current position; this happens during the setCurrentReaderAndPosition execution.

Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
2022-06-14 16:45:23 +02:00
kaisecheng
d63b6ae564
Fix exception of i18n in logstash-keystore (#14246)
This PR loads i18n in LogStash::Settings to fix an `uninitialized constant I18n` exception when using `logstash-keystore`
2022-06-13 18:12:04 +01:00
Andrea Selva
99e309fe7b
Avoid throwing an exception from a finally block (#14192)
Proposes a code cleanup of the FileLockFactory.obtainLock method. It removes the nesting of ifs by using a "precondition style" check, lifting the error condition up and checking it at the top of the code flow.

It also removes a thrown exception from the clean-up finally block.
If an exception is raised from that point, it hides the original cause of the problem.
This commit switches the finally block, used in obtaining a file lock, to a catch and re-throw.
2022-06-08 16:26:56 +02:00
Karol Bucek
d2b9b15bc1
Refactor: drop java.util.Collection#inspect extension (#14208)
since JRuby 9.3 a useful inspect is provided out of the box

LS' inspect: <Java::JavaUtil::ArrayList:3536147 ["some"]>
JRuby 9.3's: "#<Java::JavaUtil::ArrayList: [\"some\"]>"
2022-06-07 10:08:32 +02:00
Karol Bucek
433b341f0f
Refactor: avoid loading polyglot (#14175)
* Refactor: require treetop/runtime - avoids loading polyglot
* Build: instruct Bundler not to auto-load polyglot/treetop

+ Build: these deps are properly required as needed
all of them only used in one place (outside of normal bootstrap)
2022-06-07 08:28:56 +02:00
Karol Bucek
2b3e9a1832
Refactor: use Java API for String#split (#14207) 2022-06-07 08:13:13 +02:00
João Duarte
4d6942c240
update jackson and jackson-databind to 2.13.3 (#13945)
In jackson-databind 2.10, enabling Default Typing requires having a type validator, and while there's an "allow all" validator called LaissezFaireSubTypeValidator, this commit also tightens the validation a bit by narrowing down the allowed classes.

The default typing validator is only applied to the ObjectMapper for CBOR, which is used in the DLQ, leaving the one for JSON as-is.

Other changes:
* make ingest-converter use versions.yml for jackson-databind
* update jrjackson
2022-06-06 09:47:44 +01:00
Mashhur
886f1caed1
Fix deprecation logging of password policy. (#14159)
* Fix deprecation logging of password policy.
Gives users guidance not only on upgrading but also on keeping the current behavior if they really want it.

Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
2022-06-01 16:47:59 -07:00
kaisecheng
7f36665c09
Handle out-of-date firstUnackedPageNum in head checkpoint (#14147)
This commit adds a checkpoint for a fully acked page before purging to keep the checkpoint up-to-date
Fixed: #6592

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
2022-05-25 14:38:24 +01:00
Ry Biesemeyer
e3988f534a
Revert accidental (#14158)
* Revert "Improve log readability and remove extra logging."

This reverts commit 67cdbb050d.

* Revert "Password policy deprecation log fixup."

This reverts commit 46b71b1cd4.
2022-05-24 15:15:14 -07:00
mashhur
67cdbb050d Improve log readability and remove extra logging. 2022-05-24 14:58:40 -07:00
mashhur
46b71b1cd4 Password policy deprecation log fixup. 2022-05-24 14:44:05 -07:00
Ry Biesemeyer
d8454110ba
Field Reference: handle special characters (#14044)
* add failing tests for Event.new with fields that look like field references

* fix: correctly handle FieldReference-special characters in field names.

Keys passed to most methods of `ConvertedMap`, which is based on `IdentityHashMap`,
depend on identity and not equivalence, and therefore rely on the keys being
_interned_ strings. In order to avoid hitting the JVM's global String intern
pool (which can have performance problems), operations to normalize a string
to its interned counterpart have traditionally relied on the behaviour of
`FieldReference#from` returning a likely-cached `FieldReference`, that had
an interned `key` and an empty `path`.

This is problematic on two points.

First, when `ConvertedMap` was given data with keys that _were_ valid string
field references representing a nested field (such as `[host][geo][location]`),
the implementation of `ConvertedMap#put` effectively silently discarded the
path components because it assumed them to be empty, and only the key was
kept (`location`).

Second, when `ConvertedMap` was given a map whose keys contained what the
field reference parser considered special characters but _were NOT_
valid field references, the resulting `FieldReference.IllegalSyntaxException`
caused the operation to abort.

Instead of using the `FieldReference` cache, which sits on top of objects whose
`key` and `path`-components are known to have been interned, we introduce an
internment helper on our `ConvertedMap` that is also backed by the global string
intern pool, and ensure that our field references are primed through this pool.

In addition to fixing the `ConvertedMap#newFromMap` functionality, this has
three net effects:

 - Our ConvertedMap operations still use strings
   from the global intern pool
 - We have a new, smaller cache of individual field
   names, improving lookup performance
 - Our FieldReference cache no longer is flooded
   with fragments and therefore is more likely to
   remain performant

NOTE: this does NOT create isolated intern pools, as doing so would require
      a careful audit of the possible code-paths to `ConvertedMap#putInterned`.
      The new cache is limited to 10k strings, and when more are used only
      the FIRST 10k strings will be primed into the cache, leaving the
      remainder to always hit the global String intern pool.
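The identity-vs-equality distinction that motivates interning can be illustrated with Ruby's `Hash#compare_by_identity` (an analogue for demonstration only; `ConvertedMap` itself is Java):

```ruby
# An identity-keyed map only finds an entry via the very same object,
# which is why identity-keyed lookups depend on interned keys.
lookup = {}.compare_by_identity

interned = -"location"   # String#-@ returns the deduplicated frozen copy
lookup[interned] = :field

lookup["location".dup]   # => nil     (equal value, different object)
lookup[-"location"]      # => :field  (interning yields the same object)
```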

NOTE: by fixing this bug, we allow events to be created whose fields _CANNOT_
      be referenced with the existing FieldReference implementation.

Resolves: https://github.com/elastic/logstash/issues/13606
Resolves: https://github.com/elastic/logstash/issues/11608

* field_reference: support escape sequences

Adds a `config.field_reference.escape_style` option and a companion
command-line flag `--field-reference-escape-style` allowing a user
to opt into one of two proposed escape-sequence implementations for field
reference parsing:

 - `PERCENT`: URI-style `%`+`HH` hexadecimal encoding of UTF-8 bytes
 - `AMPERSAND`: HTML-style `&#`+`DD`+`;` encoding of decimal Unicode code-points

The default is `NONE`, which does _not_ process escape sequences.
With this setting a user effectively cannot reference a field whose name
contains FieldReference-reserved characters.

| ESCAPE STYLE | `[`     | `]`     |
| ------------ | ------- | ------- |
| `NONE`       | _N/A_   | _N/A_   |
| `PERCENT`    | `%5B`   | `%5D`   |
| `AMPERSAND`  | `&#91;` | `&#93;` |
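The `PERCENT` style matches ordinary URI escaping, and the `AMPERSAND` style matches HTML decimal character references, as Ruby's stdlib shows (an illustration, not the parser code itself):

```ruby
require 'cgi'

percent_open  = CGI.escape("[")             # => "%5B"
percent_close = CGI.escape("]")             # => "%5D"

ampersand_open  = format("&#%d;", "[".ord)  # => "&#91;"
ampersand_close = format("&#%d;", "]".ord)  # => "&#93;"
```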

* fixup: no need to double-escape HTML-ish escape sequences in docs

* Apply suggestions from code review

Co-authored-by: Karol Bucek <kares@users.noreply.github.com>

* field-reference: load escape style in runner

* docs: sentences over semicolons

* field-reference: faster shortcut for PERCENT escape mode

* field-reference: escape mode control downcase

* field_reference: more s/experimental/technical preview/

* field_reference: still more s/experimental/technical preview/

Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
2022-05-24 07:48:47 -07:00
Ry Biesemeyer
e6520cfee1
deps: pin rufus-scheduler to 3.7.x (#14150)
* deps: pin rufus scheduler to 3.7.x

* duplicate add_runtime_dependency 'rufus-scheduler'

Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
2022-05-24 00:50:41 -07:00
Mashhur
15dd1babf0
Simplifying HTTP basic password policy. (#14105)
* Simplifying HTTP basic password policy.
2022-05-23 21:11:10 -07:00