* specs: detangle out-of-band pipeline initialization
Our API tests were initializing their pipelines-to-test in an out-of-band
manner that prevented the agent from having complete knowledge of the
pipelines that were running. By providing a ConfigSource to our Agent's
SourceLoader, we can rely on the normal pipeline reload behaviour to ensure
that the agent fully-manages the pipelines in question.
* api: do not emit pipeline that is not fully-initialized
- Adds a new method to the public API interface
- Pass the call through the JavaFilterDelegatorExt
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* flow metrics: extract to interface, sharable-common base, and implementation
In preparation for landing an additional implementation of FlowMetric, we
shuffle the current parts net-unchanged to provide interfaces for `FlowMetric`
and `FlowCapture`, along with a sharable-common `BaseFlowMetric`, and move
our initial implementation to a new `SimpleFlowMetric`, accessible only
through a static factory method on our new `FlowMetric` interface.
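A minimal sketch of the resulting shape (method signatures and the stubbed types are assumptions about the shape, not the actual API):
```java
import java.util.Map;

// Stub standing in for the existing metric type, to keep the sketch self-contained.
interface Metric<T extends Number> { T getValue(); }

public interface FlowMetric {
    void capture();                  // record a new FlowCapture from the underlying metrics
    Map<String, Double> getValue();  // rates keyed by window name, e.g. "current", "lifetime"

    // the only public way to obtain an implementation; SimpleFlowMetric stays package-private
    static FlowMetric create(final String name,
                             final Metric<? extends Number> numerator,
                             final Metric<? extends Number> denominator) {
        return new SimpleFlowMetric(name, numerator, denominator);
    }
}

final class SimpleFlowMetric implements FlowMetric {
    SimpleFlowMetric(final String name,
                     final Metric<? extends Number> numerator,
                     final Metric<? extends Number> denominator) { /* elided */ }
    @Override public void capture() { /* elided */ }
    @Override public Map<String, Double> getValue() { return Map.of(); }
}
```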
* flow-rates: refactor LIFETIME up to sharable base
* util: add SetOnceReference
* flow metrics: tolerate unavailable captures
While the metrics we capture in the initial release of FlowMetrics
are all backed by `Metric<T extends Number>` whose values are non-null,
we will need to capture from nullable `Gauge<Number>` in order to
support persistent queue size and capacity metrics. This refactor uses
the newly-introduced `SetOnceReference` to defer our baseline lifetime
capture until one is available, and ensures `BaseFlowMetric#doCapture`
creates a capture if-and-only-if non-null values are available from
the provided metrics.
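A minimal sketch of what such a write-once holder can look like (method names here are illustrative guesses at the shape, not the actual API):
```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Write-once holder: the first non-null offer wins; later offers are no-ops.
public final class SetOnceReference<T> {
    private final AtomicReference<T> ref = new AtomicReference<>();

    public boolean offer(final T value) {
        return value != null && ref.compareAndSet(null, value);
    }

    // defer computing the candidate unless the reference is still unset; under a
    // race both suppliers may run, but only one result is ever published
    public boolean offerIfAbsent(final Supplier<? extends T> supplier) {
        return ref.get() == null && offer(supplier.get());
    }

    public Optional<T> asOptional() {
        return Optional.ofNullable(ref.get());
    }
}
```
`BaseFlowMetric` can then keep its lifetime baseline in a `SetOnceReference<FlowCapture>` and offer a capture only once both underlying metric values are non-null.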
* flow rates: limit precision for readability
* flow metrics: introduce policy-driven extended windows implementation
The new ExtendedFlowMetric is an alternate implementation of the FlowMetric
introduced in Logstash 8.5.0 that is capable of producing windows for a set of
policies, which dictate the desired retention for the rate along with a
desired resolution.
- `current`: 10s retention, 1s resolution [*]
- `last_1_minute`: one minute retention, at 3s resolution [*]
- `last_5_minutes`: five minutes retention, at 15s resolution
- `last_15_minutes`: fifteen minutes retention, at 30s resolution
- `last_1_hour`: one hour retention, at 60s resolution
- `last_24_hours`: one day retention at 15 minute resolution
A given series may report a range for slightly longer than its configured
retention period, up to either the series' configured resolution or
our capture rate (currently ~5s), whichever is greater. This approach
allows us to retain sufficient data-points to present meaningful rolling
averages while ensuring that our memory footprint is bounded.
When recording these captures, we first stage the newest capture, and then
promote the previously-staged capture to the tail of a linked list IFF
the gap between our new capture and the newest promoted capture is larger
than our desired resolution.
When _reading_ these rates, we compact the head of that linked list forward
in time as far as possible without crossing the desired retention barrier,
at which point the head points to the youngest record that is old enough
to satisfy the period for the series.
We also occasionally compact the head during writes, but only if the head
is significantly out-of-date relative to the allowed retention.
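A rough, synchronization-elided sketch of one series' write and read paths as described above (all names are illustrative):
```java
import java.util.LinkedList;
import java.util.concurrent.atomic.AtomicReference;

// Sketch of one policy's series; the real ExtendedFlowMetric keeps one per policy.
final class CaptureSeries {
    static final class Capture {
        final long nanoTime; final double value;
        Capture(final long nanoTime, final double value) { this.nanoTime = nanoTime; this.value = value; }
    }

    private final long resolutionNanos;
    private final long retentionNanos;
    private final LinkedList<Capture> promoted = new LinkedList<>();
    private final AtomicReference<Capture> stage = new AtomicReference<>();

    CaptureSeries(final long resolutionNanos, final long retentionNanos) {
        this.resolutionNanos = resolutionNanos;
        this.retentionNanos = retentionNanos;
    }

    // write path: stage the newest capture, then promote the previously-staged capture
    // to the tail IFF the gap between the new capture and the newest promoted capture
    // exceeds the desired resolution
    void append(final Capture newest) {
        final Capture staged = stage.getAndSet(newest);
        if (staged != null && (promoted.isEmpty()
                || newest.nanoTime - promoted.getLast().nanoTime > resolutionNanos)) {
            promoted.addLast(staged);
        }
    }

    // read path: compact the head forward while the NEXT entry is still old enough to
    // satisfy retention, leaving the head as the youngest sufficiently-old capture
    Capture baseline(final long nowNanos) {
        while (promoted.size() > 1 && nowNanos - promoted.get(1).nanoTime >= retentionNanos) {
            promoted.removeFirst();
        }
        return promoted.peekFirst();
    }
}
```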
As implemented here, these extended flow rates are on by default, but can be
disabled by setting the JVM system property `-Dlogstash.flowMetric=simple`
* flow metrics: provide lazy-initialized implementation
* flow metrics: append lifetime baseline if available during init
* flow metric tests: continuously monitor combined capture count
* collection of unrelated minor code-review fixes
* collection of even more unrelated minor code-review fixes
* refactor: pull members up from JavaBasePipelineExt to AbstractPipelineExt
* refactor: make `LogStash::JavaPipeline` inherit directly from `AbstractPipeline`
* Flow metrics: initial implementation (#14509)
* metrics: eliminate race condition when registering metrics
Ensure our fast-lookup and store tables cannot diverge in a race condition
by wrapping mutation of both in a single mutex and appropriately handle
another thread winning the race to the lock by using the value that it
persisted instead of writing our own.
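A minimal sketch of that locking scheme (table and type names are illustrative):
```java
import java.util.HashMap;
import java.util.Map;

// Sketch: both tables are mutated only inside one synchronized block, and a thread
// that loses the race adopts the value the winner persisted instead of overwriting it.
final class MetricStore {
    private final Object lock = new Object();
    private final Map<String, Object> fastLookup = new HashMap<>();
    private final Map<String, Object> store = new HashMap<>();

    Object fetchOrStore(final String key, final Object candidate) {
        synchronized (lock) {
            final Object existing = fastLookup.get(key);
            if (existing != null) {
                return existing;       // another thread won the race; use its value
            }
            fastLookup.put(key, candidate);
            store.put(key, candidate); // the two tables can no longer diverge
            return candidate;
        }
    }
}
```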
* metrics: guard against intermediate namespace conflicts
- ensures our safeguard that prevents using an existing metric as a namespace
is applied to _intermediate_ nodes, not just the tail-node, eliminating a
potential crash when sending `fetch_or_store` to a metric object that is not
expected to respond to `fetch_or_store`.
- uses the atomic `Concurrent::Map#compute_if_absent` instead of the
  non-atomic `Concurrent::Map#fetch_or_store`, which is prone to
  last-write-wins during contention (as-written, this method is only
  executed under lock and not subject to contention); a Java analog of
  the difference is sketched after this list
- uses `Enumerable#reduce` to eliminate the need for recursion
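The actual code uses Ruby's `Concurrent::Map`; here is the same atomicity difference sketched with its Java counterpart, `ConcurrentHashMap` (the `Node` type is illustrative):
```java
import java.util.concurrent.ConcurrentHashMap;

final class NamespaceNodes {
    static final class Node { final String name; Node(final String name) { this.name = name; } }

    private final ConcurrentHashMap<String, Node> nodes = new ConcurrentHashMap<>();

    // fetch_or_store-style check-then-act: two threads can both observe the key as
    // absent, both create a Node, and the last write wins under contention
    Node fetchOrStoreRacy(final String key) {
        Node node = nodes.get(key);
        if (node == null) {
            node = new Node(key);
            nodes.put(key, node);
        }
        return node;
    }

    // compute_if_absent-style atomic variant: at most one Node is ever published per key
    Node computeIfAbsentAtomic(final String key) {
        return nodes.computeIfAbsent(key, Node::new);
    }
}
```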
* flow: introduce auto-advancing UptimeMetric
* flow: introduce FlowMetric with minimal current/lifetime rates
* flow: initialize pipeline metrics at pipeline start
* Controller and service layer implementation for flow metrics. (#14514)
* Controller and service layer implementation for flow metrics.
* Add flow metrics to unit test and benchmark cli definitions.
* flow: fix tests for metric types to accommodate new one
* Renaming concurrency and backpressure metrics.
Rename `concurrency` to `worker_concurrency` and `backpressure` to `queue_backpressure` to provide proper scope naming.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* metric: register flow metrics only when we have a collector (#14529)
the collector is absent when the pipeline is run in test with a
NullMetricExt, or when the pipeline is explicitly configured to
not collect metrics using `metric.collect: false`.
* Unit tests and integration tests added for flow metrics. (#14527)
* Unit tests and integration tests added for flow metrics.
* Node stat spec and pipeline spec metric updates.
* Metric keys statically imported, implicit error expectation added in metric spec.
* Fix node status API spec after renaming flow metrics.
* Removing flow metric from PipelinesInfo DS (used in periodic metric snapshot), integration QA updates.
* Rebasing with feature branch.
* Apply suggestions from code review
Integration tests updated to test capturing the flow metrics.
* Flow metrics expectation updated in integration tests.
* flow: refine integration expectations for reloads/monitoring
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
Co-authored-by: Mashhur <mashhur.sattorov@gmail.com>
* metric: add ScaledView with sub-unit precision to UptimeMetric (#14525)
* metric: add ScaledView with sub-unit precision to UptimeMetric
By presenting a _view_ of our metric that maintains sub-unit precision,
we prevent jitter that can be caused by our periodic poller not running at
exactly our configured cadence.
This is especially important as the UptimeMetric is used as the _denominator_ of
several flow metrics, and a capture at 4.999s that truncates to 4s causes the
rate to be over-reported by ~25%.
The `UptimeMetric.ScaledView` implements `Metric<Number>`, so its full
lossless `BigDecimal` value is accessible to our `FlowMetric` at query time.
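A standalone sketch of such a scaled view (the real `UptimeMetric.ScaledView` implements `Metric<Number>`; names here are illustrative):
```java
import java.math.BigDecimal;
import java.util.function.LongSupplier;

// A seconds-view over a milliseconds-based uptime that keeps sub-unit precision.
final class ScaledSecondsView {
    private final LongSupplier uptimeMillis;

    ScaledSecondsView(final LongSupplier uptimeMillis) { this.uptimeMillis = uptimeMillis; }

    BigDecimal getValue() {
        // 4999ms -> 4.999s instead of truncating to 4s; as the denominator of a
        // rate, that truncation would over-report the rate by ~25%
        return BigDecimal.valueOf(uptimeMillis.getAsLong(), 3);
    }
}
```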
* metrics: reduce window for too-frequent-captures bug and document it
* fixup: provide mocked clock to flow metric
* Flow metrics cleanup (#14535)
* flow metrics: code-style and readability pass
* remove unused imports
* cleanup: simplify usage of internal helpers
* flow: migrate internals to use OptionalDouble
* Flow metrics global (#14539)
* flow: add global top-level flows
* docs: add flow metrics
* Top level flow metrics unit tests added. (#14540)
* Top level flow metrics unit tests added.
* Add unit tests when config reloads, make sure top-level flow metrics didn't get reset.
* Apply suggestions from code review
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Validating against Hash test cases updated.
* For the safety check against exact type in unit tests.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* docs: section links and clarity in node stats API flow metrics
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: Mashhur <mashhur.sattorov@gmail.com>
Remove support for DES-CBC3-SHA.
Also removes unnecessary exclusions for EDH-DSS-DES-CBC3-SHA, EDH-RSA-DES-CBC3-SHA and KRB5-DES-CBC3-SHA since there's already a "!DES" rule.
SettingsImpl.checkpointRetry is hardcoded to false in the builder. Prior to this change, users were unable to set `queue.checkpoint.retry` to true to enable the Windows retry on PQ AccessDeniedException during checkpointing.
Fixed: #14486
Update the following java dependencies:
* org.reflections:reflections
* commons-codec:commons-codec
* com.google.guava:guava
* com.google.googlejavaformat:google-java-format
* org.javassist:javassist
The goal of these updates is to not fall behind and avoid surprises when an upgrade is necessary due to a security issue.
Prior to this commit, the java filter delegator would return "java" to the
`#threadsafe?` call when determining whether a filter is considered thread safe.
This is correct for the _outputs_, where this prefix is used to construct the
appropriate delegator object, but not for filters, where a simple true/false is
required. This commit replaces the `getConcurrency` call with an `isThreadsafe` call
to make it clearer that a boolean result is required.
This test has been broken on Windows since the jruby 9.3 update, and is
similar to an issue we saw when referencing other classes in the
org.logstash.util namespace.
The module is broken with the current version. The type needs to be changed from `syslog` to `_doc` to fix the issue.
* remove dangling setting and add arcsight index suffixes
* add tests for new suffix in arcsight module
Co-authored-by: Tobias Schröer <tobias@schroeer.ch>
Opens the ability to use Ruby codecs inside Java plugins.
Java plugins need subclasses of the Java `co.elastic.logstash.api.Codec` class to work properly. This PR implements an adapter for Ruby codecs to be wrapped into a subclass of Java's Codec class.
Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
Increments the counters of consumed segments and consumed events every time a segment is completely read from the reader side.
Publishes the collected data to the input plugin through the newly added method `segmentsDeleted(numberOfSegments, numberOfEvents)` on `SegmentListener`.
Bugfix for the DLQ when a writer and a reader with the clean_consumed feature are working on the same queue.
The writer side keeps the size of the DLQ in an in-memory variable, and when it is exceeded it applies the storage policy, essentially dropping older or newer events.
If on the opposite side of the DLQ there is a reader with the clean_consumed feature enabled that deletes the segments it processes, then the writer side could have inconsistent information about the DLQ size.
This commit adds a filesystem check when the "DLQ full" condition is reached, to avoid considering a false positive as valid data.
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
Adds the metric `expired_events` to node stats to track the number of events removed by the age-retention pruning policy.
In the _node/stats API, under the path `pipelines.<pipeline id>.dead_letter_queue`, the document is enriched with the new metric.
The number of expired events is incremented every time a segment file is deleted; this happens by opening the file and counting the `complete` and `start` records.
Updates the DLQ writer's writeEvent method to clean tail segments older than the retention period. This happens only if the `dead_letter_queue.retain.age` setting is configured.
To read the age of a segment, it extracts the timestamp of the last (youngest) message in the segment.
The age is defined as a number followed by one of d, h, m, or s, standing for days, hours, minutes, and seconds; if no suffix is given, seconds are assumed as the default unit.
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
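A sketch of a parser for the age format described above (the helper name and its location are illustrative, not the actual implementation):
```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RetentionAge {
    private static final Pattern AGE = Pattern.compile("(\\d+)\\s*([dhms]?)");

    // "5d" -> 5 days, "12h" -> 12 hours, "30" -> 30 seconds (seconds are the default)
    static Duration parse(final String age) {
        final Matcher m = AGE.matcher(age.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("invalid age: " + age);
        }
        final long amount = Long.parseLong(m.group(1));
        switch (m.group(2)) {
            case "d": return Duration.ofDays(amount);
            case "h": return Duration.ofHours(amount);
            case "m": return Duration.ofMinutes(amount);
            default:  return Duration.ofSeconds(amount); // "s" or no suffix
        }
    }
}
```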
Uses the FileLock mechanism, already used in the PQ, to enforce the single-consumer pattern when clean_consumed is enabled. This avoids the concurrency problem of multiple readers trying to delete each other's consumed tail segments.
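A minimal sketch of that single-consumer guard (the lock-file name and error handling are assumptions):
```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ReaderLock {
    // Take an exclusive lock on a marker file in the DLQ directory; a second
    // clean_consumed reader fails fast instead of racing to delete segments.
    static FileLock obtain(final Path dlqDir) throws IOException {
        final FileChannel channel = FileChannel.open(
                dlqDir.resolve(".lock"), StandardOpenOption.CREATE, StandardOpenOption.WRITE);
        final FileLock lock = channel.tryLock(); // null when another process holds it
        if (lock == null) {
            channel.close();
            throw new IllegalStateException("DLQ already locked by another clean_consumed reader");
        }
        return lock;
    }
}
```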
* Timestamp#toString(): ensure a minimum of 3 decimal places
Logstash 8 introduced internals for nano-precise timestamps, and began relying
on `java.time.format.DateTimeFormatter#ISO_INSTANT` to produce ISO8601-format
strings via `java.time.Instant#toString()`, resulting in a _variable length_
serialization that only includes sub-second digits when the Instant represents
a moment in time with fractional seconds.
By ensuring that a timestamp's serialization has a minimum of 3 decimal places,
we ensure that our output is backward-compatible with the equivalent timestamp
produced by Logstash 7.
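A sketch of the minimum-3-decimal-places rule (the helper name is illustrative):
```java
import java.time.Instant;

final class Timestamps {
    // Instant#toString() (ISO_INSTANT) emits fractional digits in groups of three, or
    // none at all for whole seconds; splicing in ".000" restores Logstash 7-style output.
    static String toIso8601(final Instant instant) {
        final String iso = instant.toString();
        return iso.indexOf('.') >= 0 ? iso : iso.substring(0, iso.length() - 1) + ".000Z";
    }
}
```
For example, `Timestamps.toIso8601(Instant.ofEpochSecond(5))` yields `1970-01-01T00:00:05.000Z` rather than `1970-01-01T00:00:05Z`.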
* timestamp serialization-related specs fixup
When an external repository has an IO error or generates network timeouts, the build fails to resolve external dependencies (plugins or libraries used by the project).
This commit increases those limits a little.
* Logstash checks for a different filesystem when a pipeline's queue path is a symlink to another filesystem.
* Apply suggestions from code review
* FileAlreadyExistsException case handling when queue path is symlinked.
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
This commit replaces the use of a block with a lambda as an argument for Stream.forEach.
This is to work around the jruby issue identified in https://github.com/jruby/jruby/issues/7246.
This commit also updates the multiple_pipeline_spec test case for pipeline->pipeline
communication to trigger the issue - it only occurs with Streams containing more than one event.
This commit changes the behavior of PQ size checking.
When it checks the size usage, instead of throwing an exception that stops the pipeline,
it logs a warning message on every converge state if the check fails.
Fixed: #14257
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
With JDK 18, Javac lint checking was expanded to raise an error on serializable subclasses that contain instance fields which are not themselves serializable.
This could be solved by suppressing the warning or by marking the field as transient. The majority of classes with this problem inherit transitively from Serializable but are not intended to be serialized (via Java's serialization mechanism); their serialVersionUID = 1L is just a fix to keep the linter happy, and they are not effectively used in a serialization context.
This commit also removes a finalize method that could safely be removed.
Introduces a method (markForDelete) in the DeadLetterQueueReader class as a way to acknowledge when events read by the input plugin can be considered eligible for deletion. This mechanism is used to delete the tail segments once they are completely consumed.
It also exposes a listener interface that the client of this class registers in order to be notified when the deletion of a segment has happened. This is useful for the DLQ input plugin to understand when it can flush the current position to the sinceDB.
This commit also deletes any segment files present that are older than the current position; this happens during the setCurrentReaderAndPosition execution.
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
Proposes a code cleanup of the FileLockFactory.obtainLock method. It removes the nesting of ifs by using precondition-style checks, lifting the error conditions to the top of the code flow.
It also removes an exception throw from the clean-up finally block: an exception raised from that point would hide the original cause of the problem.
This commit switches the finally block used in obtaining a file lock to a catch and re-throw.
Since JRuby 9.3, a useful inspect is provided out of the box:
LS' inspect: <Java::JavaUtil::ArrayList:3536147 ["some"]>
JRuby 9.3's: "#<Java::JavaUtil::ArrayList: [\"some\"]>"
* Refactor: require treetop/runtime - avoids loading polyglot
* Build: instruct Bundler not to auto-load polyglot/treetop
+ Build: these deps are properly required as needed
all of them only used in one place (outside of normal bootstrap)
In jackson-databind 2.10, enabling Default Typing requires having a type validator, and while there's an "allow all" validator called LaissezFaireSubTypeValidator, this commit also tightens the validation a bit by narrowing down the allowed classes.
The default typing validator is only applied to the ObjectMapper for CBOR, which is used in the DLQ, leaving the one for JSON as-is.
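A sketch of what the narrowed setup can look like under jackson-databind 2.10+ (the allow-list entries below are illustrative, not the actual ones):
```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.jsontype.BasicPolymorphicTypeValidator;
import com.fasterxml.jackson.dataformat.cbor.databind.CBORMapper;

final class DlqMappers {
    // Narrowed validator instead of the "allow all" LaissezFaireSubTypeValidator.
    static ObjectMapper cborMapper() {
        final BasicPolymorphicTypeValidator ptv = BasicPolymorphicTypeValidator.builder()
                .allowIfSubType("org.logstash.")   // hypothetical prefix for our own classes
                .allowIfSubType("java.util.")
                .build();
        return CBORMapper.builder()
                .activateDefaultTyping(ptv, ObjectMapper.DefaultTyping.NON_FINAL)
                .build();
    }
}
```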
Other changes:
* make ingest-converter use versions.yml for jackson-databind
* update jrjackson
* Fix deprecation logging of password policy.
Gives users guidance not only on upgrading, but also on keeping the current behavior if they really want to.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
This commit adds a checkpoint for a fully acked page before purging to keep the checkpoint up-to-date
Fixed: #6592
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
* add failing tests for Event.new with fields that look like field references
* fix: correctly handle FieldReference-special characters in field names.
Keys passed to most methods of `ConvertedMap`, which is based on `IdentityHashMap`,
depend on identity and not equivalence, and therefore rely on the keys being
_interned_ strings. In order to avoid hitting the JVM's global String intern
pool (which can have performance problems), operations to normalize a string
to its interned counterpart have traditionally relied on the behaviour of
`FieldReference#from` returning a likely-cached `FieldReference`, that had
an interned `key` and an empty `path`.
This is problematic in two ways.
First, when `ConvertedMap` was given data with keys that _were_ valid string
field references representing a nested field (such as `[host][geo][location]`),
the implementation of `ConvertedMap#put` effectively silently discarded the
path components because it assumed them to be empty, and only the key was
kept (`location`).
Second, when `ConvertedMap` was given a map whose keys contained what the
field reference parser considered special characters but _were NOT_
valid field references, the resulting `FieldReference.IllegalSyntaxException`
caused the operation to abort.
Instead of using the `FieldReference` cache, which sits on top of objects whose
`key` and `path`-components are known to have been interned, we introduce an
internment helper on our `ConvertedMap` that is also backed by the global string
intern pool, and ensure that our field references are primed through this pool.
In addition to fixing the `ConvertedMap#newFromMap` functionality, this has
three net effects:
- Our ConvertedMap operations still use strings
from the global intern pool
- We have a new, smaller cache of individual field
names, improving lookup performance
- Our FieldReference cache no longer is flooded
with fragments and therefore is more likely to
remain performant
NOTE: this does NOT create isolated intern pools, as doing so would require
a careful audit of the possible code-paths to `ConvertedMap#putInterned`.
The new cache is limited to 10k strings, and when more are used only
the FIRST 10k strings will be primed into the cache, leaving the
remainder to always hit the global String intern pool.
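A sketch of such a bounded internment helper (names are illustrative):
```java
import java.util.concurrent.ConcurrentHashMap;

// A small cache in front of the global String intern pool, capped so that memory
// stays bounded; overflow simply falls through to the global pool.
final class StringInterner {
    private static final int CACHE_LIMIT = 10_000;
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>(CACHE_LIMIT);

    String intern(final String candidate) {
        final String cached = cache.get(candidate);
        if (cached != null) { return cached; }
        final String interned = candidate.intern(); // still backed by the global pool
        if (cache.size() < CACHE_LIMIT) {           // only the first 10k names get primed
            cache.putIfAbsent(interned, interned);
        }
        return interned;
    }
}
```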
NOTE: by fixing this bug, we allow events to be created whose fields _CANNOT_
be referenced with the existing FieldReference implementation.
Resolves: https://github.com/elastic/logstash/issues/13606
Resolves: https://github.com/elastic/logstash/issues/11608
* field_reference: support escape sequences
Adds a `config.field_reference.escape_style` option and a companion
command-line flag `--field-reference-escape-style` allowing a user
to opt into one of two proposed escape-sequence implementations for field
reference parsing:
- `PERCENT`: URI-style `%`+`HH` hexadecimal encoding of UTF-8 bytes
- `AMPERSAND`: HTML-style `&#`+`DD`+`;` encoding of decimal Unicode code-points
The default is `NONE`, which does _not_ process escape sequences.
With this setting a user effectively cannot reference a field whose name
contains FieldReference-reserved characters.
| ESCAPE STYLE | `[` | `]` |
| ------------ | ------- | ------- |
| `NONE` | _N/A_ | _N/A_ |
| `PERCENT` | `%5B` | `%5D` |
| `AMPERSAND` | `&#91;` | `&#93;` |
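For illustration, a sketch of the `PERCENT` style applied to the reserved characters (the helper name is hypothetical):
```java
import java.nio.charset.StandardCharsets;

final class FieldReferenceEscapes {
    // %HH-encode the UTF-8 bytes of the reserved characters, e.g.
    // percentEscape("host[name]") -> "host%5Bname%5D"
    static String percentEscape(final String fieldName) {
        final StringBuilder out = new StringBuilder(fieldName.length());
        for (final byte b : fieldName.getBytes(StandardCharsets.UTF_8)) {
            if (b == '[' || b == ']') {
                out.append(String.format("%%%02X", b));
            } else {
                out.append((char) (b & 0xFF)); // sketch-grade; adequate for ASCII names
            }
        }
        return out.toString();
    }
}
```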
* fixup: no need to double-escape HTML-ish escape sequences in docs
* Apply suggestions from code review
Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
* field-reference: load escape style in runner
* docs: sentences over semicolons
* field-reference: faster shortcut for PERCENT escape mode
* field-reference: escape mode control downcase
* field_reference: more s/experimental/technical preview/
* field_reference: still more s/experimental/technical preview/
Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
* Add support for ca_trusted_fingerprint in Apache HTTP and Manticore
Adds a module `LogStash::Plugins::CATrustedFingerprintSupport`, which can be
included in a plugin class to add a `ca_trusted_fingerprint` option to create
an Apache SSL TrustStrategy that can be used to bypass the TrustManager when
a matching certificate is found on the chain.
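A sketch of such a TrustStrategy (class and method names are illustrative; the real module wires this into the Apache HTTP and Manticore client options):
```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.cert.CertificateEncodingException;
import java.security.cert.X509Certificate;
import java.util.Locale;
import org.apache.http.ssl.TrustStrategy;

final class CATrustedFingerprint {
    // Trust the chain outright when any certificate's SHA-256 fingerprint matches the
    // configured value; returning false defers to the normal TrustManager checks.
    static TrustStrategy forFingerprint(final String hexFingerprint) {
        final String expected = hexFingerprint.replace(":", "").toLowerCase(Locale.ROOT);
        return (X509Certificate[] chain, String authType) -> {
            for (final X509Certificate certificate : chain) {
                if (expected.equals(sha256Hex(certificate))) {
                    return true;
                }
            }
            return false;
        };
    }

    private static String sha256Hex(final X509Certificate certificate) {
        try {
            final byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(certificate.getEncoded());
            final StringBuilder hex = new StringBuilder(digest.length * 2);
            for (final byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException | CertificateEncodingException e) {
            throw new IllegalStateException(e); // SHA-256 is a mandatory JCA algorithm
        }
    }
}
```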
When the PQ is full, workers wake up the writer thread on every read.
However, without removing a fully acked page, the queue is still full.
This commit changes the condition of the notFull signal.
Fixed: #6801
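A synchronization sketch of the revised wake-up condition (all names are illustrative, not the actual PQ internals):
```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

final class PageBook {
    static final class Page { boolean fullyAcked; }

    private final int maxPages;
    private final Deque<Page> pages = new ArrayDeque<>();
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notFull = lock.newCondition();

    PageBook(final int maxPages) { this.maxPages = maxPages; }

    void onAck(final Page page) {
        lock.lock();
        try {
            page.fullyAcked = true;
            boolean removed = false;
            while (!pages.isEmpty() && pages.peekFirst().fullyAcked) {
                pages.removeFirst();     // purge fully-acked head pages
                removed = true;
            }
            if (removed && pages.size() < maxPages) {
                notFull.signalAll();     // previously signalled on every read,
            }                            // even while the queue was still full
        } finally {
            lock.unlock();
        }
    }
}
```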
This commit updates the version of jruby used in Logstash to `9.3.4.0`.
* Updates the references of `jruby` from `9.2.20.1` to `9.3.4.0`
* Updates references/locations of ruby from `2.5.0` to `2.6.0`
* Updates java imports including `org.logstash.util` to be quoted
* Without quoting the name of the import, the following error is observed in tests:
* `java.lang.NoClassDefFoundError: org/logstash/Util (wrong name: org/logstash/util)`
* Maybe an instance of https://github.com/jruby/jruby/issues/4861
* Adds a monkey patch to `require` to resolve a compatibility issue between the latest `jruby` and the `polyglot` gem
* The addition of https://github.com/jruby/jruby/pull/7145, which disallows circular
causes, throws when `polyglot` is thrown into the mix and stops logstash from
starting and building - any gems that use an exception to determine whether or not
to load the native gem will trigger the code added in that commit.
* This commit adds a monkey patch of `require` to roll back the circular cause exception
back to the original cause.
* Removes the use of the deprecated `JavaClass`
* Adds additional `require time` in `generate_build_metadata`
* Rewrites a test helper to avoid potentially calling `~>` on `FalseClass`
Co-authored-by: Joao Duarte <jsvduarte@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
This commit moves the JvmOptionParser into its own gradle project.
This enables the JvmOptionParser to remain compatible with Java 1.8 to present a helpful error message to a user attempting to start Logstash using older versions of Java, while allowing the main Logstash code base to freely use idiomatic Java 11 features.
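A sketch of the kind of Java 1.8-compatible guard this enables (the version constant and message are illustrative):
```java
// Compiled at Java 1.8 source level so it can run far enough on an old JVM to print a
// friendly error instead of dying with an UnsupportedClassVersionError.
public final class JvmVersionGate {
    public static void main(final String[] args) {
        final String spec = System.getProperty("java.specification.version"); // "1.8", "17", ...
        final double version = spec.startsWith("1.")
                ? Double.parseDouble(spec.substring(2))
                : Double.parseDouble(spec);
        if (version < 11) {
            System.err.println("Logstash requires Java 11 or newer; found " + spec);
            System.exit(1);
        }
    }
}
```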