* live timers: introduce API boundary
Introduces an API boundary for timers as a first-class metric, as described
in elastic/logstash#14675, and migrates all known internal timers to use the
new API boundary for tracked execution.
Please refer to the specification for details on motivations.
This commit is a net-zero change to behaviour, and introduces a single new
undocumented setting `metric.timers` to `logstash.yml`, which presently only
takes its default value `delayed` to indicate that delayed committing of
execution time is acceptable.
It implements the new `TimerMetric` API in a way that is also net-zero-change.
Tracked executions are still performed by marking a start time, performing
the tracked execution, and incrementing an underlying long-type counter with
the number of elapsed milliseconds _after_ execution has completed. This means
that long-running execution is still missing from the metric until it has
completed.
The new Timer API is available to both the Ruby- and the Java-based plugin APIs.
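The delayed-commit behaviour described above can be illustrated with a minimal Ruby sketch. `DelayedTimer`, its `time` method, and the injectable nanosecond clock are hypothetical names chosen for illustration; they are not the actual `TimerMetric` API, and the sketch is not thread-safe.

```ruby
# Hypothetical stand-in for a delayed-commit timer; NOT the real TimerMetric API.
class DelayedTimer
  attr_reader :total_millis

  # nano_clock is any callable returning a monotonic nanosecond reading.
  def initialize(nano_clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC, :nanosecond) })
    @nano_clock = nano_clock
    @total_millis = 0
  end

  # Runs the block and commits its elapsed time only after the block finishes.
  def time
    start = @nano_clock.call
    begin
      yield
    ensure
      elapsed_nanos = @nano_clock.call - start
      @total_millis += elapsed_nanos / 1_000_000 # whole milliseconds only
    end
  end
end

# A fake clock that reports 0 ns at start and 5,000,000 ns at completion.
ticks = [0, 5_000_000].each
timer = DelayedTimer.new(-> { ticks.next })

observed_during = nil
result = timer.time do
  observed_during = timer.total_millis # nothing has been committed yet
  :done
end
```

While the block is running, `timer.total_millis` has not yet been incremented, which is the gap that a live timer implementation would close.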
* timer metrics: sub-package and add baseline tests
* WIP: move execution metric ownership out of queue
* noop: remove useless abstract method
Our `AbstractMetric` implements `Metric` and does not need to declare
an abstract override of `Metric#getType`. Doing so prevents interfaces
from providing a default override for all implementers.
* timer metric tests: extract util, refactor for reuse
* timers: accumulate milli-excess-nanos
* live timers: single-checkpoint implementation
* timer metric: use explicit type parameters to make intent clear
* remove unused imports
* use safe int conversion
* test fixup: use given name for tested metric
* test helper: TimerMetricFactory prefers nanotime supplier
* timers: flesh out test coverage, incl live-timers
* test: move validation of queue-read metrics to ObservedExecution
* flow: support non-moving denominator (±infinity)
* metrics: add metric config pass-through to env2yaml
* Remove unnecessary IR graph logging.
* Add a guide for preventing sensitive info from leaking in the logs.
`CONTRIBUTING.md` now contains a guide to preventing sensitive info from leaking in the debug logs.
Fixes: #14778
Pull-request: #14779
* Add sample source code to `CONTRIBUTING.md`.
* Apply suggestions from code review
Contribution guide for using sensitive data in Logstash improved.
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
During stalled shutdowns while waiting for in-flight batches to complete,
our shutdown watcher emits helpful information about what work is in flight,
including the actual threads and plugins that are still executing.
Since ~6.3.0, the `inflight_count` metric in this log message has always
been `0`, in part because of two somewhat-overlapping bugs:
- elastic/logstash#8987 and elastic/logstash#9056 (7.0, 6.3) changed
the `inflight_batches` map provided by the queue read clients to index
batches by native thread id, but pipeline reporter continued to
attempt to extract by ruby thread object. Because it does not find
the thread in the "batch map", it reports zero.
- elastic/logstash#9111 (7.0, 6.3) changed the _value_ stored in
the `inflight_batches` map provided by a new common queue read client
from an object responding to `#size` to a java `QueueBatch` which
does not respond to `size`. If our pipeline reporter had been able to
look up the queue batch, it would have failed with a `NoMethodError`.
We resolve the issue by (1) extracting the batch from our "batch map" using
the native thread id and (2) safely extracting the value from a `QueueBatch`
before falling through to `Object#size` or 0.
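The two-step fix above might look roughly like the following sketch. `FakeQueueBatch`, its `filtered_size` method, and the `inflight_count` helper are hypothetical stand-ins used only to illustrate the lookup and fall-through order; they are not the actual pipeline reporter code or the real `QueueBatch` interface.

```ruby
# Hypothetical stand-in for a Java QueueBatch: it exposes a size-like
# accessor but does NOT respond to #size.
class FakeQueueBatch
  def initialize(events)
    @events = events
  end

  def filtered_size
    @events.size
  end
end

# Looks up the batch by native thread id, then extracts a count safely.
def inflight_count(batch_map, native_thread_id)
  batch = batch_map[native_thread_id]
  return 0 if batch.nil?                                           # thread has no batch in flight
  return batch.filtered_size if batch.respond_to?(:filtered_size)  # batch-specific accessor
  return batch.size if batch.respond_to?(:size)                    # generic fall-through
  0
end

batch_map = {
  101 => FakeQueueBatch.new([:a, :b, :c]), # keyed by native thread id
  102 => [:x]                              # any object responding to #size
}

counts = [101, 102, 999].map { |id| inflight_count(batch_map, id) }
```

Keying the lookup by the same identifier the queue read client uses is what keeps the reported count from silently collapsing to zero.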
This PR presents a getting started guide to working with Logstash in Kubernetes, and comes in three parts:
- A walkthrough of a sample scenario, setting up a Kubernetes cluster with
  - Elasticsearch
  - Kibana
  - Filebeat to read Kubernetes API server logs and send them to Elasticsearch via Logstash
  - Logstash
  - Metricbeat to monitor Logstash

  and walking through the setup of these files
- An annotated guide to these files
- Explanations of how Logstash configuration and settings files map to the Kubernetes environment
Relates #14576
Closes #14576
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Starting with Log4j2 2.6, if the subclass of MessageFactory associated with a Logger instance
is not also a subclass of MessageFactory2, it is wrapped with MessageFactory2Adapter.
This triggers a Log4j warning that the Logger is not associated with the default MessageFactory (LogstashMessageFactory).
The warning is noisy because it is reported every time a subclass of LogStash::Plugin, for example, is instantiated.
This commit adapts LogstashMessageFactory to implement MessageFactory2 instead of the older MessageFactory, avoiding the wrapping with the adapter class.
* source/multilocal: fix detection of empty pipelines.yml
Fixes a regression introduced in elastic/logstash#13883 in which the presence
of an empty `pipelines.yml` file produces an error message indicating that
the file cannot be read.
When either `YAML::load` or `YAML::safe_load` encounter an effectively-empty
payload (such as one that is entirely comments), they use a `fallback` param
to determine what value to emit, with the former emitting `false` and the
latter emitting `nil`.
This is problematic because a _separate_ blind-`rescue nil` causes `nil` to
be bound to the MultiLocal's `@detected_marker`, and we assume that a `nil`
value in the marker means that there was an exception reading the file (such
as a permissions issue or parse failure).
By providing a `fallback: false` directive when parsing the contents, we
ensure that an empty file is reported as such.
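As a small illustration of the `fallback` behaviour described above, the following sketch parses a comments-only payload with and without an explicit `fallback: false`. It sticks to `YAML.safe_load`, and the `classify` helper is a hypothetical simplification of how a `nil` or `false` marker could be interpreted; it is not the MultiLocal implementation.

```ruby
require 'yaml'

# An effectively-empty payload: nothing but comments.
comments_only = "# no pipelines defined yet\n"

default_result  = YAML.safe_load(comments_only)                  # uses safe_load's default fallback
explicit_result = YAML.safe_load(comments_only, fallback: false) # empty payload reported as false

# Hypothetical classifier mirroring the distinction described above:
# nil is treated as "could not read", false as "file was empty".
def classify(parsed)
  case parsed
  when nil   then :read_error
  when false then :empty_file
  else            :pipelines_found
  end
end

default_classification  = classify(default_result)
explicit_classification = classify(explicit_result)
```

With the default fallback, the empty file is indistinguishable from a read failure; passing `fallback: false` keeps the two cases apart.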
* source/multilocal: avoid `rescue nil` that loses helpful context
When the pipelines yaml cannot be read, or can be read but fails to parse,
the MultiLocal#read_pipelines_from_yaml emits a helpful exception including
specifics about why it failed to load or parse, but a blind `rescue nil`
here causes that helpful information to be lost.
When pipeline detection is exceptional, hold onto the helpful exception
so that it can be reported along with the config conflicts.
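The difference between a blind `rescue nil` and holding onto the exception can be sketched as follows. The `read_pipelines`, `detect_blind`, and `detect_with_context` helpers, the error message wording, and the example path are all hypothetical; they only illustrate the pattern and do not reproduce the MultiLocal code.

```ruby
# Hypothetical reader that raises a descriptive error when the file cannot be read.
def read_pipelines(path)
  File.read(path)
rescue SystemCallError => e
  raise "Failed to read pipelines file at #{path} (#{e.class})"
end

# Blind rescue: the caller only ever sees nil, so the reason is lost.
def detect_blind(path)
  read_pipelines(path) rescue nil
end

# Context-preserving variant: keep the exception so it can be reported later.
def detect_with_context(path)
  { value: read_pipelines(path), error: nil }
rescue => e
  { value: nil, error: e }
end

missing = "/nonexistent-example-dir/pipelines.yml" # assumed not to exist

blind   = detect_blind(missing)
context = detect_with_context(missing)
```

Here `blind` is simply `nil`, while `context[:error]` still carries the message explaining why the read failed.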
* source/multilocal: differentiate between reading and parsing failure
* source/multilocal: use translations for conflict messages
* source/multilocal: specs for error conditions
doc for k8s troubleshooting and common issues
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>
Ensures the DRA build script surfaces a rake error, instead of allowing the build to continue.
This ensures that the build doesn't continue if any of the steps fails.
Co-authored-by: Rob Bavey <rob.bavey@elastic.co>
Version 7.17 doesn't generate Darwin aarch64 artifacts. Don't download these artifacts from the GCS bucket, given that we don't build Darwin for that release.
Exclude JRuby's bundler and rake from the built artifacts. The artifacts don't need to ship with these dependencies. Logstash bundles its own bundler for plugin management, which is not the one shipped with JRuby.
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Fixes the sourcing of dra_common.sh. It now first checks the directory of the file from which dra_common.sh is being sourced. This allows the common script to be sourced regardless of where the sourcing script is called from.
The changes remove some code duplication by introducing a common file that can be sourced between all scripts. It also improves debuggability by adding better messages.
* Collect growth events and bytes metrics if PQ is enabled: Java changes.
* Move queue flow under queue namespace.
* Pipeline level PQ flow metrics: add unit & integration tests.
* Include queue info in node stats sample.
* Apply suggestions from code review
Change uptime precision for PQ growth metrics to uptime seconds since PQ events are based on seconds.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Add safeguard when using lazy delegating gauge type.
* flow metrics: simplify generics of lazy implementation
Enables interface `FlowMetrics::create` to take suppliers that _implement_
a `Metric<? extends Number>` instead of requiring them to be pre-cast, and
avoid unnecessary exposure of the metrics value-type into our lazy init.
* flow metrics: use lazy init for PQ gauge-based metrics
* noop: use enum equality
Avoids routing two enum values through `MetricType#toString()`
and `String#equals()` when they can be compared directly.
* Apply suggestions from code review
Optional.ofNullable used for safe return. Doc includes real tested expected metric values.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* flow metrics: make lazy-init wrapper inherit from AbstractMetric
This allows the Jackson serialization annotations to work.
* flow metrics: move pipeline queue-based flows into pipeline flow namespace
* Follow up for moving PQ growth metrics under pipeline.*.flow.
- Unit and integration tests are added or fixed.
- Documentation added along with sample response data
* flow: pipeline pq flow rates docs
* Do not expect flow in the queue section of API. Metrics moved to flow section.
Update logstash-core/spec/logstash/api/commands/stats_spec.rb
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Integration test failure fix.
Mistake: `flow_status` should be `pipeline_flow_stats`
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Integration test failures fix.
Number should be Numeric in the ruby specs.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Make CI happy.
* api specs: use PQ only where needed
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
* DRA: Handle env variables better
* Moved the addition of SNAPSHOT suffix to the version after the VERSION_QUALIFIER
* Fix badly assigned variable: the version qualifier must also be appended to PLAIN_STACK_VERSION, not RELEASE_VER
Co-authored-by: andsel <selva.andre@gmail.com>