* fix: restore support for unicode pipeline- and plugin-id's
JRuby's `Ruby#newSymbol(String)` throws an exception when provided a `String`
that contains characters outside of lower-ASCII because JRuby internals expect
"the incoming String to be one of our mangled ISO-8859-1 strings" as noted in
a comment on jruby/jruby#6217.
Instead, we use `Ruby#newString(String)` to create a new `RubyString` (which
works properly), and then rely on `RubyString#intern` to get our `RubySymbol`.
This fixes a regression introduced in the 8.7 series in which pipeline id's
are consistently represented as ruby symbols in the metrics store, and ensures
similar issue does not exist when specifying a plugin id that contains
characters above the lower-ASCII plane.
* fix: use properly-encoded RubySymbol in PipelineConfig
We cannot rely on `RubySymbol#toString` to produce a properly-encoded `String`
whe the string contains characters above the lower-ASCII plane because the
result is effectively a binary ruby-internal marshal of the bytes that only
holds when the symbol contains lower-ASCII.
Instead, we can use the internally-memoizing `RubySymbol#name` to get a
properly-encoded `RubyString`, and `RubyString#asJavaString()` to get a
properly-encoded java-`String`.
* fix: properly serialize unicode pipeline names in API output
Jackson's JSON serializer leaks the JRuby-internal byte structure of Symbols,
which only aligns with the byte-structure of the symbol's actual string when
that string is wholly-comprised of lower-ASCII characters.
By pre-converting Symbols to Strings, we ensure that the result is readable
and useful.
* spec: bypass monitoring specs for unicode pipeline ids when PQ enabled
(cherry picked from commit 0ec16ca398)
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Introduce a new setting named `pipeline.buffer.type` which could be valued direct or heap to enable the allocation on Java heap.
The processing of the setting is done in `LogStash::Runner#execute` and sets the Java properties considered by Netty to disable the direct allocation: `io.netty.noPreferDirect`.
However, if that system property is already configured explicitly by the user (because set in `jvm.options`or `LS_JAVA_OPTS`) the setting doesn't take place and warning log is reported, respecting the user's will.
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
PR#15900 missed a few more places where Logstash is installed but
a working minimal pipeline config is added.
This commit fixes that and stabilizes all acceptance tests, thus
minizing the need for time consuming BK retries of corresponding
steps.
Relates #15900
Relates https://github.com/elastic/logstash/issues/15784
This commit tightens the checks for the status
output of the Logstash OS service to specifically
scan for `org.logstash.Logstash` rather than
only the jdk path.
The reason is that the startup script first runs
an options parser, and then the logstash process
itself, both referencing the JDK path.
Closes https://github.com/elastic/ingest-dev/issues/2950
This commit fixes the startup of the Logstash service during packaging
tests by adding a minimal pipeline config. Without it, the service was
flapping from start to start and vice versa causing test flakiness.
Relates https://github.com/elastic/logstash/issues/15784
The current mechanism of discovering the latest released version per
branch (via ARTIFACTS_API) isn't foolproof near the time of a new
release, as it may be pick a version that hasn't been released
yet. This leads to failures[^1] of the packaging upgrade tests, as we
attempt to download a package file that doesn't exist yet.
This commit switches to an API that that is more up to date regarding
the release version truth.
[^1]: https://buildkite.com/elastic/logstash-exhaustive-tests-pipeline/builds/125#018d319b-9a33-4306-b7f2-5b41937a8881/1033-1125
This commit fixes the flaky IT test:
`install non bundle plugin successfully installs the plugin with debug enabled`
by being a bit more lenient with the output which can get garbled by Bundler.
Closes#15801
This commit modernizes the qa/acceptance (packaging) test framework by
moving away from Vagrant and having the tests operate locally.
As we are migrating to Buildkite, the expectation is that those tests
will run on dedicated vms thus removing the necessity of vagrant.
Relates: https://github.com/elastic/ingest-dev/issues/1722
Set of changes to make Logstash compatible to JRuby 9.4.
Bundle JRuby 9.4.3.0
- Redefine space token in `LSCL` and `grammar` treetop from `_` which would generated methods in the form `def _0` (deprecated since `2.7`) to `sc`.
- `I18n.t` method doesn't accept hash as second argument
- `URI.encode` has been replaced with same functionality with `URI::Parser.new.escape`
- `YAML.load` needs explicit `fallback: false` to return false when the yaml string is empty (or contains only comments)
- JRuby's `JavaClass` has been removed, now it can use `java.lang.Class` directly
- explicitly require gem `thwait` to satisfy `require "thwait"` (In `Gemfile.template` and `logstash-core/logstash-core.gemspec`)
- fix not args `clone` to be `def clone(*args)`
- fix `Enumeration.each_slice` which from `Ruby 3.1` is [chainable](https://rubyreferences.github.io/rubychanges/3.1.html#enumerableeach_cons-and-each_slice-return-a-receiver) and doesn't return `nil`. JRuby fixed in https://github.com/jruby/jruby/issues/7015
- Expanded `Down.download` arguments map ca16bbed3c302006967413eb9d3862f2da81f7ae
- Avoid to pass `nil` in the list of couples used in `Hash[ <list of couples> ]` which from Ruby `3.0` generates an `ArgumentError`
- Removed space not allowed between method name and parentheses `initialize (` is forbidden. 29b607dcdef98f81a73ad171639fd13aaa65e243
- With [Ruby 2.7 the `Kernel#open`](https://rubyreferences.github.io/rubychanges/2.7.html#network-and-web) doesn't fallback to `URI#open`, fixed test code that used that to verify open port. e5b70de54c5301f51a767da67294092af0cfafdc
- Avoid to drop `rdoc/` folder from vendored JRuby else `bin/logstash -i irb` would crash, commit b71f73e9c6edb81a7b7ae1305047e506f61c6e8c
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Reject illegal value assigning to `tags` field. Top-level `tags` should only accept string of array of string.
When `tags` got illegal value on event creation, LogStash::Event will rename the field to `_tags` and add a tag `_tagsparsefailure` to `tags`.
When `tags` got illegal value on `set` operation, LogStash::Event will throw exception.
Add a flag `--event_api.tags.illegal` to allow fallback to old logic. There are two options.
`warn` - the old flow that allows illegal value assignment to tags field.
`rename` - the new flow. This is the default value in 8.7
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
* Initial effort to initialize plugin flow metrics. Followings are addressed:
- Namespace store is shaped with RubySymbol key but filter and output codecs were using string key. This commit intends to standardize the namespace key with RubySymbol for filter & output codecs.
- Initializes throughput flow metrics for the input plugins.
- Initializes the worker cost per event and worker utilization for the filter and output plugins with only uptime metrics but it should combine with worker count, will be implemented in next commits.
- Fetching codec ID generated in ruby scope is possible but problematic to in Java scope. We will skip codec flow metrics since they are rarely produce the hard times.
* Worker utilization metrics implementation.
- Worker count will be provided as a fraction to the flow metrics. At the time when we fetch the metric value, fraction is applied.
* Unit tests added for fractured extended & simple metrics.
* Code review change requests applied.
- To simplify the scale (or fraction) at metric get value time, we can introduce the wrapper (`UpScaleMetric`) that applies the scale at metric value fetch time.
- Unit test added for `UpScaleMetric`
- We don't touch the codec namespace shape for now since we skipped codec metrics.
- Unused sources removed.
* Worker utilization and worker cost per event explanation added in the documentation.
* Integration test added for plugin-level flow metrics.
* Apply suggestions from code review
- Integration test failure fix: input plugin ID is not always in context config.
- Suggestions to simplify integration test source and rollback to intentional namings.
- Metrics explanation improvement in the doc.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* plugin flow: fix units; pass UptimeMetric and scale when needed
Aligns the units of the newly-introduced plugin metrics with the specification,
and passes our `UptimeMetric` through to the individual helper methods so that
they can scale appropriately for their context and our type-checker can ensure
we don't receive an incorrectly-scaled `Metric<Long>`.
Input `throughput`
------------------
all throughput metrics should be expressed in events-per-second; this
per-plugin scoped view of the pipeline's `input_throughput` flow should be
expressed in the same units.
Filters, Outputs `worker_utilization`
-------------------------------------
> a worker_utilization (duration / (uptime * worker count)) shows what percent
> of available resources an individual plugin instance is taking and can help
> identify where the blocker is.
To achieve this, we need to divide millis used by _millis_ available.
Filters, Outputs `worker_cost_per_event`
----------------------------------------
> we also provide a (to be named) cost-per-event metric (duration / event) to
> surface issues with a plugin that operates on a very small subset of events
> (via conditionals) but contributes disproportionately to the cost of getting
> its events through.
We start with a baseline of seconds-per-event, and acknowledge that this may
need to be scaled to a more understandable number before merging.
* plugin flow: express cost per event in millis per event
The "worker cost per event" metric when expressed as an inverse per-worker
throughput in seconds-per-event produces a range of values that are not
particularly easy to compare at-a-glance, with "nearly free" operations
being expressed in negative-exponent scientific notation and extremely
expensive operations being expressed with single-digits.
By scaling this metric up by a factor of 1000 to "millis per event" or its
eqivalent "seconds per thousand events", the resulting numbers in practice
are easier to make sense of:
+------------------------+--------------+---------------+------------+
| EXAMPLE / SCALE | s/event | ms/event | µs/event |
+------------------------+--------------+---------------+------------+
| no-op mutate @ 12k eps | 8.33e-05 | 0.0833 | 83.3 |
| stdout w/ dots codec | 0.000831 | 0.831 | 831 |
| ES out 1s RTT/125 | 0.008 | 8 | 8000 |
| ES out 30s retries/125 | 0.24 | 240 | 240000 |
| ES filter 1s/event | 1 | 1000 | 1000000 |
| grok 30s timeout | 30 | 30000 | 30000000 |
+------------------------+--------------+---------------+------------+
* plugin flow: reshape docs
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
* source/multilocal: fix detection of empty pipelines.yml
Fixes a regression introduced in elastic/logstash#13883 in which the presence
of an empty `pipelines.yml` file produces an error message indicating that
the file cannot be read.
When either `YAML::load` or `YAML::safe_load` encounter an effectively-empty
payload (such as one that is entirely comments), they use a `fallback` param
to determine what value to emit, with the former emitting `false` and the
latter emitting `nil`.
This is problematic because a _separate_ blind-`rescue nil` causes `nil` to
be bound to the MultiLocal's `@detected_marker`, and we assume that a `nil`
value in the marker means that there was an exception reading the file (such
as a permissions issue or parse failure).
By providing a `fallback: false` directive when parsing the contents, we
ensure that an empty file is reported as such.
* source/multilocal: avoid `rescue nil` that loses helpful context
When the pipelines yaml cannot be read, or can be read but fails to parse,
the MultiLocal#read_pipelines_from_yaml emits a helpful exception including
specifics about why it failed to load or parse, but a blind `rescue nil`
here causes that helpful information to be lost.
When pipeline detection is exceptional, hold onto the helpful exception
so that it can be reported along with the config conflicts.
* source/multilocal: differentiate between reading and parsing failure
* source/multilocal: use translations for conflict messages
* source/multilocal: specs for error conditions
* Collect growth events and bytes metrics if PQ is enabled: Java changes.
* Move queue flow under queue namespace.
* Pipeline level PQ flow metrics: add unit & integration tests.
* Include queue info in node stats sample.
* Apply suggestions from code review
Change uptime precision for PQ growth metrics to uptime seconds since PQ events are based on seconds.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Add safeguard when using lazy delegating gauge type.
* flow metrics: simplify generics of lazy implementation
Enables interface `FlowMetrics::create` to take suppliers that _implement_
a `Metric<? extends Number>` instead of requiring them to be pre-cast, and
avoid unnecessary exposure of the metrics value-type into our lazy init.
* flow metrics: use lazy init for PQ gauge-based metrics
* noop: use enum equality
Avoids routing two enum values through `MetricType#toString()`
and `String#equals()` when they can be compared directly.
* Apply suggestions from code review
Optional.ofNullable used for safe return. Doc includes real tested expected metric values.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* flow metrics: make lazy-init wraper inherit from AbstractMetric
this allows the Jackson serialization annotations to work
* flow metrics: move pipeline queue-based flows into pipeline flow namespace
* Follow up for moving PQ growth metrics under pipeline.*.flow.
- Unit and integration tests are added or fixed.
- Documentation added along with sample response data
* flow: pipeline pq flow rates docs
* Do not expect flow in the queue section of API. Metrics moved to flow section.
Update logstash-core/spec/logstash/api/commands/stats_spec.rb
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Integration test failure fix.
Mistake: `flow_status` should be `pipeline_flow_stats`
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Integration test failures fix.
Number should be Numeric in the ruby specs.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Make CI happy.
* api specs: use PQ only where needed
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
* Flow metrics: initial implementation (#14509)
* metrics: eliminate race condition when registering metrics
Ensure our fast-lookup and store tables cannot diverge in a race condition
by wrapping mutation of both in a single mutex and appropriately handle
another thread winning the race to the lock by using the value that it
persisted instead of writing our own.
* metrics: guard against intermediate namespace conflicts
- ensures our safeguard that prevents using an existing metric as a namespace
is applied to _intermediate_ nodes, not just the tail-node, eliminating a
potential crash when sending `fetch_or_store` to a metric object that is not
expected to respond to `fetch_or_store`.
- uses the atomic `Concurrent::Map#compute_if_absent` instead of the
non-atomic `Concurrent::Map#fetch_or_store`, which is prone to
last-write-wins during contention (as-written, this method is only
executed under lock and not subject to contention)
- uses `Enumerable#reduce` to eliminate the need for recursion
* flow: introduce auto-advancing UptimeMetric
* flow: introduce FlowMetric with minimal current/lifetime rates
* flow: initialize pipeline metrics at pipeline start
* Controller and service layer implementation for flow metrics. (#14514)
* Controller and service layer implementation for flow metrics.
* Add flow metrics to unit test and benchmark cli definitions.
* flow: fix tests for metric types to accomodate new one
* Renaming concurrency and backpressure metrics.
Rename `concurrency` to `worker_concurrency ` and `backpressure` to `queue_backpressure` to provide proper scope naming.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* metric: register flow metrics only when we have a collector (#14529)
the collector is absent when the pipeline is run in test with a
NullMetricExt, or when the pipeline is explicitly configured to
not collect metrics using `metric.collect: false`.
* Unit tests and integration tests added for flow metrics. (#14527)
* Unit tests and integration tests added for flow metrics.
* Node stat spec and pipeline spec metric updates.
* Metric keys statically imported, implicit error expectation added in metric spec.
* Fix node status API spec after renaming flow metrics.
* Removing flow metric from PipelinesInfo DS (used in peridoci metric snapshot), integration QA updates.
* metric: register flow metrics only when we have a collector (#14529)
the collector is absent when the pipeline is run in test with a
NullMetricExt, or when the pipeline is explicitly configured to
not collect metrics using `metric.collect: false`.
* Unit tests and integration tests added for flow metrics.
* Node stat spec and pipeline spec metric updates.
* Metric keys statically imported, implicit error expectation added in metric spec.
* Fix node status API spec after renaming flow metrics.
* Removing flow metric from PipelinesInfo DS (used in peridoci metric snapshot), integration QA updates.
* Rebasing with feature branch.
* metric: register flow metrics only when we have a collector
the collector is absent when the pipeline is run in test with a
NullMetricExt, or when the pipeline is explicitly configured to
not collect metrics using `metric.collect: false`.
* Apply suggestions from code review
Integration tests updated to test capturing the flow metrics.
* Flow metrics expectation updated in tegration tests.
* flow: refine integration expectations for reloads/monitoring
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
Co-authored-by: Mashhur <mashhur.sattorov@gmail.com>
* metric: add ScaledView with sub-unit precision to UptimeMetric (#14525)
* metric: add ScaledView with sub-unit precision to UptimeMetric
By presenting a _view_ of our metric that maintains sub-unit precision,
we prevent jitter that can be caused by our periodic poller not running at
exactly our configured cadence.
This is especially important as the UptimeMetric is used as the _denominator_ of
several flow metrics, and a capture at 4.999s that truncates to 4s, causes the
rate to be over-reported by ~25%.
The `UptimeMetric.ScaledView` implements `Metric<Number>`, so its full
lossless `BigDecimal` value is accessible to our `FlowMetric` at query time.
* metrics: reduce window for too-frequent-captures bug and document it
* fixup: provide mocked clock to flow metric
* Flow metrics cleanup (#14535)
* flow metrics: code-style and readability pass
* remove unused imports
* cleanup: simplify usage of internal helpers
* flow: migrate internals to use OptionalDouble
* Flow metrics global (#14539)
* flow: add global top-level flows
* docs: add flow metrics
* Top level flow metrics unit tests added. (#14540)
* Top level flow metrics unit tests added.
* Add unit tests when config reloads, make sure top-level flow metrics didn't get reset.
* Apply suggestions from code review
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* Validating against Hash test cases updated.
* For the safety check against exact type in unit tests.
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
* docs: section links and clarity in node stats API flow metrics
Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
Co-authored-by: Mashhur <mashhur.sattorov@gmail.com>
* Adds tasks to add bundled JDK to tar file used to run integration tests
* Uses `RUNTIME_JAVA_HOME` environment variable to control whether bundled JDK or
alternative is to be used
* Updates logstash service helper to respect value of `RUNTIME_JAVA_HOME`
* Requires updates to jenkins repo to set `RUNTIME_JAVA_HOME` correctly only for
integration tests that expect to use a custom version of Java, such as the JDK
matrix tests.
* Fix version of java used to retrieve logstash version in integration tests
Prior to this commit, the system java would be used to retrieve logstash
version in integration tests, leading to test failures with IT environments
that have java 1.8 as system java
* Actually fix `test_port` this time
Use bash `/dev/tcp` to test ports rather than attempting to use `nc` and
`ruby`
When running certain integration tests, a test against a given port is
performed to ensure that certain dependent services are up. Currently,
these are tests are done either via `nc` or `ruby` if no `nc` is provisioned
on the build nodes. The current `ruby` implenentation attempts to use a system
ruby before using the ruby script shipped with Logstash. This commit removes the
use of the system jruby - certain build boxes are still using java8 as their system
java, which causes builds to fail, as java 11 is expected
Open the ability to use Ruby codec inside Java plugins.
Java plugins need subclasses of Java `co.elastic.logstash.api.Codec` class to properly work. This PR implements an adapter for Ruby codecs to be wrapped into a Java's Codec subclass.
Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
This commit replaces the use of a block with a lambda as an argument for Stream.forEach.
This is to work around the jruby issue identified in https://github.com/jruby/jruby/issues/7246.
This commit also updates the multiple_pipeline_spec to update the test case for pipeline->pipeline
communication to trigger the issue - it only occurs with Streams with more than one event in it.
When run in debug mode, #invoke was returning an instance of UI::Shell rather
than a string, causing the plugin to crash when `<<` was called on.
This commit ensures that a string is returned regardless of whether debug is set
Fixes: #14131
This commit updates the version of jruby used in Logstash to `9.3.4.0`.
* Updates the references of `jruby` from `9.2.20.1` to `9.3.4.0`
* Updates references/locations of ruby from `2.5.0` to `2.6.0`
* Updates java imports including `org.logstash.util` to be quoted
* Without quoting the name of the import, the following error is observed in tests:
* `java.lang.NoClassDefFoundError: org/logstash/Util (wrong name: org/logstash/util)`
* Maybe an instance of https://github.com/jruby/jruby/issues/4861
* Adds a monkey patch to `require` to resolve compatibility issue between latest `jruby` and `polyglot` gem
* The addition of https://github.com/jruby/jruby/pull/7145 to disallow circular
causes, will throw when `polyglot` is thrown into the mix, and stop logstash from
starting and building - any gems that use an exception to determine whether or not
to load the native gem, will trigger the code added in that commit.
* This commit adds a monkey patch of `require` to rollback the circular cause exception
back to the original cause.
* Removes the use of the deprecated `JavaClass`
* Adds additional `require time` in `generate_build_metadata`
* Rewrites a test helper to avoid potentially calling `~>` on `FalseClass`
Co-authored-by: Joao Duarte <jsvduarte@gmail.com>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Prior to this commit, attempting to use a custom java plugin installed from rubygems would
fail as the `files` field used to find jar files in the gem is not populated for such gems
This commit uses the `full_require_path` to find the jar file for a native jar plugin,
which should work regardless of where the gem was installed from, locally or via rubygems
Additionally, this commit will only attempt to locate the jar file inside the gem when it
is required (only when the experimental separate plugin class loader is used)
This commit builds on the work done in #13888
Relates: #13888, #11305
Co-authored-by: twosom <two_somang@icloud.com>
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com>
Updates the `ConfigVariableExpander.expand` to selectively create `SecretVariable` instances for SecretStore resolved environment variables.
`SecretVariable` instances in if statements are decrypted during `eq` `EventCondition` compilation; bringing the secret value and using in the comparator.
Removes the installation of sys-v init.d scripts from .deb and .rpms.
All supported operating systems use systemd as the init system, as such, it is no longer necessary to continue to ship Sys-v Init.d scripts.
Sets the LS_JAVA_HOME environment variable for the environments used to spawn Logstash process in integration tests.
The JDK matrix testing is based on selecting the desired JDK to run the tests, through the BUILD_JAVA_HOME.
However, when the integration tests spawn a Logstash process this setting was missed.