Reject illegal value assigning to `tags` field. Top-level `tags` should only accept string of array of string.
When `tags` got illegal value on event creation, LogStash::Event will rename the field to `_tags` and add a tag `_tagsparsefailure` to `tags`.
When `tags` got illegal value on `set` operation, LogStash::Event will throw exception.
Add a flag `--event_api.tags.illegal` to allow fallback to old logic. There are two options.
`warn` - the old flow that allows illegal value assignment to tags field.
`rename` - the new flow. This is the default value in 8.7
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>
Co-authored-by: João Duarte <jsvd@users.noreply.github.com>
Logstah currently does not support multiple top-level `codec` per plugin. This commit fixes the pipeline compilation ensuring that erroneously configured plugins fail to compile and result in a configuration error.
* source/multilocal: fix detection of empty pipelines.yml
Fixes a regression introduced in elastic/logstash#13883 in which the presence
of an empty `pipelines.yml` file produces an error message indicating that
the file cannot be read.
When either `YAML::load` or `YAML::safe_load` encounter an effectively-empty
payload (such as one that is entirely comments), they use a `fallback` param
to determine what value to emit, with the former emitting `false` and the
latter emitting `nil`.
This is problematic because a _separate_ blind-`rescue nil` causes `nil` to
be bound to the MultiLocal's `@detected_marker`, and we assume that a `nil`
value in the marker means that there was an exception reading the file (such
as a permissions issue or parse failure).
By providing a `fallback: false` directive when parsing the contents, we
ensure that an empty file is reported as such.
* source/multilocal: avoid `rescue nil` that loses helpful context
When the pipelines yaml cannot be read, or can be read but fails to parse,
the MultiLocal#read_pipelines_from_yaml emits a helpful exception including
specifics about why it failed to load or parse, but a blind `rescue nil`
here causes that helpful information to be lost.
When pipeline detection is exceptional, hold onto the helpful exception
so that it can be reported along with the config conflicts.
* source/multilocal: differentiate between reading and parsing failure
* source/multilocal: use translations for conflict messages
* source/multilocal: specs for error conditions
* add failing tests for Event.new with field that look like field references
* fix: correctly handle FieldReference-special characters in field names.
Keys passed to most methods of `ConvertedMap`, based on `IdentityHashMap`
depend on identity and not equivalence, and therefore rely on the keys being
_interned_ strings. In order to avoid hitting the JVM's global String intern
pool (which can have performance problems), operations to normalize a string
to its interned counterpart have traditionally relied on the behaviour of
`FieldReference#from` returning a likely-cached `FieldReference`, that had
an interned `key` and an empty `path`.
This is problematic on two points.
First, when `ConvertedMap` was given data with keys that _were_ valid string
field references representing a nested field (such as `[host][geo][location]`),
the implementation of `ConvertedMap#put` effectively silently discarded the
path components because it assumed them to be empty, and only the key was
kept (`location`).
Second, when `ConvertedMap` was given a map whose keys contained what the
field reference parser considered special characters but _were NOT_
valid field references, the resulting `FieldReference.IllegalSyntaxException`
caused the operation to abort.
Instead of using the `FieldReference` cache, which sits on top of objects whose
`key` and `path`-components are known to have been interned, we introduce an
internment helper on our `ConvertedMap` that is also backed by the global string
intern pool, and ensure that our field references are primed through this pool.
In addition to fixing the `ConvertedMap#newFromMap` functionality, this has
three net effects:
- Our ConvertedMap operations still use strings
from the global intern pool
- We have a new, smaller cache of individual field
names, improving lookup performance
- Our FieldReference cache no longer is flooded
with fragments and therefore is more likely to
remain performant
NOTE: this does NOT create isolated intern pools, as doing so would require
a careful audit of the possible code-paths to `ConvertedMap#putInterned`.
The new cache is limited to 10k strings, and when more are used only
the FIRST 10k strings will be primed into the cache, leaving the
remainder to always hit the global String intern pool.
NOTE: by fixing this bug, we alow events to be created whose fields _CANNOT_
be referenced with the existing FieldReference implementation.
Resolves: https://github.com/elastic/logstash/issues/13606
Resolves: https://github.com/elastic/logstash/issues/11608
* field_reference: support escape sequences
Adds a `config.field_reference.escape_style` option and a companion
command-line flag `--field-reference-escape-style` allowing a user
to opt into one of two proposed escape-sequence implementations for field
reference parsing:
- `PERCENT`: URI-style `%`+`HH` hexadecimal encoding of UTF-8 bytes
- `AMPERSAND`: HTML-style `&#`+`DD`+`;` encoding of decimal Unicode code-points
The default is `NONE`, which does _not_ proccess escape sequences.
With this setting a user effectively cannot reference a field whose name
contains FieldReference-reserved characters.
| ESCAPE STYLE | `[` | `]` |
| ------------ | ------- | ------- |
| `NONE` | _N/A_ | _N/A_ |
| `PERCENT` | `%5B` | `%5D` |
| `AMPERSAND` | `[` | `]` |
* fixup: no need to double-escape HTML-ish escape sequences in docs
* Apply suggestions from code review
Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
* field-reference: load escape style in runner
* docs: sentences over semiciolons
* field-reference: faster shortcut for PERCENT escape mode
* field-reference: escape mode control downcase
* field_reference: more s/experimental/technical preview/
* field_reference: still more s/experimental/technical preview/
Co-authored-by: Karol Bucek <kares@users.noreply.github.com>
A number of plugins reach into Logstash's i18n translations to "helpfully"
communicate certain configuration errors, but rely on translations that were
moved in 00a99c19e5 from logstash.agent to
logstash.runner. Since then, it is possible to hit obtuse error messages about
failing to load a translation instead of the intended helpful message:
~~~
translation missing: en.logstash.agent.configuration.invalid_plugin_register
~~~
By moving the `logstash.agent` definition to _after_ the `logstash.runner`
definition, we can use YAML tooling to name the `logstash.runner.configuration`
node and then merge its contents into `logstash.agent.configuration`. This
effectively allows us to keep a single definition of those translations while
making them available at both addresses.
Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>
* ecs: report pipeline's ECS-compatibility with INFO at startup
Because the pipeline-level setting `pipeline.ecs_compatibility` affects the
default behaviour of nearly every plugin in the pipeline, an INFO-level log
message will provide useful hints, especially to our users who upgrade to
Logstash 8 without first reading the breaking changes docs.
For example, when we have two pipelines `old` and `new` whose `pipeline.ecs_compatibility` is `disabled` and `v8` respectively, we would get the following log messages:
> ~~~
> [2021-11-04T18:43:21,810][INFO ][logstash.javapipeline ] Pipeline `old` is configured with `pipeline.ecs_compatibility: disabled` setting. All plugins in this pipeline will default to `ecs_compatibility => disabled` unless explicitly configured otherwise.
> [2021-11-04T18:43:21,817][INFO ][logstash.javapipeline ] Pipeline `new` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
> ~~~
* ecs: make v8 the default for 8.0
* ecs: `pipeline.ecs_compatibility` defaults to `v8`
Related: elastic/logstash#11623
* doc: temporarily remove deep link from breaking changes doc to fix build
* Forwardport #13358 to master: Add warnings for JAVA_HOME/older versions of Java
Forwardport PR #13358 to master branch. Original message:
* Add deprecation warnings for JAVA_HOME/older versions of Java
Logstash 8.0 will remove support for java versions before java 11, this commit
adds entries to the deprecation log warning against this.
Also adds use of `JAVA_HOME` to the deprecation log.
* Change deprecation log entries to warnings to indicate imminent issue
* settings: add "deprecated alias" support
A deprecated alias provides a path for renaming a setting.
- When a deprecated alias is set on its own, a deprecation notice is emitted
but fetching the canonical setting value will reflect the value set with the
deprecated alias.
- When both the canonical setting (new name) and the deprecated alias (old
name) are specified, it is an error condition.
- When the value of the deprecated alias is queried, a warning is emitted to
the logger and only the value explicitly set to the deprecated alias is
returned.
Additionally, some relevant cleanup is also included:
- Starting Logstash with invalid settings no longer results in the obtuse "An
unexpected error occurred" with backtrace and exception data obscuring the
issue. Instead, a simple message is emitted indicating that the settings are
invalid along with the originating exception's message.
- The various settings implementations share a common logger, instead of each
implementation class providing its own. This is aimed to reduce noise from
the logs and to ensure specs validating logging do not need to tie so
closely to implementation details.
* settings: add password-wrapped setting
* settings: make any setting type capable of being nullable
* settings: add `Settings#names` to power programatic iteration
* cli: route CLI-flag deprecations in to deprecation logger
* settings: group API-related settings under `api.*`
retains deprecated aliases, and is fully backward-compatible.
* webserver: cleanup orphaned attr accessors for never-set ivars
* api: pull settings extraction down from agent
This net-no-change refactor introduces a new method `WebServer#from_settings`
that bridges the gap between Logstash settings and Puma-related options, so
that future additions to the API settings don't add complexity to the Agent.
It also has the benefit of initializing the API Rack App and just ONCE, instead
of once per attempted HTTP port.
* api: add optional TLS/SSL
* docs: reference API security settings
* api: when configured securely, bind to all available interfaces by default
* cleanup: remove unused cert artifacts
* tests: generate fresh webserver certificates
* certs: actually add the binary keystores 🤦
This PR enables the upgrade of bundler to the latest version.
Prior to this PR, the ability to do so was blocked by bundler.setup in versions of bundler > `2.23` making runtime changes to `Gemfile.lock` (unless the lock file was `frozen`) based on the specific platform the application was being run on, overriding any platforms (including generic `java` platform) set during build time. This was in conflict with changes made in #12782, which prevented the logstash user writing to files in `/usr/share/logstash`.
This PR will freeze the lockfile when logstash is run, and unfreeze it when manipulating plugins (install, update, remove, install from offline pack) to allow new plugins to be added. While unfrozen, changes are also made to ensure that the platform list remains as the generic `java` platform, and not changed to the specific platform for the runtime JVM.
This PR also introduces a new runtime flag, `--enable-local-plugin-development`. This flag is intended for use by Logstash developers only, and enables a mode of operation where a Gemfile can be manipulated, eg
```
gem "logstash-integration-kafka", :path => '/users/developer/code/plugins/logstash-integration-kafka'
```
to facilitate quick and simple plugin testing.
This PR also sets the `silence_root_warning` flag to avoid bundler printing out alarming looking warning messages when `sudo` is used. This warning message was concerning for users - it would be printed out during normal operation of `bin/logstash-plugin install/update/remove` when run under `sudo`, which is the expected mode of operation when logstash is installed to run as a service via rpm/deb packages.
This PR also updates the vagrant based integration tests to ensure that Logstash still runs after plugin update/install/remove operations, fixes up some regular expressions that would cause test failures, and removes some dead code from tests.
## Release notes
* Updated Bundler to latest version
* Ensured that `Gemfile.lock` are appropriately frozen
* Added new developer-only flag to facilitate local plugin development to allow unfrozen lockfile in a development environment
As we close in on the availability of 8.0.0 alphas, we are reassessing which
breaking changes are _necessary_, and which are merely _desired_. And while
we would love to be in a world where ECS was on by default, and have put
substantial effort into designing an upgrade path that would be as simple as
possible, we have determined that the time may not be right to change the
default value out of under our users.
This change restores the default value for `pipeline.ecs_compatibility` to
`disabled`, ensuring pipelines will continue running in Logstash 8 as they have
in Logstash 7 without modification. We will still encourage our users to be
explicit about which behaviour they desire, and will revisit making ECS on by
default at a later date.
Implements a plugin `ecs_compatibility` option, whose default value is powered
by the pipeline-level setting `pipeline.ecs_compatibility`, in line with the
proposal in elastic/logstash#11623:
In order to increase the confidence a user has when upgrading Logstash, this
implementation uses the deprecation logger to warn when `ecs_compatibility` is
used without an explicit directive.
For now, as we continue to add ECS Compatibility Modes, an opting into a
specific ECS Compatibility mode at a pipeline level is considered a BETA
feature. All plugins using the [ECS Compatibility Support][] adapter will
use the setting correctly, but pipelines configured in this way do not
guarantee consistent behaviour across minor versions of Logstash or the
plugins it bundles (e.g., upgraded plugins that have newly-implemented an ECS
Compatibility mode will use the pipeline-level setting as a default, causing
them to potentially behave differently after the upgrade).
This change-set also includes a significant amount of work within the
`PluginFactory`, which allows us to ensure that pipeline-level settings are
available to a Logstash plugin _before_ its `initialize` is executed,
including the maintaining of context for codecs that are routinely cloned.
* JEE: instantiate codecs only once
* PluginFactory: use passed FilterDelegator class
* PluginFactory: require engine name in init
* NOOP: remove useless secondary plugin factory interface
* PluginFactory: simplify, compute java args only when necessary
* PluginFactory: accept explicit id when vertex unavailable
* PluginFactory: make source optional, args required
* PluginFactory: threadsafe refactor of id duplicate tracking
* PluginFactory: make id extraction/geration more abstract/understandable
* PluginFactory: extract or generate ID when source not available
* PluginFactory: inject ExecutionContext before initializing plugins
* Codec: propagate execution_context and metric to clones
* Plugin: intercept string-specified codecs and propagate execution_context
* Plugin: implement `ecs_compatibility` for all plugins
* Plugin: deprecate use of `Config::Mixin::DSL::validate_value(String, :codec)`
In some workflows such as simple file manipulation, starting a webserver is
unnecessary overhead, and we should be able to avoid it.
Here we introduce a new parameter `http.enabled`, which defaults to `true` to
maintain the existing functionality.
Resolves: elastic/logstash#9408
Closes: elastic/logstash#11525
Co-authored-by: Benoit Dupont <benoit.dupont@gmail.com>
Fixes#11533
reuse rubyArray for single element batches
rename preserveBatchOrder to preserveEventOrder
allow boolean and string values for the pipeline.ordered setting, reorg validation
update docs
yml typo
Update docs/static/running-logstash-command-line.asciidoc
Co-Authored-By: Karen Metts <35154725+karenzone@users.noreply.github.com>
Update docs/static/running-logstash-command-line.asciidoc
Co-Authored-By: Karen Metts <35154725+karenzone@users.noreply.github.com>
java execution specs and spec support
docs corrections per review
typo
close not shutdown
Ruby pipeline spec
* noop: swap expected/actual in tests so failures read correctly
* field_reference: fixes crash and adds many edge-case test cases
Some pathological inputs can cause `FieldReference#parse(CharSequence)` to
throw an `ArrayIndexOutOfBounds` exception, which hard-crashes the entire
Logstash process because this Java-based exception is not caught by the Ruby
plugins.
Instead, proactively throw a new runtime exception indicating that we
encountered illegal syntax, and adjust the ruby event wrapper to re-throw
a ruby RuntimeException that can be handled in the normal way.
Additionally, this commit adds specs for the current behaviour of a wide
variety of illegal-syntax inputs, to ensure we don't accidentally change
the current behaviour.
* field_reference: break legacy tokenizer out, lay groundwork for strict-mode
Extracts the bulk of the current `FieldReference#parse(CharSequence)` method
out into its own `LegacyTokenizer` class verbatim to simplify subsequent
review as we add a strict-mode tokenizer.
* field_reference: add strict-mode and compat-mode
Adds a strict-mode FieldReference tokenizer, which throws an illegal syntax
exception when it encounters illegal syntax.
Adds a compat-mode FieldReference tokenizer, which _warns_ when it encounters
an input that would produce a different output under the strict-mode parser,
but otherwise behaves the same as the legacy-mode parser
Enables users to opt-in to the strict-mode parsing, or to opt-out of the
warnings using a command line flag or a settings-file directive
* field_reference (BREAKING): remove LEGACY and COMPAT for 7.0
This will allow users to override the pipeline id from the default, "main", to something else while running pipelines via either the -e or -f options.
Fixes#8868
* add newlines to generated json
* Implement cloud.id and cloud.auth settings merge to module settings
* Fixes from review plus convert to using Password for any Module Setting
* Review changes
* update modules.asciidoc to include a section on Cloud
* Capitalize Id
* remove unnecessesary require lines
* Add --setup phase for modules
Without the `--setup` flag, the index template for elasticsearch, and the index-pattern, saved searches, visualizations, and dashboards for Kibana will not be published.
fixes#7959
The hot threads endpoint used to require thread names to be unique in a map and would crash otherwise. This patch fixes this by internally using a List instead of a Map to store these thread ids.
Additionally, this patch adds the `thread_id` to the response to allow for disambiguation of of threads for any client applications.
Fixes https://github.com/elastic/logstash/issues/7608Fixes#7617
* Add support for using kibana dashboard import api
- Add support for versioned folders below 'module'/configuration/kibana
- Rename cef to module_test_files/tester and simplify it.
- Remove skips on tests using above modules test files.
- In pipeline.rb, moved method default_logging_keys to the base class,
it was being called from BasePipeline during a config-test.
- In modules_common, added a backtrace to LogStash::ConfigLoadingError
for better debugging
- Improve layout when user_lines et al were empty in logstash_config
* Change specificity of maj/min/patch folder search
* Use logger to surface manticore error
* Add suppor for Kibana auth
* improve error message
* support es and kibana authentication in the logstash modules
* Update runner, source/local, source/modules, source/multi_local
improve startup config problem detection across all sources.
* fix multi_local spec to use not `empty` values.
* My changes (#7218)
* First upstream PR commit (#7172)
No tests yet. Just for code review for now
* move all inner classes to their own folder + client and importer
* Fixes and tests (#7228)
Add tests for the `LogStash::Modules:CLIParser` class in `cli_parser.rb`
Fix a typo in `cli_parser.rb` (`uparsed` vs `unparsed`)
Fix a bad variable name found by testing in `cli_parser.rb` and update the error message accordingly in `en.yml`
* Remove fb_modules (#7280)
* fixes to import index-pattern & var updates & savedsearch capability (#7283)
* fixes to import index-pattern & var updates & savedsearch capability
fixes to import index-pattern & var updates
add savedsearch capability
* minimise merge conflicts with PR End-to-End test with filebeat apache2
* End-to-End test with filebeat apache2 (#7279)
This is a first run, but data flows from filebeat through Elasticsearch.
Template uploads from `$LS_HOME/modules/MODULENAME/configuration/elasticsearch/MODULENAME.json`
Specifying `--modules filebeat` from the command-line, with `-M "filebeat.var.elasticsearch.output.host=localhost:9200"`
Some of the saved searches don't get uploaded. @guyboertje is on this already.
The logstash configuration needs tweaking to allow receiving both access logs _and_ error logs. The dashboards and visualizations all seem to expect the presence of both.
Set default to `localhost` in `elasticsearch_client.rb`
Changed command-line variable parsing to allow for a variable with only `modulename.key.subkey=value`, and updated the error message accordingly.
First draft of the filebeat module, as extracted from filebeat 5.4.0
* Add documentation for Modules
This is specific to the Master branch. Multiple modules will not be supported in 5.5.
* Add READMEs and prune post-code comments
* Add comment regarding the variable name `modul`
Also, fix the default username for the Elasticsearch output in Logstash. The default x-pack credentials are `elastic:changeme` rather than `elasticsearch:changeme`
* add cef module files (#7292)
* fixes from reviews of PR #7284
* add multi_local source for multi pipelines
* introduce pipelines.yml
* introduce PipelineSettings class
* support reloading of pipeline parameters
* fix pipeline api call for _node/pipelines
* inform user pipelines.yml is ignored if -e or -f is enabled
This PR introduces majors changes inside Logstash, to help build future
features like supporting multiple pipeline or java execution.
The previous implementation of the agent made the class hard to refactor or add new feature to it, because it needed to
know about too much internal of the pipeline and how the configuration existed.
The changes includes:
- Externalize the loading of the configuration file using a `SourceLoader`
- The source loader can support multiple sources and will aggregate them.
- We keep some metadata about the original file so the LIR can give better feedback.
- The Agent now ask the `SourceLoader` to know which configuration need to be run
- The Agent now uses a converge state strategy to handle: start, reload, stop
- Each actions executed on the pipeline are now extracted into their own classes to help with migration to a new pipeline execution.
- The pipeline now has a start method that handle the tread
- Better out of the box support for multiple pipeline (monitoring)
- Refactor of the spec of the agent
Fixes#6632
When a plugin is loaded using the `plugins.path` option or is from a
universal plugin there no gemspec can be found for the specific plugin.
We should not print any warning on that case.
Fixes: #6444Fixes#6448
This adds two new fields 'id', and 'name' to the base metadata for API requests.
These fields are now returned at all API endpoints by default.
The `id` field is the persisted UUID, the name field is the custom name
the user has passed in (defaulted to the hostname).
I renamed `node_uuid` and `node_name` to just `id` and `name` to be
inline with Elasticsearch.
This also fixed a broken test double in `webserver_spec.rb` that was
using doubles across threads which created hidden errors.
Fixes#6224
This PR does a few modifications for our webserver:
- Add a PortRange setting type.
- allow the --http.port option to accept a port range or a single port, when a range is provided Logstash will incrementally try this list to find a free port for the webserver.
- Logstash will report in the log which port it is using. (INFO LEVEL)
- It refactors a few methods of the webserver.
- It adds test for the binding of the ports.
Notes:
Port range can be defined in the logstash.yml or pass via the command line like this.
`bin/logstash -e 'input { generator {}} output { null {}}' --log.level verbose --http.port 2000-2005`
Fixes#5774