logstash

mirror of https://github.com/elastic/logstash.git synced 2025-04-24 06:37:19 -04:00

Author	SHA1	Message	Date
Mashhur	aa4d99b7e3	[8.x][Docs] Add recommendations to collect all metrics for the best dashboard experience. (#17524) * [Docs] Add recommendations to collect all metrics for the best dashboard experience. * Indentation alignment.	2025-04-09 14:32:27 -07:00
Mashhur	38e0ca171a	Remove technical preview from agent driven monitoring pages. (#17485)	2025-04-05 13:08:58 -07:00
github-actions[bot]	7341ff6e2f	Add pipeline metrics to Node Stats API (#16839) (#16850) This commit introduces three new metrics per pipeline in the Node Stats API: - workers - batch_size - batch_delay
```
{
  ...
  pipelines: {
    main: {
      events: {...},
      flow: {...},
      plugins: {...},
      reloads: {...},
      queue: {...},
      pipeline: {
        workers: 12,
        batch_size: 125,
        batch_delay: 5,
      },
    }
  }
  ...
}
```
(cherry picked from commit `de6a6c5b0f`) Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com>	2025-01-03 20:53:51 +00:00
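A consumer of the Node Stats API could read the new per-pipeline `pipeline` block roughly as follows. This is a minimal Python sketch, not Logstash code: the sample payload is hand-written from the values shown in the commit message, and the helper name `pipeline_settings` is hypothetical.

```python
import json

# Hand-written sample shaped like the response in the commit message above;
# a real Node Stats response carries many more fields.
SAMPLE = """
{
  "pipelines": {
    "main": {
      "pipeline": {"workers": 12, "batch_size": 125, "batch_delay": 5}
    }
  }
}
"""


def pipeline_settings(node_stats):
    """Return {pipeline_id: (workers, batch_size, batch_delay)}.

    Hypothetical helper: pipelines without a `pipeline` block are skipped,
    on the assumption that the block may be absent in some responses.
    """
    result = {}
    for pipeline_id, stats in node_stats.get("pipelines", {}).items():
        block = stats.get("pipeline")
        if block is None:
            continue
        result[pipeline_id] = (
            block["workers"],
            block["batch_size"],
            block["batch_delay"],
        )
    return result


if __name__ == "__main__":
    print(pipeline_settings(json.loads(SAMPLE)))
```

Returning plain tuples keeps the sketch independent of any client library; a real consumer would fetch the JSON from the running node instead of using the embedded sample.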
Ry Biesemeyer	7eb5185b4e	Feature: health report api (#16520 ) * [health] bootstrap HealthObserver from agent to API (#16141) * [health] bootstrap HealthObserver from agent to API * specs: mocked agent needs health observer * add license headers * Merge `main` into `feature/health-report-api` (#16397) * Add GH vault plugin bot to allowed list (#16301) * regenerate webserver test certificates (#16331) * correctly handle stack overflow errors during pipeline compilation (#16323) This commit improves error handling when pipelines that are too big hit the Xss limit and throw a StackOverflowError. Currently the exception is printed outside of the logger, and doesn’t even show if log.format is json, leaving the user to wonder what happened. A couple of thoughts on the way this is implemented: * There should be a first barrier to handle pipelines that are too large based on the PipelineIR compilation. The barrier would use the detection of Xss to determine how big a pipeline could be. This however doesn't reduce the need to still handle a StackOverflow if it happens. * The catching of StackOverflowError could also be done on the WorkerLoop. However I'd suggest that this is unrelated to the Worker initialization itself, it just so happens that compiledPipeline.buildExecution is computed inside the WorkerLoop class for performance reasons. So I'd prefer logging to not come from the existing catch, but from a dedicated catch clause. Solves #16320 * Doc: Reposition worker-utilization in doc (#16335) * settings: add support for observing settings after post-process hooks (#16339) Because logging configuration occurs after loading the `logstash.yml` settings, deprecation logs from `LogStash::Settings::DeprecatedAlias#set` are effectively emitted to a null logger and lost. By re-emitting after the post-process hooks, we can ensure that they make their way to the deprecation log. 
This change adds support for any setting that responds to `Object#observe_post_process` to receive it after all post-processing hooks have been executed. Resolves: elastic/logstash#16332 * fix line used to determine ES is up (#16349) * add retries to snyk buildkite job (#16343) * Fix 8.13.1 release notes (#16363) make a note of the fix that went to 8.13.1: #16026 Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> * Update logstash_releases.json (#16347) * [Bugfix] Resolve the array and char (single \| double quote) escaped values of ${ENV} (#16365) * Properly resolve the values from ENV vars if literal array string provided with ENV var. * Docker acceptance test for persisting keys and use actual values in docker container. * Review suggestion. Simplify the code by stripping whitespace before `gsub`, no need to check comma and split. Co-authored-by: João Duarte <jsvd@users.noreply.github.com> --------- Co-authored-by: João Duarte <jsvd@users.noreply.github.com> * Doc: Add SNMP integration to breaking changes (#16374) * deprecate java less-than 17 (#16370) * Exclude substitution refinement on pipelines.yml (#16375) * Exclude substitution refinement on pipelines.yml (applies on ENV vars and logstash.yml where env2yaml saves vars) * Safety integration test for pipeline config.string contains ENV . * Doc: Forwardport 8.15.0 release notes to main (#16388) * Removing 8.14 from ci/branches.json as we have 8.15. (#16390) --------- Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com> Co-authored-by: João Duarte <jsvd@users.noreply.github.com> Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> Co-authored-by: Andrea Selva <selva.andre@gmail.com> Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> * Squashed merge from 8.x * Failure injector plugin implementation. (#16466) * Test purpose only failure injector integration (filter and output) plugins implementation. 
Add unit tests and include license notes. * Fix the degrate method name typo. Co-authored-by: Andrea Selva <selva.andre@gmail.com> * Add explanation to the config params and rebuild plugin gem. --------- Co-authored-by: Andrea Selva <selva.andre@gmail.com> * Health report integration tests bootstrapper and initial tests implementation (#16467) * Health Report integration tests bootstrapper and initial slow start scenario implementation. * Apply suggestions from code review Renaming expectation check method name. Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com> * Changed to branch concept, YAML structure simplified as changed to Dict. * Apply suggestions from code review Reflect `help_url` to the integration test. --------- Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com> * health api: expose `GET /_health_report` with pipelines//status probe (#16398) Adds a `GET /_health_report` endpoint with per-pipeline status probes, and wires the resulting report status into the other API responses, replacing their hard-coded `green` with a meaningful status indication. --------- Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> docs: health report API, and diagnosis links (feature-targeted) (#16518) * docs: health report API, and diagnosis links * Remove plus-for-passthrough markers Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> --------- Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> * merge 8.x into feature branch... (#16519) * Add GH vault plugin bot to allowed list (#16301) * regenerate webserver test certificates (#16331) * correctly handle stack overflow errors during pipeline compilation (#16323) This commit improves error handling when pipelines that are too big hit the Xss limit and throw a StackOverflowError. Currently the exception is printed outside of the logger, and doesn’t even show if log.format is json, leaving the user to wonder what happened. 
A couple of thoughts on the way this is implemented: * There should be a first barrier to handle pipelines that are too large based on the PipelineIR compilation. The barrier would use the detection of Xss to determine how big a pipeline could be. This however doesn't reduce the need to still handle a StackOverflow if it happens. * The catching of StackOverflowError could also be done on the WorkerLoop. However I'd suggest that this is unrelated to the Worker initialization itself, it just so happens that compiledPipeline.buildExecution is computed inside the WorkerLoop class for performance reasons. So I'd prefer logging to not come from the existing catch, but from a dedicated catch clause. Solves #16320 * Doc: Reposition worker-utilization in doc (#16335) * settings: add support for observing settings after post-process hooks (#16339) Because logging configuration occurs after loading the `logstash.yml` settings, deprecation logs from `LogStash::Settings::DeprecatedAlias#set` are effectively emitted to a null logger and lost. By re-emitting after the post-process hooks, we can ensure that they make their way to the deprecation log. This change adds support for any setting that responds to `Object#observe_post_process` to receive it after all post-processing hooks have been executed. Resolves: elastic/logstash#16332 * fix line used to determine ES is up (#16349) * add retries to snyk buildkite job (#16343) * Fix 8.13.1 release notes (#16363) make a note of the fix that went to 8.13.1: #16026 Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> * Update logstash_releases.json (#16347) * [Bugfix] Resolve the array and char (single \| double quote) escaped values of ${ENV} (#16365) * Properly resolve the values from ENV vars if literal array string provided with ENV var. * Docker acceptance test for persisting keys and use actual values in docker container. * Review suggestion. 
Simplify the code by stripping whitespace before `gsub`, no need to check comma and split. Co-authored-by: João Duarte <jsvd@users.noreply.github.com> --------- Co-authored-by: João Duarte <jsvd@users.noreply.github.com> * Doc: Add SNMP integration to breaking changes (#16374) * deprecate java less-than 17 (#16370) * Exclude substitution refinement on pipelines.yml (#16375) * Exclude substitution refinement on pipelines.yml (applies on ENV vars and logstash.yml where env2yaml saves vars) * Safety integration test for pipeline config.string contains ENV . * Doc: Forwardport 8.15.0 release notes to main (#16388) * Removing 8.14 from ci/branches.json as we have 8.15. (#16390) * Increase Jruby -Xmx to avoid OOM during zip task in DRA (#16408) Fix: #16406 * Generate Dataset code with meaningful fields names (#16386) This PR is intended to help Logstash developers or users that want to better understand the code that's autogenerated to model a pipeline, assigning more meaningful names to the Datasets subclasses' fields. Updates `FieldDefinition` to receive the name of the field from construction methods, so that it can be used during the code generation phase, instead of the existing incremental `field%n`. Updates `ClassFields` to propagate the explicit field name down to the `FieldDefinitions`. Update the `DatasetCompiler` that add fields to `ClassFields` to assign a proper name to generated Dataset's fields. * Implements safe evaluation of conditional expressions, logging the error without killing the pipeline (#16322) This PR protects the if statements against expression evaluation errors, cancel the event under processing and log it. This avoids to crash the pipeline which encounter a runtime error during event condition evaluation, permitting to debug the root cause reporting the offending event and removing from the current processing batch. 
Translates the `org.jruby.exceptions.TypeError`, `IllegalArgumentException`, `org.jruby.exceptions.ArgumentError` that could happen during `EventCodition` evaluation into a custom `ConditionalEvaluationError` which bubbles up on AST tree nodes. It's catched in the `SplitDataset` node. Updates the generation of the `SplitDataset `so that the execution of `filterEvents` method inside the compute body is try-catch guarded and defer the execution to an instance of `AbstractPipelineExt.ConditionalEvaluationListener` to handle such error. In this particular case the error management consist in just logging the offending Event. --------- Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> * Update logstash_releases.json (#16426) * Release notes for 8.15.1 (#16405) (#16427) * Update release notes for 8.15.1 * update release note --------- Co-authored-by: logstashmachine <43502315+logstashmachine@users.noreply.github.com> Co-authored-by: Kaise Cheng <kaise.cheng@elastic.co> (cherry picked from commit `2fca7e39e8`) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Fix ConditionalEvaluationError to do not include the event that errored in its serialiaxed form, because it's not expected that this class is ever serialized. (#16429) (#16430) Make inner field of ConditionalEvaluationError transient to be avoided during serialization. (cherry picked from commit `bb7ecc203f`) Co-authored-by: Andrea Selva <selva.andre@gmail.com> * use gnu tar compatible minitar to generate tar artifact (#16432) (#16434) Using VERSION_QUALIFIER when building the tarball distribution will fail since Ruby's TarWriter implements the older POSIX88 version of tar and paths will be longer than 100 characters. For the long paths being used in Logstash's plugins, mainly due to nested folders from jar-dependencies, we need the tarball to follow either the 2001 ustar format or gnu tar, which is implemented by the minitar gem. 
(cherry picked from commit `69f0fa54ca`) Co-authored-by: João Duarte <jsvd@users.noreply.github.com> * account for the 8.x in DRA publishing task (#16436) (#16440) the current DRA publishing task computes the branch from the version contained in the version.yml This is done by taking the major.minor and confirming that a branch exists with that name. However this pattern won't be applicable for 8.x, as that branch currently points to 8.16.0 and there is no 8.16 branch. This commit falls back to reading the buildkite injected BUILDKITE_BRANCH variable. (cherry picked from commit `17dba9f829`) Co-authored-by: João Duarte <jsvd@users.noreply.github.com> * Fixes the issue where LS wipes out all quotes from docker env variables. (#16456) (#16459) * Fixes the issue where LS wipes out all quotes from docker env variables. This is an issue when running LS on docker with CONFIG_STRING, needs to keep quotes with env variable. * Add a docker acceptance integration test. (cherry picked from commit `7c64c7394b`) Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> * Known issue for 8.15.1 related to env vars references (#16455) (#16469) (cherry picked from commit `b54caf3fd8`) Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co> * bump .ruby_version to jruby-9.4.8.0 (#16477) (#16480) (cherry picked from commit `51cca7320e`) Co-authored-by: João Duarte <jsvd@users.noreply.github.com> * Release notes for 8.15.2 (#16471) (#16478) Co-authored-by: andsel <selva.andre@gmail.com> Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> (cherry picked from commit `01dc76f3b5`) * Change LogStash::Util::SubstitutionVariables#replace_placeholders refine argument to optional (#16485) (#16488) (cherry picked from commit `8368c00367`) Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com> * Use jruby-9.4.8.0 in exhaustive CIs. 
(#16489) (#16491) (cherry picked from commit `fd1de39005`) Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> * Don't use an older JRuby with oraclelinux-7 (#16499) (#16501) A recent PR (elastic/ci-agent-images/pull/932) modernized the VM images and removed JRuby 9.4.5.0 and some older versions. This ended up breaking exhaustive test on Oracle Linux 7 that hard coded JRuby 9.4.5.0. PR https://github.com/elastic/logstash/pull/16489 worked around the problem by pinning to the new JRuby, but actually we don't need the conditional anymore since the original issue https://github.com/jruby/jruby/issues/7579#issuecomment-1425885324 has been resolved and none of our releasable branches (apart from 7.17 which uses `9.2.20.1`) specify `9.3.x.y` in `/.ruby-version`. Therefore, this commit removes conditional setting of JRuby for OracleLinux 7 agents in exhaustive tests (and relies on whatever `/.ruby-version` defines). (cherry picked from commit `07c01f8231`) Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com> * Improve pipeline bootstrap error logs (#16495) (#16504) This PR adds the cause errors details on the pipeline converge state error logs (cherry picked from commit `e84fb458ce`) Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com> * Logstash Health Report Tests Buildkite pipeline setup. (#16416) (#16511) (cherry picked from commit `5195332bc6`) Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> * Make health report test runner script executable. (#16446) (#16512) (cherry picked from commit `2ebf2658ff`) Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> * Backport PR #16423 to 8.x: DLQ-ing events that trigger an conditional evaluation error. (#16493) * DLQ-ing events that trigger an conditional evaluation error. 
(#16423) When a conditional evaluation encounter an error in the expression the event that triggered the issue is sent to pipeline's DLQ, if enabled for the executing pipeline. This PR engage with the work done in #16322, the `ConditionalEvaluationListener` that is receives notifications about if-statements evaluation failure, is improved to also send the event to DLQ (if enabled in the pipeline) and not just logging it. (cherry picked from commit `b69d993d71`) * Fixed warning about non serializable field DeadLetterQueueWriter in serializable AbstractPipelineExt --------- Co-authored-by: Andrea Selva <selva.andre@gmail.com> * add deprecation log for `--event_api.tags.illegal` (#16507) (#16515) - move `--event_api.tags.illegal` from option to deprecated_option - add deprecation log when the flag is explicitly used relates: #16356 Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> (cherry picked from commit `a4eddb8a2a`) Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com> --------- Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com> Co-authored-by: João Duarte <jsvd@users.noreply.github.com> Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> Co-authored-by: Andrea Selva <selva.andre@gmail.com> Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co> Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com> Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com> --------- Co-authored-by: ev1yehor <146825775+ev1yehor@users.noreply.github.com> Co-authored-by: João Duarte <jsvd@users.noreply.github.com> Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> Co-authored-by: Andrea Selva 
<selva.andre@gmail.com> Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> Co-authored-by: kaisecheng <69120390+kaisecheng@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Luca Belluccini <luca.belluccini@elastic.co> Co-authored-by: Edmo Vamerlatti Costa <11836452+edmocosta@users.noreply.github.com> Co-authored-by: Dimitrios Liappis <dimitrios.liappis@gmail.com>	2024-10-09 09:48:12 -07:00
Karen Metts	eff9b540df	Doc: Reposition worker-utilization in doc (#16335)	2024-07-19 12:34:42 -04:00
Mashhur	948a0edf1a	Logstash monitoring doc improvements. (#16208) * Logstash monitoring doc improvements. --------- Co-authored-by: Rob Bavey <rob.bavey@elastic.co> Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>	2024-06-13 09:08:08 -07:00
Karen Metts	92b20bc184	Doc: Remove extra `+` to force page regen (#16000)	2024-03-12 11:27:07 -04:00
Ry Biesemeyer	38e8c5d3f9	flow_metrics: pull `worker_utilization` up to pipeline-level (#15912)	2024-02-06 11:50:34 -08:00
Mashhur	7063365739	Revert "[docs] Simplify and improve Logstash monitoring docs. (#15847)" (#15862) This reverts commit `9317052088`.	2024-01-26 14:18:41 -08:00
Mashhur	9317052088	[docs] Simplify and improve Logstash monitoring docs. (#15847)	2024-01-25 17:28:36 -08:00
Karen Metts	968fb24450	Doc: Add monitoring for serverless (#15636)	2024-01-11 15:42:38 -05:00
Karen Metts	906c2513c3	Doc: Improvements to monitoring with agent (#15619)	2023-11-27 14:18:06 -05:00
Karen Metts	c060c00d7c	Doc: Add Elastic Agent collection (#15528) Co-authored-by: Rob Bavey <rob.bavey@elastic.co>	2023-11-14 13:55:39 -05:00
Edmo Vamerlatti Costa	e76e582086	Add missing Elasticsearch SSL settings and replace deprecated options (xpack.monitoring and xpack.management) (#15045 ) This commit adds missing Elasticsearch SSL settings and replaces deprecated options being used on `xpack.monitoring.` and `xpack.management.` settings: Changes: - Updated deprecated monitoring and management Elasticsearch's SSL settings so no warnings are logged. - Added monitoring settings support for file-based certificates and for the cipher suites: `xpack.monitoring.elasticsearch.ssl.certificate`, `xpack.monitoring.elasticsearch.ssl.key`, and `xpack.monitoring.elasticsearch.ssl.cipher_suites`. - Added management settings support for file-based certificates and for the cipher suites: `xpack.management.elasticsearch.ssl.certificate`, `xpack.management.elasticsearch.ssl.key`, and `xpack.management.elasticsearch.ssl.cipher_suites`.	2023-05-15 11:54:38 +02:00
Ry Biesemeyer	519f3fb2e0	Plugin flow docs fixes (#14820 ) * docs: fix example block syntax types and truncations * docs: provide wrapping hints to flow metric tables * docs: refresh node stats api response examples include only `current` and `lifetime` metrics that are GA, and not technology preview metrics. * docs: use "m(onospace)" modifier for metric name columns * docs: swap literal column to first relies on `#guide table td:first-child .literal` having `white-space: nowrap`	2023-05-10 23:02:39 -07:00
Ry Biesemeyer	cb9316b486	document infinite flow metric rates (#14975)	2023-04-11 18:43:37 +01:00
DeDe Morton	58abffce33	[DOCS] Describe how to use Elastic Agent to monitor Logstash (#14959 ) * [DOCS] Describe how to use Elastic Agent to monitor Logstash * Apply suggestions from code review Co-authored-by: Kevin Lacabane <klacabane@gmail.com> * Remove reviewer questions * Apply suggestions from code review Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com> * Fix security statement --------- Co-authored-by: Kevin Lacabane <klacabane@gmail.com> Co-authored-by: Karen Metts <35154725+karenzone@users.noreply.github.com>	2023-03-23 11:22:56 -07:00
Mashhur	cfafce23c6	Plugin throughput, worker utilization and worker cost per event flow metrics. (#14743 ) * Initial effort to initialize plugin flow metrics. Followings are addressed: - Namespace store is shaped with RubySymbol key but filter and output codecs were using string key. This commit intends to standardize the namespace key with RubySymbol for filter & output codecs. - Initializes throughput flow metrics for the input plugins. - Initializes the worker cost per event and worker utilization for the filter and output plugins with only uptime metrics but it should combine with worker count, will be implemented in next commits. - Fetching codec ID generated in ruby scope is possible but problematic to in Java scope. We will skip codec flow metrics since they are rarely produce the hard times. * Worker utilization metrics implementation. - Worker count will be provided as a fraction to the flow metrics. At the time when we fetch the metric value, fraction is applied. * Unit tests added for fractured extended & simple metrics. * Code review change requests applied. - To simplify the scale (or fraction) at metric get value time, we can introduce the wrapper (`UpScaleMetric`) that applies the scale at metric value fetch time. - Unit test added for `UpScaleMetric` - We don't touch the codec namespace shape for now since we skipped codec metrics. - Unused sources removed. * Worker utilization and worker cost per event explanation added in the documentation. * Integration test added for plugin-level flow metrics. * Apply suggestions from code review - Integration test failure fix: input plugin ID is not always in context config. - Suggestions to simplify integration test source and rollback to intentional namings. - Metrics explanation improvement in the doc. 
Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * plugin flow: fix units; pass UptimeMetric and scale when needed Aligns the units of the newly-introduced plugin metrics with the specification, and passes our `UptimeMetric` through to the individual helper methods so that they can scale appropriately for their context and our type-checker can ensure we don't receive an incorrectly-scaled `Metric<Long>`. Input `throughput` ------------------ all throughput metrics should be expressed in events-per-second; this per-plugin scoped view of the pipeline's `input_throughput` flow should be expressed in the same units. Filters, Outputs `worker_utilization` ------------------------------------- > a worker_utilization (duration / (uptime * worker count)) shows what percent > of available resources an individual plugin instance is taking and can help > identify where the blocker is. To achieve this, we need to divide millis used by _millis_ available. Filters, Outputs `worker_cost_per_event` ---------------------------------------- > we also provide a (to be named) cost-per-event metric (duration / event) to > surface issues with a plugin that operates on a very small subset of events > (via conditionals) but contributes disproportionately to the cost of getting > its events through. We start with a baseline of seconds-per-event, and acknowledge that this may need to be scaled to a more understandable number before merging. * plugin flow: express cost per event in millis per event The "worker cost per event" metric when expressed as an inverse per-worker throughput in seconds-per-event produces a range of values that are not particularly easy to compare at-a-glance, with "nearly free" operations being expressed in negative-exponent scientific notation and extremely expensive operations being expressed with single-digits. 
By scaling this metric up by a factor of 1000 to "millis per event" or its equivalent "seconds per thousand events", the resulting numbers in practice are easier to make sense of:
+------------------------+--------------+---------------+------------+
| EXAMPLE / SCALE        | s/event      | ms/event      | µs/event   |
+------------------------+--------------+---------------+------------+
| no-op mutate @ 12k eps | 8.33e-05     | 0.0833        | 83.3       |
| stdout w/ dots codec   | 0.000831     | 0.831         | 831        |
| ES out 1s RTT/125      | 0.008        | 8             | 8000       |
| ES out 30s retries/125 | 0.24         | 240           | 240000     |
| ES filter 1s/event     | 1            | 1000          | 1000000    |
| grok 30s timeout       | 30           | 30000         | 30000000   |
+------------------------+--------------+---------------+------------+
* plugin flow: reshape docs Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>	2022-12-16 14:11:21 -08:00
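The two plugin-level formulas quoted in this commit, `duration / (uptime * worker count)` and `duration / event`, can be illustrated with a short Python sketch. The function names and the choice to return `None` for a zero denominator are assumptions made for illustration; they do not describe how Logstash itself reports those cases, and utilization is left as a plain fraction here.

```python
def worker_utilization(duration_millis, uptime_millis, worker_count):
    """Fraction of available worker time spent inside one plugin instance.

    Both duration and uptime are in milliseconds, so the result is unitless.
    """
    available_millis = uptime_millis * worker_count
    if available_millis == 0:
        return None  # assumption: undefined until some uptime has elapsed
    return duration_millis / available_millis


def worker_cost_per_event(duration_millis, event_count):
    """Milliseconds of worker time spent per event (millis per event)."""
    if event_count == 0:
        return None  # assumption: undefined until an event has been seen
    return duration_millis / event_count


if __name__ == "__main__":
    # "ES out 1s RTT/125" row: one second spent on a batch of 125 events.
    print(worker_cost_per_event(1000, 125))
    # "ES out 30s retries/125" row: thirty seconds spent on 125 events.
    print(worker_cost_per_event(30000, 125))
    # A plugin busy for 5s of a 10s uptime across 2 workers.
    print(worker_utilization(5000, 10000, 2))
```

The first two calls correspond to the 8 ms/event and 240 ms/event rows of the table in the commit message.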
Mashhur	f19e9cb647	Collect queue growth events and bytes metrics when PQ is enabled. (#14554) * Collect growth events and bytes metrics if PQ is enabled: Java changes. * Move queue flow under queue namespace. * Pipeline level PQ flow metrics: add unit & integration tests. * Include queue info in node stats sample. * Apply suggestions from code review Change uptime precision for PQ growth metrics to uptime seconds since PQ events are based on seconds. Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * Add safeguard when using lazy delegating gauge type. * flow metrics: simplify generics of lazy implementation Enables interface `FlowMetrics::create` to take suppliers that _implement_ a `Metric<? extends Number>` instead of requiring them to be pre-cast, and avoid unnecessary exposure of the metrics value-type into our lazy init. * flow metrics: use lazy init for PQ gauge-based metrics * noop: use enum equality Avoids routing two enum values through `MetricType#toString()` and `String#equals()` when they can be compared directly. * Apply suggestions from code review Optional.ofNullable used for safe return. Doc includes real tested expected metric values. Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * flow metrics: make lazy-init wrapper inherit from AbstractMetric this allows the Jackson serialization annotations to work * flow metrics: move pipeline queue-based flows into pipeline flow namespace * Follow up for moving PQ growth metrics under pipeline..flow. - Unit and integration tests are added or fixed. - Documentation added along with sample response data flow: pipeline pq flow rates docs * Do not expect flow in the queue section of API. Metrics moved to flow section. Update logstash-core/spec/logstash/api/commands/stats_spec.rb Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * Integration test failure fix. 
Mistake: `flow_status` should be `pipeline_flow_stats` Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * Integration test failures fix. Number should be Numeric in the ruby specs. Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * Make CI happy. * api specs: use PQ only where needed Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>	2022-10-13 15:30:31 -07:00
Ry Biesemeyer	46babd6041	Extended Flow Metrics (#14571) * flow metrics: extract to interface, sharable-common base, and implementation In preparation of landing an additional implementation of FlowMetric, we shuffle the current parts net-unchanged to provide interfaces for `FlowMetric` and `FlowCapture`, along with a sharable-common `BaseFlowMetric`, and move our initial implementation to a new `SimpleFlowMetric`, accessible only through a static factory method on our new `FlowMetric` interface. * flow-rates: refactor LIFETIME up to sharable base * util: add SetOnceReference * flow metrics: tolerate unavailable captures While the metrics we capture from in the initial release of FlowMetrics are all backed by `Metric<T extends Number>` whose values are non-null, we will need to capture from nullable `Gauge<Number>` in order to support persistent queue size and capacity metrics. This refactor uses the newly-introduced `SetOnceReference` to defer our baseline lifetime capture until one is available, and ensures `BaseFlowMetric#doCapture` creates a capture if-and-only-if non-null values are available from the provided metrics. * flow rates: limit precision for readability * flow metrics: introduce policy-driven extended windows implementation The new ExtendedFlowMetric is an alternate implementation of the FlowMetric introduced in Logstash 8.5.0 that is capable of producing windows for a set of policies, which dictate the desired retention for the rate along with a desired resolution. 
- `current`: 10s retention, 1s resolution []
- `last_1_minute`: one minute retention, at 3s resolution []
- `last_5_minutes`: five minutes retention, at 15s resolution
- `last_15_minutes`: fifteen minutes retention, at 30s resolution
- `last_1_hour`: one hour retention, at 60s resolution
- `last_24_hours`: one day retention at 15 minute resolution
A given series may report a range for slightly longer than its configured retention period, up to either the series' configured resolution or our capture rate (currently ~5s), whichever is greater. This approach allows us to retain sufficient data-points to present meaningful rolling averages while ensuring that our memory footprint is bounded. When recording these captures, we first stage the newest capture, and then promote the previously-staged capture to the tail of a linked list IFF the gap between our new capture and the newest promoted capture is larger than our desired resolution. When _reading_ these rates, we compact the head of that linked list forward in time as far as possible without crossing the desired retention barrier, at which point the head points to the youngest record that is old enough to satisfy the period for the series. We also occasionally compact the head during writes, but only if the head is significantly out-of-date relative to the allowed retention. As implemented here, these extended flow rates are on by default, but can be disabled by setting the JVM system property `-Dlogstash.flowMetric=simple` * flow metrics: provide lazy-initialized implementation * flow metrics: append lifetime baseline if available during init * flow metric tests: continuously monitor combined capture count * collection of unrelated minor code-review fixes * collection of even more unrelated minor code-review fixes	2022-10-06 18:35:33 -07:00
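The stage-and-promote write path and the compact-on-read path described in this commit message might look roughly like the Python sketch below. It is an illustration under stated assumptions, not a port of the Java `ExtendedFlowMetric`: the class name `RetentionWindow`, the `(time, numerator, denominator)` tuple shape, and promoting the first staged capture when nothing has been promoted yet are all hypothetical choices, and the occasional compaction during writes is left out.

```python
from collections import deque


class RetentionWindow:
    """One policy window, e.g. retention=60s at resolution=3s (hypothetical)."""

    def __init__(self, retention, resolution):
        self.retention = retention
        self.resolution = resolution
        self.promoted = deque()  # oldest capture on the left, newest on the right
        self.staged = None       # newest capture, not yet promoted

    def record(self, capture):
        """Stage `capture`; promote the previously staged one if the gap allows."""
        previous, self.staged = self.staged, capture
        if previous is None:
            return
        # Assumption: with nothing promoted yet, the first staged capture
        # is promoted unconditionally so the window has a starting point.
        if not self.promoted or capture[0] - self.promoted[-1][0] > self.resolution:
            self.promoted.append(previous)

    def baseline(self, now):
        """Compact the head forward without crossing the retention barrier."""
        barrier = now - self.retention
        while len(self.promoted) > 1 and self.promoted[1][0] <= barrier:
            self.promoted.popleft()
        if self.promoted and self.promoted[0][0] <= barrier:
            return self.promoted[0]  # youngest record that is old enough
        return None                  # not enough history for this window yet

    def rate(self, current):
        """Change in numerator per change in denominator since the baseline."""
        base = self.baseline(current[0])
        if base is None or current[2] == base[2]:
            return None
        return (current[1] - base[1]) / (current[2] - base[2])


if __name__ == "__main__":
    window = RetentionWindow(retention=60, resolution=3)
    for t in range(0, 70, 5):              # one capture every 5 seconds
        window.record((t, 100 * t, t))     # 100 events per second of uptime
    print(window.rate((70, 7000, 70)))
```

Because reads drop every promoted capture older than the youngest one that still satisfies the retention period, the deque stays bounded by roughly retention divided by resolution entries.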
Ry Biesemeyer	6e0b365c92	Feature: flow metrics integration (#14518 ) * Flow metrics: initial implementation (#14509) * metrics: eliminate race condition when registering metrics Ensure our fast-lookup and store tables cannot diverge in a race condition by wrapping mutation of both in a single mutex and appropriately handle another thread winning the race to the lock by using the value that it persisted instead of writing our own. * metrics: guard against intermediate namespace conflicts - ensures our safeguard that prevents using an existing metric as a namespace is applied to _intermediate_ nodes, not just the tail-node, eliminating a potential crash when sending `fetch_or_store` to a metric object that is not expected to respond to `fetch_or_store`. - uses the atomic `Concurrent::Map#compute_if_absent` instead of the non-atomic `Concurrent::Map#fetch_or_store`, which is prone to last-write-wins during contention (as-written, this method is only executed under lock and not subject to contention) - uses `Enumerable#reduce` to eliminate the need for recursion * flow: introduce auto-advancing UptimeMetric * flow: introduce FlowMetric with minimal current/lifetime rates * flow: initialize pipeline metrics at pipeline start * Controller and service layer implementation for flow metrics. (#14514) * Controller and service layer implementation for flow metrics. * Add flow metrics to unit test and benchmark cli definitions. * flow: fix tests for metric types to accommodate new one * Renaming concurrency and backpressure metrics. Rename `concurrency` to `worker_concurrency` and `backpressure` to `queue_backpressure` to provide proper scope naming. Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * metric: register flow metrics only when we have a collector (#14529) the collector is absent when the pipeline is run in test with a NullMetricExt, or when the pipeline is explicitly configured to not collect metrics using `metric.collect: false`. 
* Unit tests and integration tests added for flow metrics. (#14527) * Unit tests and integration tests added for flow metrics. * Node stat spec and pipeline spec metric updates. * Metric keys statically imported, implicit error expectation added in metric spec. * Fix node status API spec after renaming flow metrics. * Removing flow metric from PipelinesInfo DS (used in periodic metric snapshot), integration QA updates. * metric: register flow metrics only when we have a collector (#14529) the collector is absent when the pipeline is run in test with a NullMetricExt, or when the pipeline is explicitly configured to not collect metrics using `metric.collect: false`. * Unit tests and integration tests added for flow metrics. * Node stat spec and pipeline spec metric updates. * Metric keys statically imported, implicit error expectation added in metric spec. * Fix node status API spec after renaming flow metrics. * Removing flow metric from PipelinesInfo DS (used in periodic metric snapshot), integration QA updates. * Rebasing with feature branch. * metric: register flow metrics only when we have a collector the collector is absent when the pipeline is run in test with a NullMetricExt, or when the pipeline is explicitly configured to not collect metrics using `metric.collect: false`. * Apply suggestions from code review Integration tests updated to test capturing the flow metrics. * Flow metrics expectation updated in integration tests. 
* flow: refine integration expectations for reloads/monitoring Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co> Co-authored-by: Mashhur <mashhur.sattorov@gmail.com> * metric: add ScaledView with sub-unit precision to UptimeMetric (#14525) * metric: add ScaledView with sub-unit precision to UptimeMetric By presenting a _view_ of our metric that maintains sub-unit precision, we prevent jitter that can be caused by our periodic poller not running at exactly our configured cadence. This is especially important as the UptimeMetric is used as the _denominator_ of several flow metrics, and a capture at 4.999s that truncates to 4s, causes the rate to be over-reported by ~25%. The `UptimeMetric.ScaledView` implements `Metric<Number>`, so its full lossless `BigDecimal` value is accessible to our `FlowMetric` at query time. * metrics: reduce window for too-frequent-captures bug and document it * fixup: provide mocked clock to flow metric * Flow metrics cleanup (#14535) * flow metrics: code-style and readability pass * remove unused imports * cleanup: simplify usage of internal helpers * flow: migrate internals to use OptionalDouble * Flow metrics global (#14539) * flow: add global top-level flows * docs: add flow metrics * Top level flow metrics unit tests added. (#14540) * Top level flow metrics unit tests added. * Add unit tests when config reloads, make sure top-level flow metrics didn't get reset. * Apply suggestions from code review Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * Validating against Hash test cases updated. * For the safety check against exact type in unit tests. Co-authored-by: Ry Biesemeyer <yaauie@users.noreply.github.com> * docs: section links and clarity in node stats API flow metrics Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com> Co-authored-by: Mashhur <mashhur.sattorov@gmail.com>	2022-09-19 14:21:45 -07:00
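The ~25% figure quoted for the `UptimeMetric.ScaledView` change in the commit above can be checked with a little arithmetic. The sketch below is illustrative only: `over_report_percent` and the event count are hypothetical, and Python's `Decimal` merely stands in for the lossless `BigDecimal` value the commit mentions.

```python
from decimal import Decimal


def over_report_percent(precise_uptime: Decimal) -> Decimal:
    """Percent by which a rate is inflated when its uptime denominator
    is truncated to whole units instead of keeping sub-unit precision."""
    truncated_uptime = Decimal(int(precise_uptime))  # 4.999 becomes 4
    events = Decimal(1000)  # hypothetical numerator; it cancels out below
    true_rate = events / precise_uptime
    truncated_rate = events / truncated_uptime
    return (truncated_rate / true_rate - 1) * 100


# A capture taken at 4.999s whose uptime truncates to 4s:
error = over_report_percent(Decimal("4.999"))  # roughly 25 percent
```

The inflation depends only on the ratio 4.999 / 4, which is why keeping the denominator at full precision removes the jitter regardless of how many events were counted.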
Cleydyr Bezerra de Albuquerque	1ddd4ccd83	Fix broken link to image (#14343 )	2022-07-12 14:53:07 +01:00
Carlos Crespo	168732ff88	[doc] Removes 'beta' from pipeline viewer doc (#14082 )	2022-06-30 11:09:27 +09:00
Ry Biesemeyer	7757908c34	Add `ca_trusted_fingerprint` to core features (monitoring/central-management) (#14155 ) * add `ca_trusted_fingerprint` to core features (monitoring/central-management) * Rely on released ES output * fix: ensure commented-out examples in logstash.yml are functionally correct * add admonition for how to get a trusted CA's fingerprint	2022-06-28 17:07:59 -07:00
Mashhur	15dd1babf0	Simplifying HTTP basic password policy. (#14105 ) * Simplifying HTTP basic password policy.	2022-05-23 21:11:10 -07:00
Karen Metts	8edce82170	Doc: Clarify monitoring settings (#13871 ) Co-authored by: Dan Roscigno <dan@roscigno.com> Moves content from #10940 into updated file/file structure	2022-03-08 16:48:16 -05:00
Ry Biesemeyer	e9e7838d88	Docs remove homebrew (#13747 ) * docs: avoid promoting homebrew installation As of Stack 8.0, Elastic is no longer maintaining a separate homebrew cask with Elastic-licensed artifacts for the MacOS package manager homebrew, due to low usage, difficulty in maintaining multiple versions, lack of team expertise, and an existing Apache-licensed formula. As such, we are removing instructions about setting up and running Logstash from these formulae that are no longer available. * docs: fix typo in installation instructions	2022-02-09 11:40:55 -08:00
Ry Biesemeyer	15930ccd3e	Secure API (#13308 ) * settings: add "deprecated alias" support A deprecated alias provides a path for renaming a setting. - When a deprecated alias is set on its own, a deprecation notice is emitted but fetching the canonical setting value will reflect the value set with the deprecated alias. - When both the canonical setting (new name) and the deprecated alias (old name) are specified, it is an error condition. - When the value of the deprecated alias is queried, a warning is emitted to the logger and only the value explicitly set to the deprecated alias is returned. Additionally, some relevant cleanup is also included: - Starting Logstash with invalid settings no longer results in the obtuse "An unexpected error occurred" with backtrace and exception data obscuring the issue. Instead, a simple message is emitted indicating that the settings are invalid along with the originating exception's message. - The various settings implementations share a common logger, instead of each implementation class providing its own. This is aimed to reduce noise from the logs and to ensure specs validating logging do not need to tie so closely to implementation details. * settings: add password-wrapped setting * settings: make any setting type capable of being nullable * settings: add `Settings#names` to power programmatic iteration * cli: route CLI-flag deprecations in to deprecation logger * settings: group API-related settings under `api.` retains deprecated aliases, and is fully backward-compatible. webserver: cleanup orphaned attr accessors for never-set ivars * api: pull settings extraction down from agent This net-no-change refactor introduces a new method `WebServer#from_settings` that bridges the gap between Logstash settings and Puma-related options, so that future additions to the API settings don't add complexity to the Agent. It also has the benefit of initializing the API Rack App just ONCE, instead of once per attempted HTTP port. 
* api: add optional TLS/SSL * docs: reference API security settings * api: when configured securely, bind to all available interfaces by default * cleanup: remove unused cert artifacts * tests: generate fresh webserver certificates * certs: actually add the binary keystores 🤦	2021-10-19 14:13:20 -07:00
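The three "deprecated alias" rules listed in the commit above can be modelled in a few lines. This is a hedged sketch in Python rather than Logstash's Ruby settings code: `SettingsSketch`, `InvalidSettingsError`, and the setting names `old.name` and `new.name` are all hypothetical, and a plain list stands in for the deprecation logger.

```python
class InvalidSettingsError(ValueError):
    """Hypothetical error for conflicting settings."""


class SettingsSketch:
    def __init__(self, aliases: dict) -> None:
        self.canonical_of = dict(aliases)  # deprecated name -> canonical name
        self.explicit = {}                 # values exactly as the user set them
        self.notices = []                  # stand-in for the deprecation logger

    def set(self, name: str, value) -> None:
        # Rule 2: setting both the old and the new name is an error.
        if name in self.canonical_of:
            counterparts = [self.canonical_of[name]]
        else:
            counterparts = [old for old, new in self.canonical_of.items() if new == name]
        for other in counterparts:
            if other in self.explicit:
                raise InvalidSettingsError(f"both `{name}` and `{other}` are set")
        # Rule 1: setting the alias alone emits a deprecation notice.
        if name in self.canonical_of:
            self.notices.append(
                f"deprecation: `{name}` is deprecated, use `{self.canonical_of[name]}`")
        self.explicit[name] = value

    def get(self, name: str):
        # Rule 3: querying the alias warns and returns only what was
        # explicitly set under the deprecated name.
        if name in self.canonical_of:
            self.notices.append(f"warning: `{name}` was queried by its deprecated name")
            return self.explicit.get(name)
        if name in self.explicit:
            return self.explicit[name]
        # Rule 1, continued: the canonical value reflects the alias value.
        for old, new in self.canonical_of.items():
            if new == name and old in self.explicit:
                return self.explicit[old]
        return None


settings = SettingsSketch({"old.name": "new.name"})
settings.set("old.name", 9600)
canonical_value = settings.get("new.name")  # reflects the alias value
```

Checking for the conflict at `set` time keeps the rule independent of the order in which the two names appear; whether the real implementation validates at set time or in a later pass is not stated in the commit message.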
Mat Schaffer	8073b0c35e	Add beta tag to pipeline viewer docs (#13167 )	2021-09-06 10:31:31 +09:00
Karen Metts	a31a7a4736	Doc: Add geoip database API to node stats (#13019 )	2021-06-24 08:37:56 -04:00
Karen Metts	f481386039	Doc: Remove unused tagged regions (#12976 )	2021-06-09 19:51:07 -04:00
Karen Metts	001cefcf86	Doc:Replace outdated pipeline viewer screenshot	2020-07-09 07:31:32 -07:00
DeDe Morton	f40d1faf73	[DOCS] Change links to refactored Beats getting started docs	2020-07-08 10:28:18 -07:00
Karen Metts	587ff6921f	Doc:Add deprecation notice to legacy collection Resolves: #11979	2020-06-26 15:52:40 -07:00
Karen Metts	a839868b18	Doc:Rename internal collection to legacy collection Fixes #11858	2020-05-05 17:42:24 +00:00
Karen Metts	6126e29043	[Doc]Remove new internal collection (#11823 ) * [Doc]Remove new internal collection	2020-04-22 16:52:10 -04:00
Karen Metts	832310690d	[Doc]Doc updates for internal collectors (#11789 ) * Doc updates for internal collectors * Incorporate review comments * More review comments	2020-04-16 17:06:28 -04:00
Karen Metts	081ec78168	[Doc]Restructure monitoring docs to support new and legacy internal collectors (#11714 ) * [Doc] added description of xpack.monitoring.collection.write_direct.enabled setting * Added page to mark as deprecated the legacy internal collector and fixed all the `xpack.monitoring.` references Included legacy collector file into monitoring overview * Restructure monitoring docs * Incorporate review comments Co-authored-by: andsel <selva.andre@gmail.com>	2020-04-14 15:47:56 -04:00
Karen Metts	c47b232ee0	Remove deprecation notices Fixes #11624	2020-02-26 20:00:38 +00:00
lcawl	a42db55bbd	Fixes out-dated monitoring links Fixes #11629	2020-02-26 19:32:41 +00:00
Karen Metts	3c8b803fdb	Fix setting name for monitoring Fixes #11597	2020-02-12 20:05:00 +00:00
Karen Metts	17aeaccf3a	Add deprecation notice to internal collectors for monitoring Fixes #11526	2020-01-29 22:35:38 +00:00
andsel	3eb36bfa5e	Added section for monitoring.cluster_uuid Fixes #11538	2020-01-27 08:14:15 +00:00
Karen Metts	19605c8f1d	Remove ref to encrypted communications Fixes #11398	2019-12-06 14:49:11 +00:00
lcawl	e9ee1fd67c	Fixes monitoring link Fixes #11341	2019-11-26 17:24:38 +00:00
andsel	aad25d9bbc	Drop _xpack namespace for ES security and license endpoints Fixes #11297	2019-11-12 16:49:45 +00:00
lcawl	63c60622ff	Fixes links to Stack Overview Fixes #11239	2019-10-18 18:19:49 +00:00
Karen Metts	526d1aaf76	Add remaining review comments from #11033 Fixes #11197	2019-10-09 19:50:51 +00:00
Karen Metts	bcaf4788d5	Add metricbeat as monitoring option (#11033 ) Restructure content Restructure source files Incorporate review comments Incorporate more review comments and fix links	2019-10-03 18:39:20 -04:00
Lisa Cawley	533d5c169d	[DOCS] Fixes links to monitoring content (#11166 )	2019-09-30 08:58:11 -07:00


56 commits