835 commits

Author SHA1 Message Date
Joao Duarte
88eb425b00 fix logic of logging configs
Fixes #6789
2017-03-02 15:01:09 -05:00
Tal Levy
2e3b06b812 migrate core-queue-jruby into logstash-core (#6782) 2017-03-02 10:12:03 -08:00
Tal Levy
6fb8096a54 migrate logstash-core-event-java to logstash-core (#6760) 2017-03-01 15:31:17 -08:00
Colin Surprenant
568666c777 add queue drain option support
wip queue drain option

metrics on empty batches

reenabled spec

cosmetic fixes

stats collection, mutex, specs, empty batch handling

start_metrics
2017-03-01 14:12:27 -05:00
Joao Duarte
4981cf878c introduce locking in path.data
Fixes #6738
2017-02-24 05:27:09 -05:00
emile
efc0a0c598 add deep environment variables replacement in configuration 2017-02-21 14:13:03 -08:00
Colin Surprenant
ef93294028 use queue path in memory acked queue to namespace .lock file 2017-02-20 12:12:03 -05:00
Andrew Cholakian
2239068512 Setting --path.data on CLI should also change path.queue
This change was harder than it first appeared! Due to the complicated
interactions between our Setting class and our monkey-patched Clamp
classes this required adding some new hooks into various places to
properly intercept the settings at the right point and set this
dynamically.

Crucially, this only changes path.queue when the user has *not*
overridden it explicitly in the settings.yml file.
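
A minimal sketch of the override rule described above, not the actual Setting or Clamp code. The `SketchSettings` class, its methods, and the `<path.data>/queue` default are assumptions made for illustration.

```ruby
# Hypothetical stand-in for a settings registry; not the real LogStash::Setting API.
class SketchSettings
  def initialize
    @values = { "path.data" => "/var/lib/logstash" }
    @explicit = {} # names the user set explicitly (e.g. in settings.yml)
  end

  # Record a value the user supplied explicitly for `name`.
  def set(name, value)
    @values[name] = value
    @explicit[name] = true
  end

  def get(name)
    # Assumption: path.queue follows path.data unless the user overrode it.
    if name == "path.queue" && !@explicit["path.queue"]
      return File.join(@values["path.data"], "queue")
    end
    @values[name]
  end
end

# Only path.data is changed, so path.queue follows it.
followed = SketchSettings.new
followed.set("path.data", "/tmp/ls")

# path.queue was set explicitly, so a later path.data change leaves it alone.
pinned = SketchSettings.new
pinned.set("path.queue", "/mnt/queue")
pinned.set("path.data", "/tmp/ls")
```

Here `followed.get("path.queue")` resolves under the new data path, while `pinned.get("path.queue")` keeps the explicit value.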

Fixes #6378 and #6387

Fixes #6731
2017-02-17 16:43:07 -05:00
Suyog Rao
09eaae3289 Don't show config content when reload is not triggered
Fixes #6720
2017-02-16 16:43:57 -05:00
Pier-Hugues Pellerin
9e6478519b Collecting the ThreadCount and the PeakThreadCount should not trigger a threaddump
When we use the JRMonitor library to get information about the running
threads, it triggers a thread dump to get the stacktrace information. This is OK when
we do a direct call to the `hot_threads` API, but in the context of the
periodic poller it means that the threads need to be stopped to
generate their current stacktrace, which could significantly slow down Logstash.

This PR uses the **ThreadMXBean** but only calls `#getThreadCount` and `#getPeakThreadCount`. These two calls won't generate a thread dump and won't block the current threads.

**To test**, add the following options to your `config/jvm.options`, let Logstash run for a few minutes to trigger a few periodic poller iterations, then stop Logstash; you will see the report.

```
-XX:+PrintSafepointStatistics
-XX:PrintSafepointStatisticsCount=1
```

Fixes: #6603

Fixes #6705
2017-02-14 13:13:59 -05:00
Colin Surprenant
31693ddd22 cleanup partially initialized pipeline
DRY the plugin registration

typo

log exception message
2017-02-14 11:02:18 -05:00
Colin Surprenant
c6710cdbae support exclusive locking of PQ dir access
fix agent and pipeline and specs for queue exclusive access

added comments and swapped all sleep 0.01 to 0.1

revert explicit pipeline close in specs using sample helper

fix multiple pipelines specs

use BasePipeline for config validation which does not instantiate a new queue

review modifications

improve queue exception message
2017-02-10 17:36:03 -05:00
Suyog Rao
7a302d1064 Better asciidoc formatting and text for ID option
Fixes #6679
2017-02-09 14:39:20 -05:00
TAC
0f2ec00076 Fix document format about plugin id
Fixes #6636
2017-02-08 14:39:19 -05:00
Jordan Sissel
2d3da38743 Do not show the configuration content anymore. Plugin validation errors and other config problems are more specifically logged before we get to this point, and showing the full config can too easily obscure a more actionable 'you used an invalid setting' kind of error.
Fixes #6654
2017-02-07 19:27:10 -05:00
Joao Duarte
6a5290e139 add setting class that coerces value to array
Fixes #6630
2017-02-07 06:42:56 -05:00
Joao Duarte
c10f55054b make SafeURI class clone deeper
Fixes #6645
2017-02-06 17:29:26 -05:00
Colin Surprenant
e503fcf60b refactor agent pipeline reloading to avoid double live pipelines with same settings
extracted BasePipeline class to support complete config validation

minor review changes

added comment
2017-02-03 18:15:34 -05:00
Joao Duarte
0e81924fe9 fix shutdown watcher reports
since 5.x introduced log4j2 as the main logging mechanism, it's
necessary to be more explicit when logging complex objects.

In this case we tell the logger to use the .to_s version of the Snapshot
report generated by the Watcher.
The Snapshot#to_s calls .to_simple_hash.to_s
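
A minimal sketch of the `to_s` delegation described above. `SketchSnapshot` and its fields are hypothetical; only the `to_s` calling `to_simple_hash.to_s` relationship comes from the commit message.

```ruby
# Hypothetical report object; the field names are illustrative only.
class SketchSnapshot
  def initialize(inflight_count, stalling_threads)
    @inflight_count = inflight_count
    @stalling_threads = stalling_threads
  end

  # Reduce the report to plain strings, numbers, and arrays.
  def to_simple_hash
    { "inflight_count" => @inflight_count, "stalling_threads" => @stalling_threads }
  end

  # Give loggers an explicit, readable string form.
  def to_s
    to_simple_hash.to_s
  end
end

report = SketchSnapshot.new(3, ["worker0"])
# String interpolation calls #to_s, so the logger receives plain text
# rather than the complex object itself.
message = "shutdown report: #{report}"
```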

Fixes #6628
2017-02-02 10:28:47 -05:00
Joao Duarte
23584a0799 propagate pipeline.id to api resources
this removes explicit references to the "main" pipeline,
using instead the value of the `pipeline.id` from LogStash::SETTINGS

Fixes #6606
2017-01-30 12:46:19 -05:00
Joao Duarte
a2158e5608 ensure pipeline.id is correctly propagated
Fixes #6530
2017-01-26 11:15:36 -05:00
Joao Duarte
903cdd6331 include pipeline id in queue path
create queue sub directory based on pipeline id

Fixes #6540
2017-01-26 10:27:04 -05:00
Pier-Hugues Pellerin
2957ca3d46 Refactor non_reloadable_plugin
Instead of using a list of non reloadable plugin we add a new class
method on the base plugin class that the plugin will override.

By default we assume that all plugins are reloadable, only the stdin
shouldn't be reloadable.
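
A minimal sketch of the class-method approach, assuming the method is called `reloadable?`; that name and every class below are hypothetical and are not taken from the commit.

```ruby
# Hypothetical base class: every plugin is reloadable unless it says otherwise.
class SketchBasePlugin
  def self.reloadable?
    true
  end
end

# A stdin-like plugin opts out by overriding the class method.
class SketchStdin < SketchBasePlugin
  def self.reloadable?
    false
  end
end

# Other plugins inherit the default and need no extra code.
class SketchGenerator < SketchBasePlugin
end

# A pipeline can be reloaded only if all of its plugin classes allow it.
def all_reloadable?(plugin_classes)
  plugin_classes.all?(&:reloadable?)
end
```

With this shape the central list disappears: `all_reloadable?([SketchGenerator])` is true, and adding `SketchStdin` to the array makes it false.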

Fixes #6499
2017-01-25 08:55:46 -05:00
Joao Duarte
c603d9e90b fix return value of start_pipeline when start_workers fails
during Agent#start_pipeline a new thread is launched that executes
a pipeline.run and a rescue block which increments the failed reload counter

After launching the thread, the parent thread will wait for the pipeline
to start, or detect that the pipeline aborted, or sleep and check again.

There is a bug that, if the pipeline.run aborts during start_workers,
the pipeline is still marked as `ready`, and the thread will continue
running for a very short period of time, incrementing the failed reload
metric.

During this period of `pipeline.ready? == true` and `thread.alive? == true`,
the parent check code will observe all the necessary conditions to
consider the pipeline.run to be successful and thus increment the success
counter too. This failed reload can then result in both the success and
failure reload count being incremented.

This commit changes the parent thread check to use `pipeline.running?`
instead of `pipeline.ready?` which is the next logical state transition,
and ensures it is only true if `start_workers` runs successfully.
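
A single-threaded sketch of the state ordering described above, assuming `ready` is set before `start_workers` and `running` only after it. `SketchPipeline` and `attempt_start` are hypothetical and leave out the real threading.

```ruby
# Hypothetical pipeline that tracks only the two states discussed here.
class SketchPipeline
  def initialize(fail_in_start_workers)
    @fail_in_start_workers = fail_in_start_workers
    @ready = false
    @running = false
  end

  def ready?
    @ready
  end

  def running?
    @running
  end

  def run
    @ready = true   # assumption: marked ready before the workers start
    start_workers
    @running = true # only reached when start_workers did not raise
  end

  def start_workers
    raise "worker startup failed" if @fail_in_start_workers
  end
end

# Stands in for the thread body plus its rescue block.
def attempt_start(pipeline)
  pipeline.run
  :started
rescue RuntimeError
  :failed
end

broken = SketchPipeline.new(true)
outcome = attempt_start(broken)
```

After the failed start, `broken.ready?` is still true while `broken.running?` is false, so a parent check based on `running?` does not count the attempt as a success.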

Fixes #6566
2017-01-20 06:47:13 -05:00
Joao Duarte
c2a74b3d4f improve user facing error when yaml is incorrect
Fixes #6546
2017-01-19 12:14:20 -05:00
Pier-Hugues Pellerin
c32da068be review comments
Fixes #6498
2017-01-16 12:28:37 -05:00
Pier-Hugues Pellerin
84063ed74d Extract the creation of the queue into a factory
I've moved the initialization code into a factory for future work on the
pipeline.

Fixes #6498
2017-01-16 12:28:36 -05:00
Suyog Rao
df2ff69a0e Renamed cgroups metric to number_of_elapsed_periods
Renamed number_of_periods to number_of_elapsed_periods to be consistent with ES.

Fixes #6536
2017-01-16 12:20:16 -05:00
Tal Levy
641b855127 remove current_size_in_bytes and acked info from node stats
re #6508.

- removed `acked_count`, `unacked_count`, and migrated `unread_count` to
top-level `events` field.
- removed `current_size_in_bytes` info from queue node stats

Fixes #6510
2017-01-09 19:19:28 -05:00
Mykola Shestopal
36328f5dac Fixed calculation of took_in_millis for #6476
Fixes #6481
2017-01-04 11:43:55 -05:00
Colin Surprenant
eb00b0da4c avoid resetting inexisting tags field back to empty array plus specs
Fixes #6477
2017-01-03 22:31:31 -05:00
Pier-Hugues Pellerin
9fe6c0cf43 Record the execution time for each output in the pipeline
Record the wall clock time for each output. A new `duration_in_millis`
key will now be available for each output in the API located at http://localhost:9600/_node/stats

This commit also changes some expectations in the output_delegator_spec
that were not working as intended with the `have_received` matcher.

Fixes #6458
2016-12-29 12:51:41 -05:00
Pier-Hugues Pellerin
55731eb936 Initialize the metric values in the batch to the correct type
When we were initializing the `duration_in_millis` in the batch we
were using a `Gauge` instead of a counter. Since all the objects have the
same signature, when we were actually recording the time the value was
replaced instead of incremented.
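
A minimal sketch of why the wrong metric type goes unnoticed when both types share a signature. `SketchCounter`, `SketchGauge`, and the `execute` method name are hypothetical stand-ins, not the actual metric classes.

```ruby
# Hypothetical counter: each call adds to the running total.
class SketchCounter
  attr_reader :value

  def initialize
    @value = 0
  end

  def execute(amount)
    @value += amount
  end
end

# Hypothetical gauge: same signature, but each call overwrites the value.
class SketchGauge
  attr_reader :value

  def initialize
    @value = 0
  end

  def execute(amount)
    @value = amount
  end
end

# The caller cannot tell the two apart, so the wrong type raises no error.
def record_durations(metric, durations)
  durations.each { |duration| metric.execute(duration) }
  metric.value
end

counter_total = record_durations(SketchCounter.new, [5, 7, 3]) # 15
gauge_total = record_durations(SketchGauge.new, [5, 7, 3])     # 3
```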

Fixes #6465
2016-12-29 11:12:32 -05:00
Pier-Hugues Pellerin
52b6f963e1 Code cleanup for the collector observer
We have moved the responsibility of watching the collector inside the
input itself. This feature might come back when we have a new execution
model that can be improved in watching metrics, but this would require
more granular watchers.

No tests were affected by these changes since the code that required that
feature was already removed.

Fixes: #6447

Fixes #6456
2016-12-29 08:59:52 -05:00
Pier-Hugues Pellerin
0fe94e779d Do not log a warning if a plugin is not from in #print_notice_version
When a plugin is loaded using the `plugins.path` option or comes from a
universal plugin, no gemspec can be found for the specific plugin.

We should not print any warning in that case.

Fixes: #6444

Fixes #6448
2016-12-21 09:44:36 -05:00
Pier-Hugues Pellerin
13599ca64a Initialize with default values global events and pipeline events related metric
The metric store has no concept of whether a metric needs to exist, so as a rule
of thumb we need to define them with 0 values and send them to the
store when we initialize something.
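
A minimal sketch of that rule of thumb, assuming a store that only knows the keys it has been sent. `SketchMetricStore` and the `in`, `filtered`, and `out` key names are illustrative assumptions.

```ruby
# Hypothetical store: a key exists only once something has been sent for it.
class SketchMetricStore
  def initialize
    @data = {}
  end

  def increment(key, by = 1)
    @data[key] = @data.fetch(key, 0) + by
  end

  def snapshot
    @data.dup
  end
end

# Without initialization, a reader sees no event metrics at all.
empty_store = SketchMetricStore.new

# Sending zero increments up front makes every key visible from the start.
seeded_store = SketchMetricStore.new
%w[in filtered out].each { |key| seeded_store.increment(key, 0) }
```

`empty_store.snapshot` has no keys, while `seeded_store.snapshot` reports all three counters at 0 before any event is processed.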

This PR makes sure the batch object is recording the right default values

Fixes: #6449

Fixes #6450
2016-12-21 09:43:29 -05:00
Joao Duarte
08cebd5c16 ensure metric collection is disabled when metric.collect is false
Fixes #6445
2016-12-21 07:20:23 -05:00
Joao Duarte
128c660e0c add tests for webserver metric
Fixes #6385
2016-12-19 12:43:24 -05:00
Joao Duarte
7051e86dcf add webserver metric
Fixes #6385
2016-12-19 12:43:24 -05:00
Pier-Hugues Pellerin
fa18df3ec6 rename cgroup usage to usage_nanos to align with ES' api
Fixes #6428
2016-12-16 10:19:59 -05:00
Tal Levy
2b45a9b4ae add queue stats to node/stats api (#6331)
* add queue stats to node/stats api

example queue section:

```
"queue" : {
    "type" : "persisted",
    "capacity" : {
      "page_capacity_in_bytes" : 262144000,
      "max_queue_size_in_bytes" : 1073741824,
      "max_unread_events" : 0
    },
    "data" : {
      "free_space_in_bytes" : 33851523072,
      "current_size_in_bytes" : 262144000,
      "storage_type" : "hfs",
      "path" : "/logstash/data/queue"
    },
    "events" : {
      "acked_count" : 0,
      "unread_count" : 0,
      "unacked_count" : 0
    }
}
```

Closes #6182.

* migrate to use period metric pollers for storing queue stats per pipeline
2016-12-15 13:14:47 -08:00
Pier-Hugues Pellerin
9c8bf203e2 Add cgroup information to the api
When Logstash is run under a Linux container we will gather statistics about the cgroup and the
CPU usage. This information will show in the /_node/stats API and the result will look like this:

```
  "os" : {
    "cgroup" : {
      "cpuacct" : {
        "usage" : 789470280230,
        "control_group" : "/user.slice/user-1000.slice"
      },
      "cpu" : {
        "cfs_quota_micros" : -1,
        "control_group" : "/user.slice/user-1000.slice",
        "stat" : {
          "number_of_times_throttled" : 0,
          "time_throttled_nanos" : 0,
          "number_of_periods" : 0
        },
        "cfs_period_micros" : 100000
      }
    }
  }
```

Fixes: #6252

Fixes #6357
2016-12-15 15:46:38 -05:00
Jordan Sissel
8dfefad58a Remove unnecessary log4j-1.2-api dependency.
This library provides a "log4j 1.2"-like API from the log4j2 library.

We don't seem to use this, and including it seems to be the cause of the
Logstash log4j input rejecting log4j 1.x's SocketAppender with this
message:

    org.apache.log4j.spi.LoggingEvent; class invalid for deserialization

The origin of this error is that log4j2's log4j-1.2-api defines
LoggingEvent without `implements Serializable`.

This commit also includes regenerated gemspec_jars.rb and
logstash-core_jars.rb.

Reference: https://github.com/logstash-plugins/logstash-input-log4j/issues/36

Fixes #6309
2016-12-14 02:19:57 -05:00
Suyog Rao
134795360c Expose stats the right way
Fixes #6367
2016-12-13 00:59:24 -05:00
Pier-Hugues Pellerin
48f3624302 Do not include Utils, this could cause some bad references on the LogStash::Environment in the context of stand alone gems
Fixes #6377
2016-12-08 15:00:54 -05:00
Guy Boertje
eba128c968 Allow for exception instances to get serialized in JSON logging 2016-11-30 15:43:57 -08:00
Joao Duarte
bc3bcfde24 rename queueMaxSizeInBytes to queueMaxBytes and currentSize to currentByteSize
Fixes #6297
2016-11-29 05:10:42 -05:00
Joao Duarte
4a5aa90466 add ruby wiring to the queue.max_size setting
Fixes #6297
2016-11-29 05:10:42 -05:00
Colin Surprenant
f636a751f8 add support for queue.checkpoint.{acks|writes} settings
add queue.max_acked_checkpoint and queue.checkpoint_rate settings

now using checkpoint.max_acks, checkpoint.max_writes and checkpoint.max_interval

rename options

wip rework checkpointing

refactored full acked pages handling on acking and recovery

correctly close queue

proper queue open/recovery

checkpoint dump utility

checkpoint on writes

removed debug code and added missing newline

added better comment on contiguous checkpoints

fix spec for new pipeline setting
2016-11-22 14:48:58 -05:00
Andrew Cholakian
c03f30fedb Fix filter 'id' naming to be consistent with inputs / outputs
The filters were the only category prefixing IDs with their plugin names

Fixes #6259
2016-11-22 13:35:30 -05:00