Commit graph

1649 commits

Suyog Rao
7a302d1064 Better asciidoc formatting and text for ID option
Fixes #6679
2017-02-09 14:39:20 -05:00
TAC
0f2ec00076 Fix document format about plugin id
Fixes #6636
2017-02-08 14:39:19 -05:00
Jordan Sissel
2d3da38743 Do not show the configuration content anymore. Plugin validation errors and other config problems are logged more specifically before we get to this point, and showing the full config can too easily obscure a more actionable 'you used an invalid setting' kind of error.
Fixes #6654
2017-02-07 19:27:10 -05:00
Joao Duarte
6a5290e139 add setting class that coerces value to array
Fixes #6630
2017-02-07 06:42:56 -05:00
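
A minimal sketch of what a value-to-array coercing setting could look like; the class name and API below are illustrative assumptions, not the actual LogStash::Setting code.

```ruby
# Hypothetical sketch: a setting that always exposes its value as an array.
class CoercibleArraySetting
  attr_reader :name, :value

  def initialize(name, default = [])
    @name = name
    @value = coerce(default)
  end

  def set(value)
    @value = coerce(value)
  end

  private

  # Wrap scalars so callers can always iterate; leave arrays untouched.
  def coerce(value)
    Array(value)
  end
end

# CoercibleArraySetting.new("path.plugins", "/opt/plugins").value
# # => ["/opt/plugins"]
```
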
Joao Duarte
c10f55054b make SafeURI class clone deeper
Fixes #6645
2017-02-06 17:29:26 -05:00
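
A hedged illustration of the "deeper clone" idea from the commit above, using a hypothetical wrapper class rather than the real SafeURI implementation.

```ruby
require "uri"

# Hypothetical wrapper whose clone also duplicates the wrapped URI,
# so mutating a clone cannot affect the original object.
class UriWrapper
  attr_reader :uri

  def initialize(uri)
    @uri = uri.is_a?(URI::Generic) ? uri : URI.parse(uri.to_s)
  end

  # Called by both #clone and #dup; duplicate the inner URI as well.
  def initialize_copy(original)
    super
    @uri = original.uri.dup
  end
end

# original = UriWrapper.new("http://localhost:9200")
# copy = original.clone
# copy.uri.host = "example.org"
# original.uri.host # => "localhost"
```
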
Colin Surprenant
e503fcf60b refactor agent pipeline reloading to avoid double live pipelines with same settings
extracted BasePipeline class to support complete config validation

minor review changes

added comment
2017-02-03 18:15:34 -05:00
Colin Surprenant
7fd2f28c42 support releaseLock
added usage comment
2017-02-03 13:21:56 -05:00
Joao Duarte
0e81924fe9 fix shutdown watcher reports
since 5.x introduced log4j2 as the main logging mechanism, it's
necessary to be more explicit when logging complex objects.

In this case we tell the logger to use the .to_s version of the Snapshot
report generated by the Watcher.
The Snapshot#to_s calls .to_simple_hash.to_s

Fixes #6628
2017-02-02 10:28:47 -05:00
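
A small self-contained illustration of the pattern described above: convert the report to a string explicitly instead of handing the logger a complex object. The SnapshotReport class below is invented for the example.

```ruby
require "logger"

# Stand-in for the Watcher's Snapshot report, invented for illustration.
class SnapshotReport
  def to_simple_hash
    { "inflight_count" => 0, "stalling_threads" => {} }
  end

  # As described above, #to_s delegates to the simple-hash representation.
  def to_s
    to_simple_hash.to_s
  end
end

logger = Logger.new($stdout)
report = SnapshotReport.new

# Logging the string form keeps the output readable under log4j2-style logging.
logger.warn("shutdown stalled: #{report}") # interpolation calls #to_s
```
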
Colin Surprenant
d79e3730fe exclusive file locking class and tests
use new bin/ruby and bundler without :development

refactor to DRY and use expected exception

added original Apache 2.0 license and some cosmetics

exclude bin/lock from packaging

rename variables
2017-01-31 13:47:07 -05:00
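
The actual locking class lives in the Java queue code; below is a hedged Ruby sketch of the same idea (an exclusive lock with an explicit release), using File#flock purely for illustration.

```ruby
# Hypothetical sketch of an exclusive lock file with an explicit release.
class LockFile
  def initialize(path)
    @path = path
    @file = nil
  end

  # Raise if another process already holds the lock.
  def lock
    @file = File.open(@path, File::RDWR | File::CREAT)
    unless @file.flock(File::LOCK_EX | File::LOCK_NB)
      raise IOError, "#{@path} is locked by another process"
    end
    self
  end

  # Release the lock and close the underlying file.
  def release
    return unless @file
    @file.flock(File::LOCK_UN)
    @file.close
    @file = nil
  end
end

# lock = LockFile.new("/tmp/example.lock").lock
# ... do exclusive work ...
# lock.release
```
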
Colin Surprenant
356dd71e1e optimistic recovery of queue head page on open
recover method wip

recover method wip

recovered length 0 is invalid

fix firstUnackedSeqNum

DRYied open and recover

DRYed and refactored to extract AbstractByteBufferPageIO from ByteBufferPageIO and MmapPageIO

better exception messages

cleanup and remove subject idiom

rename _mapFile to mapFile

added invalid state recovery tests

use log4j

renamed TestSettings methods to improve readability

duplicate code

add version check

typo

use test exceptions annotation

use parametrized tests

added uncheck() method to clean test stream

add better message todo

proper javadoc comment

typo
2017-01-31 11:28:47 -05:00
Joao Duarte
23584a0799 propagate pipeline.id to api resources
this removes explicit references to the "main" pipeline,
using instead the value of `pipeline.id` from LogStash::SETTINGS

Fixes #6606
2017-01-30 12:46:19 -05:00
Joao Duarte
a2158e5608 ensure pipeline.id is correctly propagated
Fixes #6530
2017-01-26 11:15:36 -05:00
Joao Duarte
903cdd6331 include pipeline id in queue path
create queue sub directory based on pipeline id

Fixes #6540
2017-01-26 10:27:04 -05:00
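
An illustrative sketch of deriving a per-pipeline queue directory; the function and path names are assumptions, not the actual settings code.

```ruby
require "fileutils"

# Build a queue directory per pipeline so two pipelines never share
# the same on-disk queue.
def queue_path_for(base_queue_path, pipeline_id)
  path = File.join(base_queue_path, pipeline_id)
  FileUtils.mkdir_p(path)
  path
end

# queue_path_for("/var/lib/logstash/queue", "main")
# # => "/var/lib/logstash/queue/main"
```
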
Pier-Hugues Pellerin
db13b08d26 missing specs for the refactoring
Fixes #6499
2017-01-25 08:55:46 -05:00
Pier-Hugues Pellerin
2957ca3d46 Refactor non_reloadable_plugin
Instead of using a list of non-reloadable plugins we add a new class
method on the base plugin class that plugins can override.

By default we assume that all plugins are reloadable; only the stdin
input shouldn't be reloadable.

Fixes #6499
2017-01-25 08:55:46 -05:00
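
A sketch of the approach described in the commit above; the class bodies are simplified stand-ins, but the shape is a class-level predicate that individual plugins override.

```ruby
# Base plugin answers "reloadable" by default.
class BasePlugin
  def self.reloadable?
    true
  end
end

# stdin cannot be safely restarted on reload, so it opts out.
class StdinInput < BasePlugin
  def self.reloadable?
    false
  end
end

# BasePlugin.reloadable?  # => true
# StdinInput.reloadable?  # => false
```
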
Joao Duarte
c3a8fb09d4 use running instead of read in some agent specs
the pipeline class has two state predicates: ready? and running?

ready? becomes true after `start_workers` terminates (successfully or not)
running? becomes true before calling `start_flusher`, which means that
`start_workers` is guaranteed to have terminated successfully

Whenever possible, we should use `running?` instead of `ready?` in the
spec setup blocks. The only place where this may be bad is when the
pipeline execution is short lived (e.g. generator w/small count) and the
spec may never observe pipeline.running? == true

Fixes #6574
2017-01-24 05:29:35 -05:00
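
An illustrative fragment of that spec-setup pattern; `pipeline` is assumed to be in scope, and this is a sketch rather than the actual spec code.

```ruby
# Start the pipeline in a background thread, then block until the workers are
# actually up. running? is only true once start_workers finished successfully,
# whereas ready? is true even if start_workers failed.
pipeline_thread = Thread.new { pipeline.run }
sleep 0.01 until pipeline.running? || !pipeline_thread.alive?
```
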
Joao Duarte
c603d9e90b fix return value of start_pipeline when start_workers fails
during Agent#start_pipeline a new thread is launched that executes
a pipeline.run and a rescue block which increments the failed reload counter

After launching the thread, the parent thread will wait for the pipeline
to start, or detect that the pipeline aborted, or sleep and check again.

There is a bug where, if the pipeline.run aborts during start_workers,
the pipeline is still marked as `ready`, and the thread will continue
running for a very short period of time, incrementing the failed reload
metric.

During this period of `pipeline.ready? == true` and `thread.alive? == true`,
the parent check code will observe all the necessary conditions to
consider the pipeline.run to be successful and thus increment the success
counter too. This failed reload can then result in both the success and
failure reload counts being incremented.

This commit changes the parent thread check to use `pipeline.running?`
instead of `pipeline.ready?`, which is the next logical state transition,
and ensures it is only true if `start_workers` runs successfully.

Fixes #6566
2017-01-20 06:47:13 -05:00
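
A hedged sketch of the parent-thread check described above (not the actual Agent code): the reload is only counted as successful once the pipeline reports running?, which cannot be true unless start_workers completed successfully.

```ruby
# Poll the pipeline until it is either fully running or its thread has died.
def wait_for_pipeline_start(pipeline, pipeline_thread, poll_interval = 0.01)
  loop do
    return :success if pipeline.running?          # workers started successfully
    return :failure unless pipeline_thread.alive? # pipeline.run aborted
    sleep(poll_interval)                          # undecided, check again
  end
end
```
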
Joao Duarte
c2a74b3d4f improve user facing error when yaml is incorrect
Fixes #6546
2017-01-19 12:14:20 -05:00
Pier-Hugues Pellerin
c32da068be review comments
Fixes #6498
2017-01-16 12:28:37 -05:00
Pier-Hugues Pellerin
84063ed74d Extract the creation of the queue into a factory
I've moved the initialization code into a factory for future work on the
pipeline.

Fixes #6498
2017-01-16 12:28:36 -05:00
Suyog Rao
df2ff69a0e Renamed cgroups metric to number_of_elapsed_periods
Renamed number_of_periods to number_of_elapsed_periods to be consistent with ES.

Fixes #6536
2017-01-16 12:20:16 -05:00
Tal Levy
641b855127 remove current_size_in_bytes and acked info from node stats
re #6508.

- removed `acked_count`, `unacked_count`, and migrated `unread_count` to
top-level `events` field.
- removed `current_size_in_bytes` info from queue node stats

Fixes #6510
2017-01-09 19:19:28 -05:00
Mykola Shestopal
36328f5dac Fixed calculation of took_in_millis for #6476
Fixes #6481
2017-01-04 11:43:55 -05:00
Colin Surprenant
eb00b0da4c avoid resetting a nonexistent tags field back to an empty array, plus specs
Fixes #6477
2017-01-03 22:31:31 -05:00
Pier-Hugues Pellerin
9fe6c0cf43 Record the execution time for each output in the pipeline
Record the wall clock time for each output; a new `duration_in_millis`
key will now be available for each output in the API located at http://localhost:9600/_node/stats

This commit also changes some expectations in the output_delegator_spec
that were not working as intended with the `have_received` matcher.

Fixes #6458
2016-12-29 12:51:41 -05:00
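
An illustrative sketch of recording per-output wall clock time; the metric and method names are assumptions, not the actual OutputDelegator code.

```ruby
# Measure wall-clock time around a batch and add it to a
# duration_in_millis counter, even if the output raises.
def receive_with_timing(output, metric, events)
  start = Time.now
  output.multi_receive(events)
ensure
  metric.increment(:duration_in_millis, ((Time.now - start) * 1000).to_i)
end
```
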
Pier-Hugues Pellerin
55731eb936 Initialize the metric values in the batch to the correct type
When we were initializing the `duration_in_millis` in the batch we were
using a `Gauge` instead of a `Counter`; since all these objects have the
same signature, the value was replaced instead of incremented when the
time was actually recorded.

Fixes #6465
2016-12-29 11:12:32 -05:00
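
A minimal, self-contained illustration of the gauge-versus-counter distinction described above (simplified classes, not the real metric types).

```ruby
# A gauge replaces its value; a counter accumulates it.
class Gauge
  def initialize; @value = 0; end
  def set(v); @value = v; end            # each call overwrites the previous value
  attr_reader :value
end

class Counter
  def initialize; @value = 0; end
  def increment(v = 1); @value += v; end # each call adds to the running total
  attr_reader :value
end

g = Gauge.new
c = Counter.new
[5, 7].each { |ms| g.set(ms); c.increment(ms) }
g.value # => 7  (last recorded duration only)
c.value # => 12 (total duration, which is what duration_in_millis needs)
```
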
Pier-Hugues Pellerin
7b4373789f Add a test to make sure the Collector#snapshot_metric returns a cloned metric store.
Fixes #6456
2016-12-29 08:59:52 -05:00
Pier-Hugues Pellerin
52b6f963e1 Code cleanup for the collector observer
We no longer have the responsibility of watching the collector inside the
input itself; this feature might come back when we have a new execution
model that is better suited to watching metrics, but this would require
more granular watchers.

No tests were affected by this change since the code that required that
feature was already removed.

Fixes: #6447

Fixes #6456
2016-12-29 08:59:52 -05:00
Colin Surprenant
e433abdbc1 move options as constant
Fixes #6430
2016-12-23 14:34:44 -05:00
Colin Surprenant
a98983aa10 add explicit fsync on checkpoint write
Fixes #6430
2016-12-23 14:34:43 -05:00
Pier-Hugues Pellerin
0fe94e779d Do not log a warning if a plugin is not from a gem in #print_notice_version
When a plugin is loaded using the `plugins.path` option or comes from a
universal plugin, no gemspec can be found for the specific plugin.

We should not print any warning in that case.

Fixes: #6444

Fixes #6448
2016-12-21 09:44:36 -05:00
Pier-Hugues Pellerin
13599ca64a Initialize global events and pipeline events related metrics with default values
The metric store has no concept of whether a metric needs to exist, so as a
rule of thumb we need to define metrics with 0 values and send them to the
store when we initialize something.

This PR makes sure the batch object records the right default values

Fixes: #6449

Fixes #6450
2016-12-21 09:43:29 -05:00
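
An illustrative sketch of that "define metrics with 0 values up front" rule; the metric API shown is an assumption for the example.

```ruby
# Report zeroed counters at startup so the store always exposes the keys,
# even before any event flows through the pipeline.
def initialize_event_metrics(metric)
  [:in, :filtered, :out, :duration_in_millis].each do |key|
    metric.increment(key, 0) # creates the counter with a 0 value
  end
end
```
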
Joao Duarte
08cebd5c16 ensure metric collection is disabled when metric.collect is false
Fixes #6445
2016-12-21 07:20:23 -05:00
Joao Duarte
128c660e0c add tests for webserver metric
Fixes #6385
2016-12-19 12:43:24 -05:00
Joao Duarte
7051e86dcf add webserver metric
Fixes #6385
2016-12-19 12:43:24 -05:00
Pier-Hugues Pellerin
fa18df3ec6 rename cgroup usage to usage_nanos to align with ES' api
Fixes #6428
2016-12-16 10:19:59 -05:00
Tal Levy
2b45a9b4ae add queue stats to node/stats api (#6331)
* add queue stats to node/stats api

example queue section:

```
"queue" : {
    "type" : "persisted",
    "capacity" : {
      "page_capacity_in_bytes" : 262144000,
      "max_queue_size_in_bytes" : 1073741824,
      "max_unread_events" : 0
    },
    "data" : {
      "free_space_in_bytes" : 33851523072,
      "current_size_in_bytes" : 262144000,
      "storage_type" : "hfs",
      "path" : "/logstash/data/queue"
    },
    "events" : {
      "acked_count" : 0,
      "unread_count" : 0,
      "unacked_count" : 0
    }
}
```

Closes #6182.

* migrate to use period metric pollers for storing queue stats per pipeline
2016-12-15 13:14:47 -08:00
Pier-Hugues Pellerin
9c8bf203e2 Add cgroup information to the api
When Logstash is run under a Linux container we will gather statistics about the cgroup and the
CPU usage. This information will show up in the /_node/stats API and the result will look like this:

```
  "os" : {
    "cgroup" : {
      "cpuacct" : {
        "usage" : 789470280230,
        "control_group" : "/user.slice/user-1000.slice"
      },
      "cpu" : {
        "cfs_quota_micros" : -1,
        "control_group" : "/user.slice/user-1000.slice",
        "stat" : {
          "number_of_times_throttled" : 0,
          "time_throttled_nanos" : 0,
          "number_of_periods" : 0
        },
        "cfs_period_micros" : 100000
      }
    }
  }
```

Fixes: #6252

Fixes #6357
2016-12-15 15:46:38 -05:00
Jordan Sissel
8dfefad58a Remove unnecessary log4j-1.2-api dependency.
This library provides a "log4j 1.2"-like API from the log4j2 library.

We don't seem to use this, and including it seems to be the cause of the
Logstash log4j input rejecting log4j 1.x's SocketAppender with this
message:

    org.apache.log4j.spi.LoggingEvent; class invalid for deserialization

The origin of this error is that log4j2's log4j-1.2-api defines
LoggingEvent without `implements Serializable`.

This commit also includes regenerated gemspec_jars.rb and
logstash-core_jars.rb.

Reference: https://github.com/logstash-plugins/logstash-input-log4j/issues/36

Fixes #6309
2016-12-14 02:19:57 -05:00
Suyog Rao
c8826b81ab Add reload stats at the instance level
Reload stats are currently reported at the pipeline level. The instance
level aggregates these stats across the pipelines

Fixes #6350

Fixes #6367
2016-12-13 00:59:24 -05:00
Suyog Rao
134795360c Expose stats the right way
Fixes #6367
2016-12-13 00:59:24 -05:00
Pier-Hugues Pellerin
19d5dc1de5 Only include logstash-core.jar in the published gem
Fixes #6377
2016-12-08 15:00:54 -05:00
Pier-Hugues Pellerin
48f3624302 Do not include Utils; this could cause some bad references to LogStash::Environment in the context of standalone gems
Fixes #6377
2016-12-08 15:00:54 -05:00
Pier-Hugues Pellerin
ba83f100ac Impose strict version to logstash-core-event-java and logstash-core-queue-jruby
Fixes #6380
2016-12-08 14:29:28 -05:00
Guy Boertje
42ec61987d Update logstash-core.gemspec 2016-12-08 16:15:15 +00:00
Pier-Hugues Pellerin
e3209a3033 Change the assertions in the config reloading spec
The assertions were using dummy outputs that kept received events in an
in-memory array, but the test only needed to match the number of events
it received; this PR adds a DroppingDummyOutput that won't retain the
events in memory.

The previous implementation was causing an OOM issue when running the
test on a very fast machine.

Fixes: #6335

Fixes #6346
2016-12-02 13:24:08 -05:00
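
A hedged sketch of what a dropping dummy output can look like: count the events instead of storing them (simplified, not the actual spec helper).

```ruby
# Records only the event count; the events themselves are dropped immediately,
# so long-running reload tests cannot accumulate memory.
class DroppingDummyOutput
  attr_reader :events_received

  def initialize
    @mutex = Mutex.new
    @events_received = 0
  end

  def multi_receive(events)
    @mutex.synchronize { @events_received += events.size }
  end
end
```
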
Guy Boertje
eba128c968 Allow for exception instances to get serialized in JSON logging 2016-11-30 15:43:57 -08:00
Joao Duarte
9a1a8f0f12 minor style change in getQueueMaxBytes implementation
Fixes #6297
2016-11-29 05:10:42 -05:00
Joao Duarte
209a3aaa21 make some Queue tests less aggressive on Thread.sleep usage
Fixes #6297
2016-11-29 05:10:42 -05:00
Joao Duarte
bc3bcfde24 rename queueMaxSizeInBytes to queueMaxBytes and currentSize to currentByteSize
Fixes #6297
2016-11-29 05:10:42 -05:00