Adds new section on managing multiline events.

Paul Echeverri 2015-09-01 14:22:49 -07:00
parent 173ba00da8
commit f25b23498c


@@ -0,0 +1,116 @@
[[multiline]]
=== Managing Multiline Events
Several use cases generate events that span multiple lines of text. To handle these multiline events correctly,
Logstash needs to know which lines are part of a single event.
Whenever possible, avoid using multiline processing at all. When multiline processing is required, implement the
processing as early in the pipeline as possible. The preferred tool in the Logstash pipeline is the
{logstash}plugins-codecs-multiline.html[multiline codec], which merges lines from a single input using a simple set of
rules.
For more complex needs, the {logstash}plugins-filters-multiline.html[multiline filter] performs a similar task at the
filter stage of processing, where the Logstash instance aggregates multiple inputs.
The most important aspects of configuring either multiline plugin are the following:
* The `pattern` option specifies a regular expression. Lines that match the specified regular expression are considered
either continuations of a previous line or the start of a new multiline event. You can use
{logstash}plugins-filters-grok.html[grok] regular expression templates with this configuration option.
* The `what` option takes one of two values: `previous` or `next`. The `previous` value specifies that lines that match
the value in the `pattern` option are part of the previous line. The `next` value specifies that lines that match the
value in the `pattern` option are part of the following line.
* The `negate` option applies multiline handling to lines that _do not_ match the regular expression specified in the
`pattern` option.
See the full documentation for the {logstash}plugins-codecs-multiline.html[multiline codec] or the
{logstash}plugins-filters-multiline.html[multiline filter] plugin for more information on configuration options.
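The way `pattern`, `what`, and `negate` interact can be easier to follow in executable form. The following Python
sketch is only an illustrative model of the behavior described above, not the plugins' implementation. The function
name `merge_multiline` and the sample lines are hypothetical, and Python's `re` module stands in for the grok-capable
regular expressions that Logstash uses.

```python
import re


def merge_multiline(lines, pattern, what="previous", negate=False):
    """Group lines into events, loosely modeling pattern/what/negate.

    Illustrative model only; not the Logstash plugin code.
    """
    regex = re.compile(pattern)
    events = []   # completed events, each a list of lines
    pending = []  # lines collected for the event being built
    for line in lines:
        matched = bool(regex.search(line))
        if negate:
            # negate flips the test: non-matching lines are treated as matches.
            matched = not matched
        if what == "previous":
            # A matching line continues the event already in progress.
            if matched and pending:
                pending.append(line)
            else:
                if pending:
                    events.append(pending)
                pending = [line]
        elif what == "next":
            # A matching line is held and joined with the line that follows it.
            pending.append(line)
            if not matched:
                events.append(pending)
                pending = []
        else:
            raise ValueError("what must be 'previous' or 'next'")
    if pending:
        events.append(pending)
    return ["\n".join(event) for event in events]


# Hypothetical input: indented lines belong to the line before them.
sample = ["first event", "  continued", "  continued again", "second event"]
events = merge_multiline(sample, r"^\s", what="previous")
print(len(events))
```

In this model the four sample lines collapse into two events, because the two indented lines match `^\s` and are
attached to the line before them. Passing `negate=True` inverts the test, so lines that do _not_ match the pattern are
the ones that get attached.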
==== Multiline Special Cases
* The multiline codec plugin does not support the {logstash}plugins-inputs-lumberjack.html[lumberjack] input plugin.
* The multiline codec plugin does not support file input from files that contain events from multiple sources.
* The multiline filter plugin is not thread-safe. Avoid using multiple filter workers with the multiline filter.
==== Examples of Multiline Plugin Configuration
The examples in this section cover the following use cases:
* Combining a Java stack trace into a single event
* Combining C-style line continuations into a single event
* Combining multiple lines from time-stamped events
===== Java Stack Traces
Java stack traces consist of multiple lines, with each line after the initial line beginning with whitespace, as in
this example:
[source,java]
Exception in thread "main" java.lang.NullPointerException
        at com.example.myproject.Book.getTitle(Book.java:16)
        at com.example.myproject.Author.getBookTitles(Author.java:25)
        at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
To consolidate these lines into a single event in Logstash, use the following configuration for the multiline codec:
[source,json]
input {
  stdin {
    codec => multiline {
      pattern => "^\s"
      what => "previous"
    }
  }
}
This configuration merges any line that begins with whitespace into the previous line.
===== Line Continuations
Several programming languages use the `\` character at the end of a line to denote that the line continues, as in this
example:
[source,c]
printf ("%10.10ld \t %10.10ld \t %s\
%f", w, x, y, z );
To consolidate these lines into a single event in Logstash, use the following configuration for the multiline codec:
[source,json]
input {
  stdin {
    codec => multiline {
      pattern => "\\$"
      what => "next"
    }
  }
}
This configuration merges any line that ends with the `\` character with the following line.
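As a rough illustration of what `what => "next"` means for continuation lines, the Python sketch below joins each line
that ends in a backslash with the line after it. The helper name `join_continuations` and the sample lines are
hypothetical, and the code only models the merging rule described above. It does not reproduce the codec itself.

```python
def join_continuations(lines):
    """Join each line ending in a backslash with the line that follows it."""
    events = []
    pending = []  # lines held back because they end with a backslash
    for line in lines:
        pending.append(line)
        if not line.endswith("\\"):
            # This line completes the event built from any held-back lines.
            events.append("\n".join(pending))
            pending = []
    if pending:
        # The input ended on a continuation line; emit what was collected.
        events.append("\n".join(pending))
    return events


# Hypothetical input: the first line continues onto the second.
sample = ["total = first_value + \\", "        second_value", "print(total)"]
for event in join_continuations(sample):
    print(repr(event))
```

Here the first two sample lines form one event and the third line forms another. A chain of several consecutive
continuation lines is collected into a single event in the same way.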
===== Timestamps
Activity logs from services such as Elasticsearch typically begin with a timestamp, followed by information on the
specific activity, as in this example:
[source,shell]
[2015-08-24 11:49:14,389][INFO ][env ] [Letha] using [1] data paths, mounts [[/
(/dev/disk1)]], net usable_space [34.5gb], net total_space [118.9gb], types [hfs]
To consolidate these lines into a single event in Logstash, use the following configuration for the multiline codec:
[source,json]
input {
  file {
    path => "/var/log/someapp.log"
    codec => multiline {
      # The sample lines open with a bracketed timestamp, such as [2015-08-24 11:49:14,389]
      pattern => "^\[%{TIMESTAMP_ISO8601}\]"
      negate => true
      what => "previous"
    }
  }
}
This configuration uses the `negate` option to specify that any line that does not begin with a timestamp belongs to
the previous line.
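Combining `negate => true` with `what => "previous"` amounts to starting a new event at every line that begins with a
timestamp and attaching every other line to the event before it. The Python sketch below models that rule. The regular
expression is a simplified stand-in written for this example, not the definition of the `TIMESTAMP_ISO8601` grok
pattern, and the function name and the second log event are hypothetical.

```python
import re

# Simplified stand-in for a leading timestamp (an assumption for this sketch,
# NOT the grok TIMESTAMP_ISO8601 definition). The optional "[" fits log lines
# that wrap the timestamp in brackets.
STARTS_WITH_TIMESTAMP = re.compile(r"^\[?\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}")


def group_by_timestamp(lines):
    """Start a new event at each timestamped line; attach other lines to it."""
    events = []
    for line in lines:
        if STARTS_WITH_TIMESTAMP.search(line) or not events:
            # A timestamped line (or the very first line) opens a new event.
            events.append(line)
        else:
            # Any other line belongs to the previous event.
            events[-1] += "\n" + line
    return events


sample = [
    "[2015-08-24 11:49:14,389][INFO ][env] [Letha] using [1] data paths, mounts [[/",
    "(/dev/disk1)]], net usable_space [34.5gb], net total_space [118.9gb], types [hfs]",
    "[2015-08-24 11:49:15,001][INFO ][node] [Letha] started",  # hypothetical second event
]
print(len(group_by_timestamp(sample)))
```

The wrapped second line does not begin with a timestamp, so it is folded into the first event, and the three sample
lines produce two events.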