Cleanup docs directory

Remove old, unused markdown docs. Bring the directory structure in line with the logstash-docs repo.

.gitignore

@@ -26,3 +26,6 @@ spec/reports
 rspec.xml
 .install-done
+.vendor
+integration_run
+.mvn/

@@ -1,322 +0,0 @@
---
title: Configuration Language - Logstash
layout: content_right
---
# Logstash Config Language

The Logstash config language aims to be simple.

There are 3 main sections: inputs, filters, outputs. Each section has
configurations for each plugin available in that section.

Example:

    # This is a comment. You should use comments to describe
    # parts of your configuration.
    input {
      ...
    }

    filter {
      ...
    }

    output {
      ...
    }

## Filters and Ordering

For a given event, filters are applied in the order of their appearance in the
configuration file, as in the sketch below.
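
For example, in a hypothetical pipeline where a grok filter extracts a
`logtime` field that a date filter then parses, the grok block must appear
first (the pattern and field names here are illustrative, not from this doc):

    filter {
      # Runs first: extracts the 'logtime' field from the raw message.
      grok {
        match => [ "message", "%{SYSLOGTIMESTAMP:logtime} %{GREEDYDATA:msg}" ]
      }
      # Runs second: parses the 'logtime' field created by grok above.
      date {
        match => [ "logtime", "MMM dd HH:mm:ss" ]
      }
    }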

## Comments

Comments are the same as in ruby, perl, and python: they start with a '#'
character. Example:

    # this is a comment

    input { # comments can appear at the end of a line, too
      # ...
    }

## Plugins

The input, filter and output sections all let you configure plugins. Plugin
configuration consists of the plugin name followed by a block of settings for
that plugin. For example, how about two file inputs:

    input {
      file {
        path => "/var/log/messages"
        type => "syslog"
      }

      file {
        path => "/var/log/apache/access.log"
        type => "apache"
      }
    }

The above configures two separate file inputs. Both set two
configuration settings each: 'path' and 'type'. Each plugin has different
settings for configuring it; see the documentation for your plugin to
learn what settings are available and what they mean. For example, the
[file input][fileinput] documentation will explain the meanings of the
path and type settings.

[fileinput]: inputs/file

## Value Types

The documentation for a plugin may enforce a configuration field having a
certain type. Examples include boolean, string, array, number, hash,
etc.

### <a name="boolean"></a>Boolean

A boolean must be either `true` or `false`. Note the lack of quotes around
`true` and `false`.

Examples:

    debug => true

### <a name="string"></a>String

A string must be a single value.

Example:

    name => "Hello world"

Single, unquoted words are valid as strings, too, but you should use quotes.

### <a name="number"></a>Number

Numbers must be valid numerics (floating point or integer are OK).

Example:

    port => 33

### <a name="array"></a>Array

An array can be a single string value or multiple values. If you specify the
same field multiple times, it appends to the array.

Examples:

    path => [ "/var/log/messages", "/var/log/*.log" ]
    path => "/data/mysql/mysql.log"

The above makes 'path' a 3-element array including all 3 strings.

### <a name="hash"></a>Hash

A hash uses basically the same syntax as Ruby hashes.
The keys and values are simply pairs, such as:

    match => {
      "field1" => "value1"
      "field2" => "value2"
      ...
    }

## <a name="eventdependent"></a>Event Dependent Configuration

The logstash agent is a processing pipeline with 3 stages: inputs -> filters ->
outputs. Inputs generate events, filters modify them, outputs ship them
elsewhere.

All events have properties. For example, an apache access log would have things
like status code (200, 404), request path ("/", "index.html"), HTTP verb
(GET, POST), client IP address, etc. Logstash calls these properties "fields."

Some of the configuration options in Logstash require the existence of fields in
order to function. Because inputs generate events, there are no fields to
evaluate within the input block--they do not exist yet!

Because of their dependency on events and fields, the following configuration
options will only work within filter and output blocks.

**IMPORTANT: Field references, sprintf format and conditionals, described below,
will not work in an input block.**

### <a name="fieldreferences"></a>Field References

In many cases, it is useful to be able to refer to a field by name. To do this,
you can use the Logstash field reference syntax.

By way of example, let us suppose we have this event:

    {
      "agent": "Mozilla/5.0 (compatible; MSIE 9.0)",
      "ip": "192.168.24.44",
      "request": "/index.html",
      "response": {
        "status": 200,
        "bytes": 52353
      },
      "ua": {
        "os": "Windows 7"
      }
    }

- the syntax to access fields is `[fieldname]`.
- if you are only referring to a **top-level field**, you can omit the `[]` and
  simply say `fieldname`.
- in the case of **nested fields**, like the "os" field above, you need
  the full path to that field: `[ua][os]`, as in the sketch after this list.
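
For example, a minimal filter sketch that tags events from Windows clients by
testing the nested `[ua][os]` field from the event above (the tag name is
illustrative):

    filter {
      if [ua][os] == "Windows 7" {
        mutate { add_tag => "windows-client" }
      }
    }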

### <a name="sprintf"></a>sprintf format

The field reference syntax is also used in what Logstash calls 'sprintf format'.
This format allows you to refer to field values from within other strings. For
example, the statsd output has an 'increment' setting, to allow you to keep a
count of apache logs by status code:

    output {
      statsd {
        increment => "apache.%{[response][status]}"
      }
    }

You can also do time formatting in this sprintf format. Instead of specifying a
field name, use the `+FORMAT` syntax where `FORMAT` is a
[time format](http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html).

For example, if you want to use the file output to write to logs based on the
hour and the 'type' field:

    output {
      file {
        path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"
      }
    }

### <a name="conditionals"></a>Conditionals

Sometimes you only want a filter or output to process an event under
certain conditions. For that, you'll want to use a conditional!

Conditionals in Logstash look and act the same way they do in programming
languages. You have `if`, `else if` and `else` statements. Conditionals may be
nested if you need that.

The syntax is as follows:

    if EXPRESSION {
      ...
    } else if EXPRESSION {
      ...
    } else {
      ...
    }

What's an expression? Comparison tests, boolean logic, etc!

The following comparison operators are supported:

* equality, etc: ==, !=, <, >, <=, >=
* regexp: =~, !~
* inclusion: in, not in

The following boolean operators are supported:

* and, or, nand, xor

The following unary operators are supported:

* !

Expressions may contain expressions. Expressions may be negated with `!`.
Expressions may be grouped with parentheses `(...)`. Expressions can be long
and complex.

For example, if we want to remove the field `secret` if the field
`action` has a value of `login`:

    filter {
      if [action] == "login" {
        mutate { remove => "secret" }
      }
    }

The above uses the field reference syntax to get the value of the
`action` field. It is compared against the text `login` and, if equal,
allows the mutate filter to delete the field named `secret`.

How about a more complex example?

* alert nagios of any apache events with status 5xx
* record any 4xx status to elasticsearch
* record all status code hits via statsd

Here is how we can tell nagios about any http event with a 5xx status code
while handling the other cases as well:

    output {
      if [type] == "apache" {
        if [status] =~ /^5\d\d/ {
          nagios { ... }
        } else if [status] =~ /^4\d\d/ {
          elasticsearch { ... }
        }

        statsd { increment => "apache.%{status}" }
      }
    }

You can also do multiple expressions in a single condition:

    output {
      # Send production errors to pagerduty
      if [loglevel] == "ERROR" and [deployment] == "production" {
        pagerduty {
          ...
        }
      }
    }

You can test whether a field is present, regardless of its value:

    if [exception_message] {
      # If the event has an exception_message field, set the level
      mutate { add_field => { "level" => "ERROR" } }
    }

Here are some examples for testing with the in conditional:

    filter {
      if [foo] in [foobar] {
        mutate { add_tag => "field in field" }
      }
      if [foo] in "foo" {
        mutate { add_tag => "field in string" }
      }
      if "hello" in [greeting] {
        mutate { add_tag => "string in field" }
      }
      if [foo] in ["hello", "world", "foo"] {
        mutate { add_tag => "field in list" }
      }
      if [missing] in [alsomissing] {
        mutate { add_tag => "shouldnotexist" }
      }
      if !("foo" in ["hello", "world"]) {
        mutate { add_tag => "shouldexist" }
      }
    }

Or, to test if grok was successful:

    output {
      if "_grokparsefailure" not in [tags] {
        elasticsearch { ... }
      }
    }

## Further Reading

For more information, see [the plugin docs index](index).

@@ -1,59 +0,0 @@
---
title: Logstash Contrib plugins
layout: content_right
---

# contrib plugins

As logstash has grown, we've accumulated a massive repository of plugins. With
well over 100 plugins, it became difficult for the project maintainers to
support everything effectively.

In order to improve the quality of popular plugins, we've moved the
less-commonly-used plugins to a separate repository we're calling "contrib".
Concentrating common plugin usage into core solves a few problems, most notably
user complaints about the size of logstash releases, support/maintenance costs,
etc.

It is our intent that this separation will improve life for users. If it
doesn't, please file a bug so we can work to address it!

If a plugin is available in the 'contrib' package, the documentation for that
plugin will note this boldly at the top of that plugin's documentation.

Contrib plugins reside in a [separate github project](https://github.com/elasticsearch/logstash-contrib).

# Packaging

At present, the contrib modules are available as a tarball.

# Automated Installation

The `bin/plugin` script will handle the installation for you:

    cd /path/to/logstash
    bin/plugin install contrib

# Manual Installation

The contrib plugins can be extracted on top of an existing Logstash installation.

For example, if I've extracted `logstash-%VERSION%.tar.gz` into `/path`, e.g.

    cd /path
    tar zxf ~/logstash-%VERSION%.tar.gz

This will create a `/path/logstash-%VERSION%` directory, e.g.

    $ ls
    logstash-%VERSION%

The method to install the contrib tarball is identical:

    cd /path
    wget http://download.elasticsearch.org/logstash/logstash/logstash-contrib-%VERSION%.tar.gz
    tar zxf logstash-contrib-%VERSION%.tar.gz

This will install the contrib plugins in the same directory as the core
install. These plugins will be available to logstash the next time it starts.

docs/docgen.rb

@@ -1,250 +0,0 @@
require "rubygems"
|
||||
require "erb"
|
||||
require "optparse"
|
||||
require "kramdown" # markdown parser
|
||||
|
||||
$: << Dir.pwd
|
||||
$: << File.join(File.dirname(__FILE__), "..", "lib")
|
||||
|
||||
require "logstash/config/mixin"
|
||||
require "logstash/inputs/base"
|
||||
require "logstash/codecs/base"
|
||||
require "logstash/filters/base"
|
||||
require "logstash/outputs/base"
|
||||
require "logstash/version"
|
||||
|
||||
class LogStashConfigDocGenerator
|
||||
COMMENT_RE = /^ *#(?: (.*)| *$)/
|
||||
|
||||
def initialize
|
||||
@rules = {
|
||||
COMMENT_RE => lambda { |m| add_comment(m[1]) },
|
||||
/^ *class.*< *LogStash::(Outputs|Filters|Inputs|Codecs)::(Base|Threadable)/ => \
|
||||
lambda { |m| set_class_description },
|
||||
/^ *config +[^=].*/ => lambda { |m| add_config(m[0]) },
|
||||
/^ *milestone .*/ => lambda { |m| set_milestone(m[0]) },
|
||||
/^ *config_name .*/ => lambda { |m| set_config_name(m[0]) },
|
||||
/^ *flag[( ].*/ => lambda { |m| add_flag(m[0]) },
|
||||
/^ *(class|def|module) / => lambda { |m| clear_comments },
|
||||
}
|
||||
|
||||
if File.exists?("build/contrib_plugins")
|
||||
@contrib_list = File.read("build/contrib_plugins").split("\n")
|
||||
else
|
||||
@contrib_list = []
|
||||
end
|
||||
end
|
||||
|
||||
def parse(string)
|
||||
clear_comments
|
||||
buffer = ""
|
||||
string.split(/\r\n|\n/).each do |line|
|
||||
# Join long lines
|
||||
if line =~ COMMENT_RE
|
||||
# nothing
|
||||
else
|
||||
# Join extended lines
|
||||
if line =~ /(, *$)|(\\$)|(\[ *$)/
|
||||
buffer += line.gsub(/\\$/, "")
|
||||
next
|
||||
end
|
||||
end
|
||||
|
||||
line = buffer + line
|
||||
buffer = ""
|
||||
|
||||
@rules.each do |re, action|
|
||||
m = re.match(line)
|
||||
if m
|
||||
action.call(m)
|
||||
end
|
||||
end # RULES.each
|
||||
end # string.split("\n").each
|
||||
end # def parse
|
||||
|
||||
def set_class_description
|
||||
@class_description = @comments.join("\n")
|
||||
clear_comments
|
||||
end # def set_class_description
|
||||
|
||||
def add_comment(comment)
|
||||
return if comment == "encoding: utf-8"
|
||||
@comments << comment
|
||||
end # def add_comment
|
||||
|
||||
def add_config(code)
|
||||
# I just care about the 'config :name' part
|
||||
code = code.sub(/,.*/, "")
|
||||
|
||||
# call the code, which calls 'config' in this class.
|
||||
# This will let us align comments with config options.
|
||||
name, opts = eval(code)
|
||||
|
||||
# TODO(sissel): This hack is only required until regexp configs
|
||||
# are gone from logstash.
|
||||
name = name.to_s unless name.is_a?(Regexp)
|
||||
|
||||
description = Kramdown::Document.new(@comments.join("\n")).to_html
|
||||
@attributes[name][:description] = description
|
||||
clear_comments
|
||||
end # def add_config
|
||||
|
||||
def add_flag(code)
|
||||
# call the code, which calls 'config' in this class.
|
||||
# This will let us align comments with config options.
|
||||
#p :code => code
|
||||
fixed_code = code.gsub(/ do .*/, "")
|
||||
#p :fixedcode => fixed_code
|
||||
name, description = eval(fixed_code)
|
||||
@flags[name] = description
|
||||
clear_comments
|
||||
end # def add_flag
|
||||
|
||||
def set_config_name(code)
|
||||
name = eval(code)
|
||||
@name = name
|
||||
end # def set_config_name
|
||||
|
||||
def set_milestone(code)
|
||||
@milestone = eval(code)
|
||||
end
|
||||
|
||||
# pretend to be the config DSL and just get the name
|
||||
def config(name, opts={})
|
||||
return name, opts
|
||||
end # def config
|
||||
|
||||
# Pretend to support the flag DSL
|
||||
def flag(*args, &block)
|
||||
name = args.first
|
||||
description = args.last
|
||||
return name, description
|
||||
end # def config
|
||||
|
||||
# pretend to be the config dsl's 'config_name' method
|
||||
def config_name(name)
|
||||
return name
|
||||
end # def config_name
|
||||
|
||||
# pretend to be the config dsl's 'milestone' method
|
||||
def milestone(m)
|
||||
return m
|
||||
end # def milestone
|
||||
|
||||
def clear_comments
|
||||
@comments.clear
|
||||
end # def clear_comments
|
||||
|
||||
def generate(file, settings)
|
||||
@class_description = ""
|
||||
@milestone = ""
|
||||
@comments = []
|
||||
@attributes = Hash.new { |h,k| h[k] = {} }
|
||||
@flags = {}
|
||||
|
||||
# local scoping for the monkeypatch belowg
|
||||
attributes = @attributes
|
||||
# Monkeypatch the 'config' method to capture
|
||||
# Note, this monkeypatch requires us do the config processing
|
||||
# one at a time.
|
||||
#LogStash::Config::Mixin::DSL.instance_eval do
|
||||
#define_method(:config) do |name, opts={}|
|
||||
#p name => opts
|
||||
#attributes[name].merge!(opts)
|
||||
#end
|
||||
#end
|
||||
|
||||
# Loading the file will trigger the config dsl which should
|
||||
# collect all the config settings.
|
||||
load file
|
||||
|
||||
# parse base first
|
||||
parse(File.new(File.join(File.dirname(file), "base.rb"), "r").read)
|
||||
|
||||
# Now parse the real library
|
||||
code = File.new(file).read
|
||||
|
||||
# inputs either inherit from Base or Threadable.
|
||||
if code =~ /\< LogStash::Inputs::Threadable/
|
||||
parse(File.new(File.join(File.dirname(file), "threadable.rb"), "r").read)
|
||||
end
|
||||
|
||||
if code =~ /include LogStash::PluginMixins/
|
||||
mixin = code.gsub(/.*include LogStash::PluginMixins::(\w+)\s.*/m, '\1')
|
||||
mixin.gsub!(/(.)([A-Z])/, '\1_\2')
|
||||
mixin.downcase!
|
||||
parse(File.new(File.join(File.dirname(file), "..", "plugin_mixins", "#{mixin}.rb")).read)
|
||||
end
|
||||
|
||||
parse(code)
|
||||
|
||||
puts "Generating docs for #{file}"
|
||||
|
||||
if @name.nil?
|
||||
$stderr.puts "Missing 'config_name' setting in #{file}?"
|
||||
return nil
|
||||
end
|
||||
|
||||
klass = LogStash::Config::Registry.registry[@name]
|
||||
if klass.ancestors.include?(LogStash::Inputs::Base)
|
||||
section = "input"
|
||||
elsif klass.ancestors.include?(LogStash::Filters::Base)
|
||||
section = "filter"
|
||||
elsif klass.ancestors.include?(LogStash::Outputs::Base)
|
||||
section = "output"
|
||||
elsif klass.ancestors.include?(LogStash::Codecs::Base)
|
||||
section = "codec"
|
||||
end
|
||||
|
||||
template_file = File.join(File.dirname(__FILE__), "plugin-doc.html.erb")
|
||||
template = ERB.new(File.new(template_file).read, nil, "-")
|
||||
|
||||
is_contrib_plugin = @contrib_list.include?(file)
|
||||
|
||||
# descriptions are assumed to be markdown
|
||||
description = Kramdown::Document.new(@class_description).to_html
|
||||
|
||||
klass.get_config.each do |name, settings|
|
||||
@attributes[name].merge!(settings)
|
||||
end
|
||||
sorted_attributes = @attributes.sort { |a,b| a.first.to_s <=> b.first.to_s }
|
||||
klassname = LogStash::Config::Registry.registry[@name].to_s
|
||||
name = @name
|
||||
|
||||
synopsis_file = File.join(File.dirname(__FILE__), "plugin-synopsis.html.erb")
|
||||
synopsis = ERB.new(File.new(synopsis_file).read, nil, "-").result(binding)
|
||||
|
||||
if settings[:output]
|
||||
dir = File.join(settings[:output], section + "s")
|
||||
path = File.join(dir, "#{name}.html")
|
||||
Dir.mkdir(settings[:output]) if !File.directory?(settings[:output])
|
||||
Dir.mkdir(dir) if !File.directory?(dir)
|
||||
File.open(path, "w") do |out|
|
||||
html = template.result(binding)
|
||||
html.gsub!("%VERSION%", LOGSTASH_VERSION)
|
||||
html.gsub!("%PLUGIN%", @name)
|
||||
out.puts(html)
|
||||
end
|
||||
else
|
||||
puts template.result(binding)
|
||||
end
|
||||
end # def generate
|
||||
|
||||
end # class LogStashConfigDocGenerator
|
||||
|
||||
if __FILE__ == $0
|
||||
opts = OptionParser.new
|
||||
settings = {}
|
||||
opts.on("-o DIR", "--output DIR",
|
||||
"Directory to output to; optional. If not specified,"\
|
||||
"we write to stdout.") do |val|
|
||||
settings[:output] = val
|
||||
end
|
||||
|
||||
args = opts.parse(ARGV)
|
||||
|
||||
args.each do |arg|
|
||||
gen = LogStashConfigDocGenerator.new
|
||||
gen.generate(arg, settings)
|
||||
end
|
||||
end
|
|

@@ -1,108 +0,0 @@
---
title: How to extend - logstash
layout: content_right
---
# Add a new filter

This document shows you how to add a new filter to logstash.

For a general overview of how to add a new plugin, see [the extending
logstash](.) overview.

## Write code

Let's write a 'hello world' filter. This filter will replace the 'message' in
the event with "Hello world!"

First, logstash expects plugins in a certain directory structure: `logstash/TYPE/PLUGIN_NAME.rb`

Since we're creating a filter, let's mkdir this:

    mkdir -p logstash/filters/
    cd logstash/filters

Now add the code:

    # Call this file 'foo.rb' (in logstash/filters, as above)
    require "logstash/filters/base"
    require "logstash/namespace"

    class LogStash::Filters::Foo < LogStash::Filters::Base

      # Setting the config_name here is required. This is how you
      # configure this filter from your logstash config.
      #
      # filter {
      #   foo { ... }
      # }
      config_name "foo"

      # New plugins should start life at milestone 1.
      milestone 1

      # Replace the message with this value.
      config :message, :validate => :string

      public
      def register
        # nothing to do
      end # def register

      public
      def filter(event)
        # return nothing unless there's an actual filter event
        return unless filter?(event)
        if @message
          # Replace the event message with our message as configured in the
          # config file.
          event["message"] = @message
        end
        # filter_matched should go in the last line of our successful code
        filter_matched(event)
      end # def filter
    end # class LogStash::Filters::Foo

## Add it to your configuration

For this simple example, let's just use stdin input and stdout output.
The config file looks like this:

    input {
      stdin { type => "foo" }
    }
    filter {
      if [type] == "foo" {
        foo {
          message => "Hello world!"
        }
      }
    }
    output {
      stdout { }
    }

Call this file 'example.conf'.

## Tell logstash about it

Depending on how you installed logstash, you have a few ways of including this
plugin.

You can use the agent's --pluginpath flag to specify where the root of your
plugin tree is. In our case, it's the current directory.

    % bin/logstash --pluginpath your/plugin/root -f example.conf

## Example running

In the example below, I typed in "the quick brown fox" after running the
command:

    % bin/logstash -f example.conf
    the quick brown fox
    2011-05-12T01:05:09.495000Z stdin://snack.home/: Hello world!

The output is the standard logstash stdout output, but in this case our "the
quick brown fox" message was replaced with "Hello world!"

All done! :)

@@ -1,91 +0,0 @@
---
title: How to extend - logstash
layout: content_right
---
# Extending logstash

You can add your own input, output, or filter plugins to logstash.

If you're looking to extend logstash today, please look at the existing plugins.

## Good examples of plugins

* [inputs/tcp](https://github.com/logstash/logstash/blob/master/lib/logstash/inputs/tcp.rb)
* [filters/multiline](https://github.com/logstash/logstash/blob/master/lib/logstash/filters/multiline.rb)
* [outputs/mongodb](https://github.com/logstash/logstash/blob/master/lib/logstash/outputs/mongodb.rb)

## Common concepts

* The `config_name` sets the name used in the config file.
* The `milestone` sets the milestone number of the plugin. See <../plugin-milestones> for more info.
* The `config` lines define config options.
* The `register` method is called per plugin instantiation. Do any of your initialization here.

### Required modules

All plugins should require the Logstash module.

    require 'logstash/namespace'

### Plugin name

Every plugin must have a name set with the `config_name` method. If this
is not specified, plugins will fail to load with an error.

### Milestones

Every plugin needs a milestone set using `milestone`. See
<../plugin-milestones> for more info.

### Config lines

The `config` lines define configuration options and are constructed like
so:

    config :host, :validate => :string, :default => "0.0.0.0"

The name of the option is specified, here `:host`, and then the
attributes of the option. They can include `:validate`, `:default`,
`:required` (a Boolean `true` or `false`), `:deprecated` (also a
Boolean), and `:obsolete` (a String value).

## Inputs

All inputs require the LogStash::Inputs::Base class:

    require 'logstash/inputs/base'

Inputs have two methods: `register` and `run`.

* Each input runs as its own thread.
* The `run` method is expected to run forever (see the sketch below).
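
As an illustration, here is a minimal sketch of an input plugin that emits a
fixed message once per second. The plugin name `example`, its `message`
setting, and the one-second interval are all hypothetical; the
`queue << event` idiom follows the shipped inputs:

    require 'logstash/inputs/base'
    require 'logstash/namespace'

    class LogStash::Inputs::Example < LogStash::Inputs::Base
      config_name "example"
      milestone 1

      # The message to emit (a made-up setting, for illustration).
      config :message, :validate => :string, :default => "hello"

      public
      def register
        # nothing to initialize
      end # def register

      public
      def run(queue)
        # Loop forever, pushing one event per second onto the pipeline queue.
        loop do
          event = LogStash::Event.new("message" => @message)
          decorate(event)
          queue << event
          sleep 1
        end
      end # def run
    end # class LogStash::Inputs::Example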

## Filters

All filters require the LogStash::Filters::Base class:

    require 'logstash/filters/base'

Filters have two methods: `register` and `filter`.

* The `filter` method gets an event.
* Call `event.cancel` to drop the event.
* To modify an event, simply make changes to the event you are given.
* The return value is ignored.

## Outputs

All outputs require the LogStash::Outputs::Base class:

    require 'logstash/outputs/base'

Outputs have two methods: `register` and `receive`.

* The `register` method is called per plugin instantiation. Do any of your initialization here.
* The `receive` method is called when an event gets pushed to your output (see the sketch below).
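
Here is a minimal sketch of an output plugin; the name `example` is
hypothetical, and the body just prints the event's message:

    require 'logstash/outputs/base'
    require 'logstash/namespace'

    class LogStash::Outputs::Example < LogStash::Outputs::Base
      config_name "example"
      milestone 1

      public
      def register
        # nothing to initialize
      end # def register

      public
      def receive(event)
        # Honor the output's conditional/type filtering, as shipped outputs do.
        return unless output?(event)
        puts event["message"]
      end # def receive
    end # class LogStash::Outputs::Example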

## Example: a new filter

Learn by example how to [add a new filter to logstash](example-add-a-new-filter).

@@ -1,45 +0,0 @@
---
title: Command-line flags - logstash
layout: content_right
---
# Command-line flags

## Agent

The logstash agent has the following flags (also try using the '--help' flag).

<dl>
<dt> -f, --config CONFIGFILE </dt>
<dd> Load the logstash config from a specific file, directory, or a
wildcard. If given a directory or wildcard, config files will be read
from the directory in alphabetical order. </dd>
<dt> -e CONFIGSTRING </dt>
<dd> Use the given string as the configuration data. Same syntax as the
config file. If no input is specified, 'stdin { type => stdin }' is the
default. If no output is specified, 'stdout { debug => true }' is the
default. See the example below this list. </dd>
<dt> -w, --filterworkers COUNT </dt>
<dd> Run COUNT filter workers (default: 1) </dd>
<dt> -l, --log FILE </dt>
<dd> Log to a given path. Default is to log to stdout </dd>
<dt> --verbose </dt>
<dd> Increase verbosity to the first level, less verbose.</dd>
<dt> --debug </dt>
<dd> Increase verbosity to the last level, more verbose.</dd>
<dt> -v </dt>
<dd> *DEPRECATED: see --verbose/--debug* Increase verbosity. There are multiple levels of verbosity available, with
'-vv' currently being the highest. </dd>
<dt> --pluginpath PLUGIN_PATH </dt>
<dd> A colon-delimited path to find other logstash plugins in. </dd>
</dl>
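
For example, a quick way to try out a pipeline without writing a config file
is to pass the whole configuration with `-e` (the pipeline here is just an
illustrative stdin-to-stdout echo):

    % bin/logstash agent -e 'input { stdin { } } output { stdout { debug => true } }'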

## Web

<dl>
<dt> -a, --address ADDRESS </dt>
<dd>Address on which to start the webserver. Default is 0.0.0.0.</dd>
<dt> -p, --port PORT</dt>
<dd>Port on which to start the webserver. Default is 9292.</dd>
</dl>

@@ -1,28 +0,0 @@
#!/usr/bin/env ruby

require "erb"

if ARGV.size != 1
  $stderr.puts "No path given to search for plugin docs"
  $stderr.puts "Usage: #{$0} plugin_doc_dir"
  exit 1
end

def plugins(glob)
  files = Dir.glob(glob)
  names = files.collect { |f| File.basename(f).gsub(".html", "") }
  return names.sort
end # def plugins

basedir = ARGV[0]
docs = {
  "inputs" => plugins(File.join(basedir, "inputs/*.html")),
  "codecs" => plugins(File.join(basedir, "codecs/*.html")),
  "filters" => plugins(File.join(basedir, "filters/*.html")),
  "outputs" => plugins(File.join(basedir, "outputs/*.html")),
}

template_path = File.join(File.dirname(__FILE__), "index.html.erb")
template = File.new(template_path).read
erb = ERB.new(template, nil, "-")
puts erb.result(binding)

@@ -1,46 +0,0 @@
---
title: Learn - logstash
layout: content_right
---
# What is Logstash?

Logstash is a tool for managing your logs.

It helps you take logs and other event data from your systems and move them into
a central place. Logstash is open source and completely free. You can find
support on the discussion forum and on IRC.

For an overview of Logstash and why you would use it, you should watch the
presentation I gave at CarolinaCon 2011:
[video here](http://carolinacon.blip.tv/file/5105901/). This presentation covers
Logstash, how you can use it, some alternatives, logging best practices,
parsing tools, etc. The video is also embedded below:

<!--
<embed src="http://blip.tv/play/gvE9grjcdQI" type="application/x-shockwave-flash" width="480" height="296" allowscriptaccess="always" allowfullscreen="true"></embed>

The slides are available online here: [slides](http://goo.gl/68c62). The slides
include speaker notes (click 'actions' then 'speaker notes').
-->
<iframe width="480" height="296" src="http://www.youtube.com/embed/RuUFnog29M4" frameborder="0" allowfullscreen="allowfullscreen"></iframe>

The slides are available online here: [slides](http://semicomplete.com/presentations/logstash-puppetconf-2012/).

## Getting Help

There's [documentation](.) here on this site. If that isn't sufficient, you can
use the discussion [forum](https://discuss.elastic.co/c/logstash). Further, there is also
an IRC channel - #logstash on irc.freenode.org.

If you find a bug or have a feature request, file it
on [github](https://github.com/elasticsearch/logstash/issues). (Honestly though, if you prefer email or irc
for such things, that works for me, too.)

## Download It

[Download logstash-%VERSION%](https://download.elastic.co/logstash/logstash/logstash-%VERSION%.tar.gz)

## What's next?

Try this [guide](tutorials/getting-started-with-logstash) for a simple
real-world example of getting started using Logstash.

@@ -1,109 +0,0 @@
---
title: the life of an event - logstash
layout: content_right
---
# the life of an event

The logstash agent is an event pipeline.

## The Pipeline

The logstash agent is a processing pipeline with 3 stages: inputs -> filters ->
outputs. Inputs generate events, filters modify them, outputs ship them
elsewhere.

Internal to logstash, events are passed between stages using internal queues,
implemented with a 'SizedQueue' in Ruby. A SizedQueue allows a bounded
maximum number of items in the queue, such that any write to the queue blocks
if the queue is full at maximum capacity.

Logstash sets each queue size to 20. This means only 20 events can be pending
for the next stage - this helps reduce data loss and in general avoids
logstash trying to act as a data storage system. These internal queues are not
for storing messages long-term.
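
A minimal Ruby sketch of the blocking behavior SizedQueue provides (the sizes
and counts here are illustrative, not logstash code):

    require "thread"

    queue = SizedQueue.new(20)  # bounded, like logstash's internal queues

    producer = Thread.new do
      25.times do |i|
        queue.push(i)  # blocks once 20 items are pending
      end
    end

    consumer = Thread.new do
      25.times { queue.pop }  # draining the queue unblocks the producer
    end

    [producer, consumer].each(&:join)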

## Fault Tolerance

Starting at outputs, here's what happens when things break.

An output can fail or have problems because of some downstream cause, such as a
full disk, permissions problems, temporary network failures, or service
outages. Most outputs should keep retrying to ship any events that were
involved in the failure.

If an output is failing, the output thread will wait until this output is
healthy again and able to successfully send the message. Therefore, the output
queue will stop being read from by this output and will eventually fill up with
events and block new events from being written to this queue.

A full output queue means filters will block trying to write to the output
queue. Because filters will be stuck, blocked writing to the output queue, they
will stop reading from the filter queue, which will eventually cause the filter
queue (input -> filter) to fill up.

A full filter queue will cause inputs to block when writing to the filters.
This will cause each input to block, causing each input to stop processing new
data from wherever that input is getting new events.

In ideal circumstances, this behaves similarly to when the tcp window
closes to 0: no new data is sent because the receiver hasn't finished
processing the current queue of data, but as soon as the downstream (output)
problem is resolved, messages will begin flowing again.

## Thread Model

The thread model in logstash is currently:

    input threads | filter worker threads | output worker

Filters are optional, so you will have this model if you have no filters
defined:

    input threads | output worker

Each input runs in a thread by itself. This allows busier inputs to not be
blocked by slower ones, etc. It also allows for easier containment of scope
because each input has a thread.

The filter thread model is a 'worker' model where each worker receives an event
and applies all filters, in order, before emitting that to the output queue.
This allows scalability across CPUs because many filters are CPU intensive
(provided that we have thread safety).

The default number of filter workers is 1, but you can increase this number
with the '-w' flag on the agent, as shown below.
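
For example, to run four filter workers (an illustrative count - tune it to
your CPU count and filter load):

    % bin/logstash agent -f logstash.conf -w 4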

The output worker model is currently a single thread. Outputs will receive
events in the order they are defined in the config file.

Outputs may decide to buffer events temporarily before publishing them,
possibly in a separate thread. One example of this is the elasticsearch output,
which will buffer events and flush them all at once, in a separate thread. This
mechanism (buffering many events + writing in a separate thread) can improve
performance so the logstash pipeline isn't stalled waiting for a response from
elasticsearch.

## Consequences and Expectations

Small queue sizes mean that logstash simply blocks and stalls safely during
times of load or other temporary pipeline problems. There are two alternatives
to this - unlimited queue length and dropping messages. Unlimited queues grow
unbounded and eventually exceed memory, causing a crash which loses all of
those messages. Dropping messages is also an undesirable behavior in most cases.

At a minimum, logstash will have probably 3 threads (2 if you have no filters):
one input, one filter worker, and one output thread each.

If you see logstash using multiple CPUs, this is likely why. If you want to
know more about what each thread is doing, you should read this:
<http://www.semicomplete.com/blog/geekery/debugging-java-performance.html>.

Threads in java have names, and you can use jstack and top to figure out who is
using what resources. The URL above will help you learn how to do this.

On Linux platforms, logstash will label all the threads it can with something
descriptive. Inputs will show up as "<inputname", filter workers as
"|worker", and outputs as ">outputworker" (or something similar). Other threads
may be labeled as well, and are intended to help you identify their purpose
should you wonder why they are consuming resources!

@@ -1,60 +0,0 @@
---
title: Logging tools comparisons - logstash
layout: content_right
---
# Logging tools comparison

The information below is provided as "best effort" and is not strictly intended
as a complete source of truth. If the information below is unclear or incorrect, please
email the logstash-users list (or send a pull request with the fix) :)

Where feasible, this document will also provide information on how you can use
logstash with these other projects.

# logstash

Primary goal: Make log/event data and analytics accessible.

Overview: Where your logs come from, how you store them, or what you do with
them is up to you. Logstash exists to help make such actions easier and faster.

It provides you a simple event pipeline for taking events and logs from any
input, manipulating them with filters, and sending them to any output. Inputs
can be files, network, message brokers, etc. Filters are date and string
parsers, grep-like, etc. Outputs are data stores (elasticsearch, mongodb, etc),
message systems (rabbitmq, stomp, etc), network (tcp, syslog), etc.

It also provides a web interface for doing search and analytics on your
logs.

# graylog2

[http://graylog2.org/](http://graylog2.org)

_Overview to be written_

You can use graylog2 with logstash by using the 'gelf' output to send logstash
events to a graylog2 server. This gives you logstash's excellent input and
filter features while still being able to use the graylog2 web interface.
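
A sketch of such a configuration (the hostname is a placeholder for your
graylog2 server):

    output {
      gelf {
        host => "graylog2.example.com"
      }
    }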

# whoops

[whoops site](http://www.whoopsapp.com/)

_Overview to be written_

A logstash output to whoops is coming soon - <https://logstash.jira.com/browse/LOGSTASH-133>

# flume

[flume site](https://github.com/cloudera/flume/wiki)

Flume is primarily a transport system aimed at reliably copying logs from
application servers to HDFS.

You can use it with logstash by having a syslog sink configured to shoot logs
at a logstash syslog input.

# scribe

_Overview to be written_

@@ -1,41 +0,0 @@
---
title: Plugin Milestones - logstash
layout: content_right
---
# Plugin Milestones

Plugins (inputs/outputs/filters/codecs) have a milestone label in logstash.
This is to provide an indicator to the end-user as to the kinds of changes
a given plugin could have between logstash releases.

The desire here is to allow plugin developers to quickly iterate on possible
new plugins while conveying to the end-user a set of expectations about that
plugin.

## Milestone 1

Plugins at this milestone need your feedback to improve! Plugins at this
milestone may change between releases as the community figures out the best way
for the plugin to behave and be configured.

## Milestone 2

Plugins at this milestone are more likely to have backwards-compatibility with
previous releases than Milestone 1 plugins. This milestone also indicates
a greater level of in-the-wild usage by the community than the previous
milestone.

## Milestone 3

Plugins at this milestone have strong promises towards backwards-compatibility.
This is enforced with automated tests to ensure behavior and configuration are
consistent across releases.

## Milestone 0

This milestone appears at the bottom of the page because it is very
infrequently used.

This milestone marker is used to generally indicate that a plugin has no
active code maintainer, nor support from the community in terms
of getting help.

@@ -1,64 +0,0 @@
---
title: release notes for %VERSION%
layout: content_right
---

# %VERSION% - Release Notes

This document is targeted at existing users of Logstash who are upgrading from
an older version to version %VERSION%. This document is intended to supplement
the [changelog
file](https://github.com/elasticsearch/logstash/blob/v%VERSION%/CHANGELOG) by
providing more details on certain changes.

### tarball

With Logstash 1.4.0, we stopped shipping the jar file and started shipping a
tarball instead.

Past releases have been a single jar file which included all Ruby and Java
library dependencies to eliminate deployment pains. We still ship all
the dependencies for you! The jar file served us well, but over time we found
Java’s default heap size, garbage collector, and other settings weren’t well
suited to Logstash.

In order to provide better Java defaults, we’ve changed to releasing a tarball
(.tar.gz) that includes all the same dependencies. What does this mean to you?
Instead of running `java -jar logstash.jar ...` you run `bin/logstash ...` (for
Windows users, `bin/logstash.bat`).

One pleasant side effect of using a tarball is that the Logstash code itself is
much more accessible and able to satisfy any curiosity you may have.

The new way to do things is:

* Download the logstash tarball
* Unpack it (`tar -zxf logstash-%VERSION%.tar.gz`)
* `cd logstash-%VERSION%`
* Run it: `bin/logstash ...`

The old way to run logstash, `java -jar logstash.jar`, is now replaced with
`bin/logstash`. The command line arguments are exactly the same after that.
For example:

    # Old way:
    % java -jar logstash-1.3.3-flatjar.jar agent -f logstash.conf

    # New way:
    % bin/logstash agent -f logstash.conf

### plugins

Logstash has grown brilliantly over the past few years with great contributions
from the community. Now having 165 plugins, it became hard for us (the Logstash
engineering team) to reliably support all the wonderful technologies in each
contributed plugin. We combed through all the plugins and picked the ones we
felt strongly we could support, and those now ship by default with Logstash.

All the other plugins are now available in a contrib package. All plugins
continue to be open source and free, of course! Installing plugins is very easy:

    % cd /path/to/logstash-%VERSION%/
    % bin/plugin install [PLUGIN_NAME]

@@ -1,35 +0,0 @@
---
title: repositories - logstash
layout: content_right
---
# Logstash repositories

We also have Logstash available as APT and YUM repositories.

Our public signing key can be found on the [Elasticsearch packages apt GPG signing key page](https://packages.elasticsearch.org/GPG-KEY-elasticsearch).

## APT based distributions

Add the key:

    wget -O - https://packages.elasticsearch.org/GPG-KEY-elasticsearch | apt-key add -

Add the repo to /etc/apt/sources.list:

    deb http://packages.elasticsearch.org/logstash/1.4/debian stable main

## YUM based distributions

Add the key:

    rpm --import https://packages.elasticsearch.org/GPG-KEY-elasticsearch

Add the repo file to the /etc/yum.repos.d/ directory:

    [logstash-1.4]
    name=logstash repository for 1.4.x packages
    baseurl=https://packages.elasticsearch.org/logstash/1.4/centos
    gpgcheck=1
    gpgkey=https://packages.elasticsearch.org/GPG-KEY-elasticsearch
    enabled=1

@@ -1,14 +1,14 @@
[[advanced-pipeline]]
=== Setting Up an Advanced Logstash Pipeline

A Logstash pipeline in most use cases has one or more input, filter, and output plugins. The scenarios in this section
build Logstash configuration files to specify these plugins and discuss what each plugin is doing.

The Logstash configuration file defines your _Logstash pipeline_. When you start a Logstash instance, use the
`-f <path/to/file>` option to specify the configuration file that defines that instance’s pipeline.

A Logstash pipeline has two required elements, `input` and `output`, and one optional element, `filter`. The input
plugins consume data from a source, the filter plugins modify the data as you specify, and the output plugins write
the data to a destination.

image::static/images/basic_logstash_pipeline.png[]

@@ -24,13 +24,13 @@ input {
# The filter part of this file is commented out to indicate that it is
# optional.
# filter {
#
# }
output {
}
--------------------------------------------------------------------------------

This skeleton is non-functional, because the input and output sections don’t have any valid options defined. The
examples in this tutorial build configuration files to address specific use cases.

Paste the skeleton into a file named `first-pipeline.conf` in your home Logstash directory.

@@ -38,17 +38,17 @@ Paste the skeleton into a file named `first-pipeline.conf` in your home Logstash
[[parsing-into-es]]
==== Parsing Apache Logs into Elasticsearch

This example creates a Logstash pipeline that takes Apache web logs as input, parses those logs to create specific,
named fields from the logs, and writes the parsed data to an Elasticsearch cluster.

You can download the sample data set used in this example
https://download.elastic.co/demos/logstash/gettingstarted/logstash-tutorial.log.gz[here]. Unpack this file.

[float]
[[configuring-file-input]]
==== Configuring Logstash for File Input

To start your Logstash pipeline, configure the Logstash instance to read from a file using the
{logstash}plugins-inputs-file.html[file] input plugin.

Edit the `first-pipeline.conf` file to add the following text:

@@ -63,8 +63,8 @@ input {
}
--------------------------------------------------------------------------------

<1> The default behavior of the file input plugin is to monitor a file for new information, in a manner similar to the
UNIX `tail -f` command. To change this default behavior and process the entire file, we need to specify the position
where Logstash starts processing the file.

Replace `/path/to/` with the actual path to the location of `logstash-tutorial.log` in your file system.

@@ -73,22 +73,22 @@ Replace `/path/to/` with the actual path to the location of `logstash-tutorial.l
[[configuring-grok-filter]]
===== Parsing Web Logs with the Grok Filter Plugin

The {logstash}plugins-filters-grok.html[`grok`] filter plugin is one of several plugins that are available by default in
Logstash. For details on how to manage Logstash plugins, see the <<working-with-plugins,reference documentation>> for
the plugin manager.

Because the `grok` filter plugin looks for patterns in the incoming log data, configuration requires you to make
decisions about how to identify the patterns that are of interest to your use case. A representative line from the web
server log sample looks like this:

[source,shell]
--------------------------------------------------------------------------------
83.149.9.216 - - [04/Jan/2015:05:13:42 +0000] "GET /presentations/logstash-monitorama-2013/images/kibana-search.png
HTTP/1.1" 200 203023 "http://semicomplete.com/presentations/logstash-monitorama-2013/" "Mozilla/5.0 (Macintosh; Intel
Mac OS X 10_9_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/32.0.1700.77 Safari/537.36"
--------------------------------------------------------------------------------

The IP address at the beginning of the line is easy to identify, as is the timestamp in brackets. In this tutorial, use
the `%{COMBINEDAPACHELOG}` grok pattern, which structures lines from the Apache log using the following schema:

[horizontal]

@@ -123,7 +123,7 @@ After processing, the sample line has the following JSON representation:
{
"clientip" : "83.149.9.216",
"ident" : ,
"auth" : ,
"timestamp" : "04/Jan/2015:05:13:42 +0000",
"verb" : "GET",
"request" : "/presentations/logstash-monitorama-2013/images/kibana-search.png",

@@ -139,7 +139,7 @@ After processing, the sample line has the following JSON representation:
[[indexing-parsed-data-into-elasticsearch]]
===== Indexing Parsed Data into Elasticsearch

Now that the web logs are broken down into specific fields, the Logstash pipeline can index the data into an
Elasticsearch cluster. Edit the `first-pipeline.conf` file to add the following text after the `input` section:

[source,json]

@@ -152,17 +152,17 @@ output {

With this configuration, Logstash uses the http protocol to connect to Elasticsearch. The above example assumes that
Logstash and Elasticsearch are running on the same instance. You can specify a remote Elasticsearch instance using the
`hosts` configuration, like `hosts => "es-machine:9092"`.

[float]
[[configuring-geoip-plugin]]
===== Enhancing Your Data with the Geoip Filter Plugin

In addition to parsing log data for better searches, filter plugins can derive supplementary information from existing
data. As an example, the {logstash}plugins-filters-geoip.html[`geoip`] plugin looks up IP addresses, derives geographic
location information from the addresses, and adds that location information to the logs.

Configure your Logstash instance to use the `geoip` filter plugin by adding the following lines to the `filter` section
of the `first-pipeline.conf` file:

[source,json]

@@ -172,7 +172,7 @@ geoip {
}
--------------------------------------------------------------------------------

The `geoip` plugin configuration requires data that is already defined as separate fields. Make sure that the `geoip`
section is after the `grok` section of the configuration file.

Specify the name of the field that contains the IP address to look up. In this tutorial, the field name is `clientip`.
@ -335,11 +335,11 @@ Only one of the log entries comes from Buffalo, so the query produces a single r
|
|||
[[multiple-input-output-plugins]]
|
||||
==== Multiple Input and Output Plugins
|
||||
|
||||
The information you need to manage often comes from several disparate sources, and use cases can require multiple
|
||||
destinations for your data. Your Logstash pipeline can use multiple input and output plugins to handle these
|
||||
The information you need to manage often comes from several disparate sources, and use cases can require multiple
|
||||
destinations for your data. Your Logstash pipeline can use multiple input and output plugins to handle these
|
||||
requirements.
|
||||
|
||||
This example creates a Logstash pipeline that takes input from a Twitter feed and the Filebeat client, then
|
||||
This example creates a Logstash pipeline that takes input from a Twitter feed and the Filebeat client, then
|
||||
sends the information to an Elasticsearch cluster as well as writing the information directly to a file.
|
||||
|
||||
[float]
|
||||
|

To add a Twitter feed, you need several pieces of information:

* An _oauth token_, which identifies the Twitter account using this app.
* An _oauth token secret_, which serves as the password of the Twitter account.

Visit https://dev.twitter.com/apps to set up a Twitter account and generate your consumer key and secret, as well as
your OAuth token and secret.

Use this information to add the following lines to the `input` section of the `first-pipeline.conf` file:

[source,json]
--------------------------------------------------------------------------------
twitter {
    # placeholder values: substitute the keys and tokens you generated above
    consumer_key => "your_consumer_key"
    consumer_secret => "your_consumer_secret"
    keywords => ["your_keyword"]
    oauth_token => "your_oauth_token"
    oauth_token_secret => "your_oauth_token_secret"
}
--------------------------------------------------------------------------------

[[configuring-lsf]]
==== The Filebeat Client

The https://github.com/elastic/beats/tree/master/filebeat[filebeat] client is a lightweight, resource-friendly tool that
collects logs from files on the server and forwards these logs to your Logstash instance for processing. The
Filebeat client uses the secure Beats protocol, based on the lumberjack protocol, to communicate with your Logstash
instance. The lumberjack protocol is designed for reliability and low latency. Filebeat uses the computing resources of
the machine hosting the source data, and the {logstash}plugins-inputs-beats.html[Beats input] plugin minimizes the
resource demands on the Logstash instance.

NOTE: In a typical use case, Filebeat runs on a separate machine from the machine running your
Logstash instance. For the purposes of this tutorial, Logstash and Filebeat are running on the
same machine.

The default Logstash configuration includes the {logstash}plugins-inputs-beats.html[Beats input plugin], which is
designed to be resource-friendly. To install Filebeat on your data source machine, download the
appropriate package from the Filebeat https://www.elastic.co/downloads/beats/filebeat[product page].

Create a configuration file for Filebeat similar to the following example:

[source,yaml]
--------------------------------------------------------------------------------
output:
    ...
--------------------------------------------------------------------------------
<2> Path to the SSL certificate for the Logstash instance.

Save this configuration file as `filebeat.yml`.
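
For reference, a complete Filebeat configuration along these lines might look like the following sketch. This is an
assumption-laden reconstruction, not the original example: the log path, Logstash host and port, and certificate path
are placeholders, and the exact keys can differ between Filebeat versions.

[source,yaml]
--------------------------------------------------------------------------------
filebeat:
  prospectors:
    -
      paths:
        - /var/log/logstash-tutorial.log   # placeholder path to the logs to ship
output:
  logstash:
    hosts: ["localhost:5403"]              # placeholder host; port matches the one mentioned below
    tls:
      certificate_authorities:
        - /etc/pki/tls/certs/logstash.crt  # placeholder path to the Logstash SSL certificate
--------------------------------------------------------------------------------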

Configure your Logstash instance to use the Filebeat input plugin by adding the following lines to the `input` section
of the `first-pipeline.conf` file:

[source,json]
--------------------------------------------------------------------------------
beats {
    port => 5403
    # add ssl settings matching the certificate configured in filebeat.yml
}
--------------------------------------------------------------------------------

[[logstash-file-output]]
==== Writing Logstash Data to a File

You can configure your Logstash pipeline to write data directly to a file with the
{logstash}plugins-outputs-file.html[`file`] output plugin.

Configure your Logstash instance to use the `file` output plugin by adding the following lines to the `output` section
of the `first-pipeline.conf` file:

[source,json]
--------------------------------------------------------------------------------
file {
    path => "/path/to/target/file"   # placeholder path for the output file
}
--------------------------------------------------------------------------------

[[multiple-es-nodes]]
==== Writing to multiple Elasticsearch nodes

Writing to multiple Elasticsearch nodes lightens the resource demands on a given Elasticsearch node, as well as
providing redundant points of entry into the cluster when a particular node is unavailable.

To configure your Logstash instance to write to multiple Elasticsearch nodes, edit the output section of the `first-pipeline.conf` file to read:

[source,json]
--------------------------------------------------------------------------------
output {
    elasticsearch {
        hosts => ["IP Address 1:port1", "IP Address 2:port2", "IP Address 3"]
    }
}
--------------------------------------------------------------------------------

Use the IP addresses of three non-master nodes in your Elasticsearch cluster in the `hosts` line. When the `hosts`
parameter lists multiple IP addresses, Logstash load-balances requests across the list of addresses. Also note that the
default port for Elasticsearch is `9200` and can be omitted in the configuration above.

Logstash is consuming data from the Twitter feed you configured, receiving data from Filebeat, and
indexing this information to three nodes in an Elasticsearch cluster as well as writing to a file.

At the data source machine, run Filebeat with the following command:

[source,shell]
--------------------------------------------------------------------------------
sudo ./filebeat -e -c filebeat.yml -d "publish"
--------------------------------------------------------------------------------

Filebeat will attempt to connect on port 5403. Until Logstash starts with an active Beats plugin, there
won’t be any answer on that port, so any messages you see regarding failure to connect on that port are normal for now.

To verify your configuration, run Logstash with the `--configtest` flag, which checks the configuration for valid
syntax and then exits. A command along the following lines should work, assuming the configuration file is in the
current directory:
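
[source,shell]
--------------------------------------------------------------------------------
bin/logstash -f first-pipeline.conf --configtest
--------------------------------------------------------------------------------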

The following conditions affect the shutdown process:

* An input plugin receiving data at a slow pace.
* A slow filter, like a Ruby filter executing `sleep(10000)` or an Elasticsearch filter that is executing a very heavy
query.
* A disconnected output plugin that is waiting to reconnect to flush in-flight events.

These situations make the duration and success of the shutdown process unpredictable.

Logstash has a stall detection mechanism that analyzes the behavior of the pipeline and plugins during shutdown.
This mechanism produces periodic information about the count of in-flight events in internal queues and a list of busy
worker threads.

To enable Logstash to forcibly terminate in the case of a stalled shutdown, use the `--allow-unsafe-shutdown` flag when
you start Logstash.
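
For example (a sketch; the configuration file name is a placeholder):

[source,shell]
--------------------------------------------------------------------------------
bin/logstash -f first-pipeline.conf --allow-unsafe-shutdown
--------------------------------------------------------------------------------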

[[shutdown-stall-example]]
==== Stall Detection Example

The following output shows a shutdown stalled by a Ruby filter executing `sleep 10000`:

--------------------------------------------------------------------------------
Logstash startup completed
Received shutdown signal, but pipeline is still waiting for in-flight events
to be processed. Sending another ^C will force quit Logstash, but this may cause
data loss. {:level=>:warn}
{:level=>:warn, "INFLIGHT_EVENT_COUNT"=>{"input_to_filter"=>20, "total"=>20},
"STALLING_THREADS"=>
{["LogStash::Filters::Ruby", {"code"=>"sleep 10000"}]=>[{"thread_id"=>15,
"name"=>"|filterworker.0", "current_call"=>"
(ruby filter code):1:in `sleep'"}]}}
The shutdown process appears to be stalled due to busy or blocked plugins. Check
the logs for more information.
{:level=>:error}
{:level=>:warn, "INFLIGHT_EVENT_COUNT"=>{"input_to_filter"=>20, "total"=>20},
"STALLING_THREADS"=>
{["LogStash::Filters::Ruby", {"code"=>"sleep 10000"}]=>[{"thread_id"=>15,
"name"=>"|filterworker.0", "current_call"=>"
(ruby filter code):1:in `sleep'"}]}}
{:level=>:warn, "INFLIGHT_EVENT_COUNT"=>{"input_to_filter"=>20, "total"=>20},
"STALLING_THREADS"=>
{["LogStash::Filters::Ruby", {"code"=>"sleep 10000"}]=>[{"thread_id"=>15,
"name"=>"|filterworker.0", "current_call"=>"
(ruby filter code):1:in `sleep'"}]}}
Forcefully quitting logstash.. {:level=>:fatal}
--------------------------------------------------------------------------------

[[breaking-changes]]
== Breaking changes

Version 2.0 of Logstash has some changes that are incompatible with previous versions of Logstash. This section discusses
what you need to be aware of when migrating to this version.

[float]
== Elasticsearch Output Default

Starting with the 2.0 release of Logstash, the default Logstash output for Elasticsearch is HTTP. To use the `node` or
`transport` protocols, download the https://www.elastic.co/guide/en/logstash/2.0/plugins-outputs-elasticsearch_java.html[Elasticsearch Java plugin]. The
Logstash HTTP output to Elasticsearch now supports sniffing.

NOTE: The `elasticsearch_java` plugin has two versions specific to the version of the underlying Elasticsearch cluster.
Be sure to specify the correct value for the `--version` option during installation:

* For Elasticsearch versions before 2.0, use the command
`bin/plugin install --version 1.5.x logstash-output-elasticsearch_java`
* For Elasticsearch versions 2.0 and after, use the command
`bin/plugin install --version 2.0.0 logstash-output-elasticsearch_java`

[float]
=== Elasticsearch Output Configuration Changes

The Elasticsearch output plugin configuration has the following changes:

* The `host` configuration option is now `hosts`, allowing you to specify multiple hosts and associated ports in the
`myhost:9200` format
* Removed options: `bind_host`, `bind_port`, `cluster`, `embedded`, `embedded_http_port`, `port`, `sniffing_delay`
* The `max_inflight_requests` option, which was deprecated in the 1.5 release, is now removed

Configuration files with these settings present are invalid and prevent Logstash from starting.
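
As an illustration of the `host` to `hosts` migration (a hedged sketch, not from the original text; the hostnames are
placeholders):

[source,json]
--------------------------------------------------------------------------------
# Logstash 1.5 and earlier:
output {
    elasticsearch {
        host => "myhost"
    }
}

# Logstash 2.0:
output {
    elasticsearch {
        hosts => ["myhost:9200", "myhost2:9200"]
    }
}
--------------------------------------------------------------------------------
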
=== Kafka Output Configuration Changes

The 2.0 release of Logstash includes a new version of the Kafka output plugin with significant configuration changes.
Please compare the documentation pages for the
https://www.elastic.co/guide/en/logstash/1.5/plugins-outputs-kafka.html[Logstash 1.5] and
https://www.elastic.co/guide/en/logstash/2.0/plugins-outputs-kafka.html[Logstash 2.0] versions of the Kafka output plugin
and update your configuration files accordingly.

[float]
=== Metrics Filter Changes

Prior implementations of the metrics filter plugin used dotted field names. Beginning with version 2.0, Elasticsearch
does not allow field names to contain dots, so this plugin now uses sub-fields instead of dotted field names. Note
that these changes make version 3.0.0 of the metrics filter plugin incompatible with previous releases.

[float]
=== Filter Worker Default Change

Starting with the 2.0 release of Logstash, the default value of the `filter_workers` configuration option for filter
plugins is half of the available CPU cores, instead of 1. This change increases parallelism in filter execution for
resource-intensive filtering operations. You can continue to use the `-w` flag to manually set the value for this option,
as in previous releases.
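
For example, to restore the previous single-worker behavior (the configuration file name is a placeholder):

[source,shell]
--------------------------------------------------------------------------------
bin/logstash -w 1 -f first-pipeline.conf
--------------------------------------------------------------------------------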

Logstash has the following flags. You can use the `--help` flag to display this information.

-t, --configtest
  Checks the configuration for valid syntax and then exits. Note that grok patterns are not checked for
  correctness with this flag.
  Logstash can read multiple config files from a directory. If you combine this
  flag with `--debug`, Logstash will log the combined config file, annotating the
  individual config blocks with the source file it came from.

-h, --help
  Print help

-v
  *DEPRECATED: see --verbose/debug* Increase verbosity. There are multiple levels of verbosity.

The plugin API offers three methods for plugin shutdown: `stop`, `stop?`, and `close`.

* Call the `stop` method from outside the plugin thread. This method signals the plugin to stop.
* The `stop?` method returns `true` when the `stop` method has already been called for that plugin.
* The `close` method performs final bookkeeping and cleanup after the plugin's `run` method and the plugin's thread both
exit. The `close` method is a new name for the method known as `teardown` in previous versions of Logstash.

The `shutdown`, `finished`, `finished?`, `running?`, and `terminating?` methods are redundant and no longer present in the
Plugin Base class.

Sample code for the new plugin shutdown APIs is https://github.com/logstash-plugins/logstash-input-example/blob/master/lib/logstash/inputs/example.rb[available].

[[deploying-and-scaling]]
=== Deploying and Scaling Logstash

As your use case for Logstash evolves, the preferred architecture at a given scale will change. This section discusses
a range of Logstash architectures in increasing order of complexity, starting from a minimal installation and adding
elements to the system. The example deployments in this section write to an Elasticsearch cluster, but Logstash can
write to a large variety of {logstash}output-plugins.html[endpoints].

[float]
[[deploying-minimal-install]]
==== The Minimal Installation

The minimal Logstash installation has one Logstash instance and one Elasticsearch instance. These instances are
directly connected. Logstash uses an {logstash}input-plugins.html[_input plugin_] to ingest data and an
Elasticsearch {logstash}output-plugins.html[_output plugin_] to index the data in Elasticsearch, following the Logstash
{logstash}pipeline.html[_processing pipeline_]. A Logstash instance has a fixed pipeline constructed at startup,
based on the instance’s configuration file. You must specify an input plugin. Output defaults to `stdout`, and the
filtering section of the pipeline, which is discussed in the next section, is optional.

image::static/images/deploy_1.png[]
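
A configuration along these lines illustrates the minimal deployment (a sketch; the `file` path and Elasticsearch host
are placeholders):

[source,json]
--------------------------------------------------------------------------------
input {
    file {
        path => "/var/log/messages"
    }
}
output {
    elasticsearch {
        hosts => ["localhost:9200"]
    }
}
--------------------------------------------------------------------------------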

[[deploying-filter-threads]]
==== Using Filters

Log data is typically unstructured, often contains extraneous information that isn’t relevant to your use case, and
sometimes is missing relevant information that can be derived from the log contents. You can use a
{logstash}filter-plugins.html[filter plugin] to parse the log into fields, remove unnecessary information, and derive
additional information from the existing fields. For example, filters can derive geolocation information from an IP
address and add that information to the logs, or parse and structure arbitrary text with the
{logstash}plugins-filters-grok.html[grok] filter.

Adding a filter plugin can significantly affect performance, depending on the amount of computation the filter plugin
performs, as well as on the volume of the logs being processed. The `grok` filter’s regular expression computation is
particularly resource-intensive. One way to address this increased demand for computing resources is to use
parallel processing on multicore machines. Use the `-w` switch to set the number of execution threads for Logstash
filtering tasks. For example, the `bin/logstash -w 8` command uses eight different threads for filter processing.

image::static/images/deploy_2.png[]

==== Using Filebeat

https://www.elastic.co/guide/en/beats/filebeat/current/index.html[Filebeat] is a lightweight, resource-friendly tool
written in Go that collects logs from files on the server and forwards these logs to other machines for processing.
Filebeat uses the https://www.elastic.co/guide/en/beats/libbeat/current/index.html[Beats] protocol to communicate with a
centralized Logstash instance. Configure the Logstash instances that receive Beats data to use the
{logstash}plugins-inputs-beats.html[Beats input plugin].

Filebeat uses the computing resources of the machine hosting the source data, and the Beats input plugin minimizes the
resource demands on the Logstash instance.

image::static/images/deploy_3.png[]

[[deploying-larger-cluster]]
==== Scaling to a Larger Elasticsearch Cluster

Typically, Logstash does not communicate with a single Elasticsearch node, but with a cluster that comprises several
nodes. By default, Logstash uses the HTTP protocol to move data into the cluster.

You can use the Elasticsearch HTTP REST APIs to index data into the Elasticsearch cluster. These APIs represent the
indexed data in JSON. Using the REST APIs does not require the Java client classes or any additional JAR
files and has no performance disadvantages compared to the transport or node protocols. You can secure communications
that use the HTTP REST APIs with the {shield}[Shield] plugin, which supports SSL and HTTP basic authentication.

When you use the HTTP protocol, you can configure the Logstash Elasticsearch output plugin to automatically
load-balance indexing requests across a specified set of hosts in the Elasticsearch cluster. Specifying multiple
Elasticsearch nodes also provides high availability for the Elasticsearch cluster by routing traffic to active
Elasticsearch nodes.

You can also use the Elasticsearch Java APIs to serialize the data into a binary representation, using
the transport protocol. The transport protocol can sniff the endpoint of the request and select an
arbitrary client or data node in the Elasticsearch cluster.

Using the HTTP or transport protocols keeps your Logstash instances separate from the Elasticsearch cluster. The node
protocol, by contrast, has the machine running the Logstash instance join the Elasticsearch cluster by running an
Elasticsearch instance. The data that needs indexing propagates from this node to the rest of the cluster. Since the
machine is part of the cluster, the cluster topology is available, making the node protocol a good fit for use cases
that use a relatively small number of persistent connections.

You can also use a third-party hardware or software load balancer to handle connections between Logstash and
external applications.

NOTE: Make sure that your Logstash configuration does not connect directly to Elasticsearch dedicated
{ref}modules-node.html[master nodes], which perform dedicated cluster management. Connect Logstash to client or data
nodes to protect the stability of your Elasticsearch cluster.

image::static/images/deploy_4.png[]

[[deploying-message-queueing]]
==== Managing Throughput Spikes with Message Queueing

When the data coming into a Logstash pipeline exceeds the Elasticsearch cluster's ability to ingest the data, you can
use a message queue as a buffer. By default, Logstash throttles incoming events when
indexer consumption rates fall below incoming data rates. Since this throttling can lead to events being buffered at
the data source, preventing backpressure with message queues becomes an important part of managing your deployment.

Adding a message queue to your Logstash deployment also provides a level of protection from data loss. When a Logstash
instance that has consumed data from the message queue fails, the data can be replayed from the message queue to an
active Logstash instance.

Several third-party message queues exist, such as Redis, Kafka, or RabbitMQ. Logstash provides input and output plugins
to integrate with several of these third-party message queues. When your Logstash deployment has a message queue
configured, Logstash functionally exists in two phases: shipping instances, which handle data ingestion and storage in
the message queue, and indexing instances, which retrieve the data from the message queue, apply any configured
filtering, and write the filtered data to an Elasticsearch index.

image::static/images/deploy_5.png[]
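
As an illustration (a sketch, not from the original text; the hosts, key name, path, and grok pattern are placeholders),
a Redis-buffered deployment might split into two configurations along these lines:

[source,json]
--------------------------------------------------------------------------------
# Shipping instance: read logs and push them onto the queue.
input {
    file {
        path => "/var/log/messages"
    }
}
output {
    redis {
        host => "queue-host"
        data_type => "list"
        key => "logstash"
    }
}
--------------------------------------------------------------------------------

[source,json]
--------------------------------------------------------------------------------
# Indexing instance: pop events off the queue, filter, and index.
input {
    redis {
        host => "queue-host"
        data_type => "list"
        key => "logstash"
    }
}
filter {
    grok {
        match => { "message" => "%{SYSLOGLINE}" }
    }
}
output {
    elasticsearch {
        hosts => ["es-host:9200"]
    }
}
--------------------------------------------------------------------------------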

[[deploying-logstash-ha]]
==== Multiple Connections for Logstash High Availability

To make your Logstash deployment more resilient to individual instance failures, you can set up a load balancer between
your data source machines and the Logstash cluster. The load balancer handles the individual connections to the
Logstash instances to ensure continuity of data ingestion and processing even when an individual instance is unavailable.

image::static/images/deploy_6.png[]

The architecture in the previous diagram is unable to process input from a specific type, such as an RSS feed or a
file, if the Logstash instance dedicated to that input type becomes unavailable. For more robust input processing,
configure each Logstash instance for multiple inputs, as in the following diagram:

image::static/images/deploy_7.png[]

This architecture parallelizes the Logstash workload based on the inputs you configure. With more inputs, you can add
more Logstash instances to scale horizontally. Separate parallel pipelines also increase the reliability of your stack
by eliminating single points of failure.

A mature Logstash deployment typically has the following pipeline:

* The _filter_ tier applies parsing and other processing to the data consumed from the message queue.
* The _indexing_ tier moves the processed data into Elasticsearch.

Any of these layers can be scaled by adding computing resources. Examine the performance of these components regularly
as your use case evolves and add resources as needed. When Logstash routinely throttles incoming events, consider
adding storage for your message queue. Alternately, increase the Elasticsearch cluster's rate of data consumption by
adding more Logstash indexing instances.

[[getting-started-with-logstash]]
== Getting Started with Logstash

This section guides you through the process of installing Logstash and verifying that everything is running properly.
Later sections deal with increasingly complex configurations to address selected use cases.

[float]
[[installing-logstash]]
=== Install Logstash

NOTE: Logstash requires Java 7 or later. Use the
http://www.oracle.com/technetwork/java/javase/downloads/index.html[official Oracle distribution] or an open-source
distribution such as http://openjdk.java.net/[OpenJDK].

To check your Java version, run the following command:
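
[source,shell]
--------------------------------------------------------------------------------
java -version
--------------------------------------------------------------------------------

On a system with Java installed, this prints version details ending with a line such as:

[source,shell]
--------------------------------------------------------------------------------
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
--------------------------------------------------------------------------------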

[[installing-binary]]
==== Installing from a downloaded binary

Download the https://www.elastic.co/downloads/logstash[Logstash installation file] that matches your host environment.
Unpack the file. On supported Linux operating systems, you can <<package-repositories,use a package manager>> to
install Logstash.

[[first-event]]
==== Stashing Your First Event

To test your Logstash installation, run the most basic Logstash pipeline:

[source,shell]
--------------------------------------------------------------------------------
cd logstash-{logstash_version}
bin/logstash -e 'input { stdin { } } output { stdout {} }'
--------------------------------------------------------------------------------

The `-e` flag enables you to specify a configuration directly from the command line. Specifying configurations at the
command line lets you quickly test configurations without having to edit a file between iterations.
This pipeline takes input from the standard input, `stdin`, and moves that input to the standard output, `stdout`, in a
structured format. Type `hello world` at the command prompt to see Logstash respond:

[source,shell]
--------------------------------------------------------------------------------
hello world
2013-11-21T01:22:14.405+0000 0.0.0.0 hello world
--------------------------------------------------------------------------------

Logstash adds timestamp and IP address information to the message. Exit Logstash by issuing a *CTRL-D* command in the
shell where Logstash is running.

The <<advanced-pipeline,Advanced Tutorial>> expands the capabilities of your Logstash instance to cover broader
use cases.

Collect more, so you can know more. Logstash welcomes data of all shapes and sizes.

Where it all started.

* Handle all types of logging data
** Easily ingest a multitude of web logs like <<parsing-into-es,Apache>>, and application
logs like <<plugins-inputs-log4j,log4j>> for Java
** Capture many other log formats like <<plugins-inputs-syslog,syslog>>,
<<plugins-inputs-eventlog,Windows event logs>>, networking and firewall logs, and more
* Enjoy complementary secure log forwarding capabilities with https://github.com/elastic/beats/tree/master/filebeat[Filebeat]
* Collect metrics from <<plugins-inputs-ganglia,Ganglia>>, <<plugins-codecs-collectd,collectd>>,
<<plugins-codecs-netflow,NetFlow>>, <<plugins-inputs-jmx,JMX>>, and many other infrastructure
and application platforms over <<plugins-inputs-tcp,TCP>> and <<plugins-inputs-udp,UDP>>

[float]
== The Web

Unlock the World Wide Web.

* Transform <<plugins-inputs-http,HTTP requests>> into events
(https://www.elastic.co/blog/introducing-logstash-input-http-plugin[blog])
** Consume from web service firehoses like <<plugins-inputs-twitter,Twitter>> for social sentiment analysis
** Webhook support for GitHub, HipChat, JIRA, and countless other applications
** Enables many https://www.elastic.co/guide/en/watcher/current/logstash-integration.html[Watcher] alerting use cases
* Create events by polling <<plugins-inputs-http_poller,HTTP endpoints>> on demand
(https://www.elastic.co/blog/introducing-logstash-http-poller[blog])
** Universally capture health, performance, metrics, and other types of data from web application interfaces
** Perfect for scenarios where the control of polling is preferred over receiving

[float]
== Data Stores and Streams

Discover more value from the data you already own.

* Better understand your data from any relational database or NoSQL store with a
<<plugins-inputs-jdbc,JDBC>> interface (https://www.elastic.co/blog/logstash-jdbc-input-plugin[blog])
* Unify diverse data streams from messaging queues like Apache <<plugins-outputs-kafka,Kafka>>
(https://www.elastic.co/blog/logstash-kafka-intro[blog]), <<plugins-outputs-rabbitmq,RabbitMQ>>,
<<plugins-outputs-sqs,Amazon SQS>>, and <<plugins-outputs-zeromq,ZeroMQ>>

[float]
== Sensors and IoT

Explore an expansive breadth of other data.

* In this age of technological advancement, the massive IoT world unleashes endless use cases through capturing and
harnessing data from connected sensors.
* Logstash is the common event collection backbone for ingestion of data shipped from mobile devices to intelligent
homes, connected vehicles, healthcare sensors, and many other industry-specific applications.
* https://www.elastic.co/elasticon/2015/sf/if-it-moves-measure-it-logging-iot-with-elk[Watch] as Logstash, in
conjunction with the broader ELK stack, centralizes and enriches sensor data to gain deeper knowledge regarding a
residential home.

[float]
== Easily Enrich Everything

The better the data, the better the knowledge. Clean and transform your data during ingestion to gain near real-time
insights immediately at index or output time. Logstash comes out-of-the-box with many aggregations and mutations along
with pattern matching, geo mapping, and dynamic lookup capabilities.

* <<plugins-filters-grok,Grok>> is the bread and butter of Logstash filters and is used ubiquitously to derive
structure out of unstructured data (see the sketch after this list). Enjoy a wealth of integrated patterns aimed to help quickly resolve web, systems,
networking, and other types of event formats.
* Expand your horizons by deciphering <<plugins-filters-geoip,geo coordinates>> from IP addresses, normalizing
<<plugins-filters-date,date>> complexity, simplifying <<plugins-filters-kv,key-value pairs>> and
<<plugins-filters-csv,CSV>> data, <<plugins-filters-anonymize,anonymizing>> sensitive information, and further
enriching your data with <<plugins-filters-translate,local lookups>> or Elasticsearch
<<plugins-filters-elasticsearch,queries>>.
* Codecs are often used to ease the processing of common event structures like <<plugins-codecs-json,JSON>>
and <<plugins-codecs-multiline,multiline>> events.
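
For instance, a minimal grok filter along these lines (a sketch; the pattern choice is illustrative) turns a raw Apache
access log line into structured fields:

[source,json]
--------------------------------------------------------------------------------
filter {
    grok {
        match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
}
--------------------------------------------------------------------------------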

[float]
== Choose Your Stash

Route your data where it matters most. Unlock various downstream analytical and operational use cases by storing,
analyzing, and taking action on your data.

[cols="a,a"]
|=======================================================================
|
*Analysis*

...

|
*Archiving*

* <<plugins-outputs-s3,S3>>
* <<plugins-outputs-google_cloud_storage,Google Cloud Storage>>

|
*Monitoring*

* <<plugins-outputs-nagios,Nagios>>
* <<plugins-outputs-ganglia,Ganglia>>
* <<plugins-outputs-datadog,Datadog>>
* <<plugins-outputs-cloudwatch,CloudWatch>>

|
*Alerting*

...

|=======================================================================

== Glossary
Logstash Glossary

apache::

agent::

|
||||
broker ::
|
||||
An intermediary used in a multi-tiered Logstash deployment which allows a queueing mechanism to be used. Examples of brokers are Redis, RabbitMQ, and Apache Kafka. This pattern is a common method of building fault-tolerance into a Logstash architecture.
|
||||
An intermediary used in a multi-tiered Logstash deployment which allows a queueing mechanism to be used. Examples of brokers are Redis, RabbitMQ, and Apache Kafka. This pattern is a common method of building fault-tolerance into a Logstash architecture.
|
||||
|
||||
buffer::
|
||||
Within Logstash, a temporary storage area where events can queue up, waiting to be processed. The default queue size is 20 events, but it is not recommended to increase this, as Logstash is not designed to operate as a queueing mechanism.
|
||||
|
conditional::
In a computer programming context, a control flow which executes certain actions based on true/false values of a statement (called the condition). Often expressed in the form of "if ... then ... (elseif ...) else". Logstash has built-in conditionals to allow users control of the plugin pipeline.

elasticsearch::
An open-source, Lucene-based, RESTful search and analytics engine written in Java, with supported clients in various languages such as Perl, Python, Ruby, Java, etc.

event::
In Logstash parlance, a single unit of information, containing a timestamp plus additional data. An event arrives via an input, and is subsequently parsed, timestamped, and passed through the Logstash pipeline.

file::
A resource storing binary data (which might be text, image, application, etc.) on a physical storage media. In the Logstash context, a common input source which monitors a growing collection of text-based log lines.

filter::
An intermediary processing mechanism in the Logstash pipeline. Typically, filters act upon event data after it has been ingested via inputs, by mutating, enriching, and/or modifying the data according to configuration rules. The second phase of the typical Logstash pipeline (inputs->filters->outputs).

fluentd::
Like Logstash, another open-source tool for collecting logs and events, with plugins to extend functionality.

indexer::
Refers to a Logstash instance which is tasked with interfacing with an Elasticsearch cluster in order to index event data.

input::
The means for ingesting data into Logstash. Inputs allow users to pull data from files, network sockets, other applications, etc. The initial phase of the typical Logstash pipeline (inputs->filters->outputs).

jar / jarfile::
A packaging method for Java libraries. Since Logstash runs on the JRuby runtime environment, it is possible to use these Java libraries to provide extra functionality to Logstash.

java::
An object-oriented programming language popular for its flexibility, extensibility, and portability.

jRuby::
JRuby is a 100% Java implementation of the Ruby programming language, which allows Ruby to run in the JVM. Logstash typically runs in JRuby, which provides it with a fast, extensible runtime environment.

kibana::
A visual tool for viewing time-based data which has been stored in Elasticsearch. Kibana features a powerful set of functionality based on panels which query Elasticsearch in different ways.

lumberjack::
A protocol for shipping logs from one location to another, in a secure and optimized manner. Also the (deprecated) name of a software application, now known as Logstash Forwarder (LSF).

output::
The means for passing event data out of Logstash into other applications, network endpoints, files, etc. The last phase of the typical Logstash pipeline (inputs->filters->outputs).

pipeline::
A term used to describe the flow of events through the Logstash workflow. The pipeline typically consists of a series of inputs, filters, and outputs.

type::
In Elasticsearch, a type can be compared to a table in a relational database. Each type has a list of fields that can be specified for documents of that type. The mapping defines how each field in the document is analyzed. To index documents, it is required to specify both an index and a type.

worker::
The filter thread model used by Logstash, where each worker receives an event and applies all filters, in order, before emitting the event to the output queue. This allows scalability across CPUs because many filters are CPU intensive (permitting that we have thread safety).

[[community-maintainer]]
== Logstash Plugins Community Maintainer Guide

This document, to be read by new Maintainers, should explain their responsibilities. It was inspired by the
http://rfc.zeromq.org/spec:22[C4] document from the ZeroMQ project. This document is subject to change, and suggestions
through Pull Requests and issues are strongly encouraged.

=== Contribution Guidelines

For general guidance around contributing to Logstash Plugins, see the
https://www.elastic.co/guide/en/logstash/current/contributing-to-logstash.html[_Contributing to Logstash_] section.

=== Document Goals

To help make the Logstash plugins community participation easy with positive feedback.

To increase diversity.

To reduce code review, merge and release dependencies on the core team by providing support and tools to the Community and
Maintainers.

To support the natural life cycle of a plugin.

To codify the roles and responsibilities of Maintainers and Contributors, with specific focus on patch testing, code
review, merging, and release.

=== Development Workflow

All Issues and Pull Requests must be tracked using the GitHub issue tracker.

The plugin uses the http://www.apache.org/licenses/LICENSE-2.0[Apache 2.0 license]. Maintainers should check whether a
patch introduces code which has an incompatible license. Patch ownership and copyright are defined in the Elastic
https://www.elastic.co/contributor-agreement[Contributor License Agreement] (CLA).

==== Terminology

A "Contributor" is a role a person assumes when providing a patch. Contributors will not have commit access to the
repository. They need to sign the Elastic https://www.elastic.co/contributor-agreement[Contributor License Agreement]
before a patch can be reviewed. Contributors can add themselves to the plugin Contributor list.

A "Maintainer" is a role a person assumes when maintaining a plugin and keeping it healthy, including triaging issues, and
reviewing and merging patches.

==== Patch Requirements

A patch is a minimal and accurate answer to exactly one identified and agreed-upon problem. It must conform to the code
style guidelines and must include RSpec tests that verify the fitness of the solution.

A patch will be automatically tested by a CI system that will report on the Pull Request status.

A patch CLA will be automatically verified and reported on the Pull Request status.

A patch commit message has a single short (less than 50 character) first line summarizing the change, a blank second line,
and any additional lines as necessary for change explanation and rationale.

A patch is mergeable when it satisfies the above requirements and has been reviewed positively by at least one other
person.

==== Development Process

A user will log an issue on the issue tracker describing the problem they face or observe with as much detail as possible.

To work on an issue, a Contributor forks the plugin repository and then works on their forked repository and submits a
patch by creating a pull request back to the plugin.

Maintainers must not merge patches where the author has not signed the CLA.

Before a patch can be accepted, it should be reviewed. Maintainers should merge accepted patches without delay.

Maintainers should not merge their own patches except in exceptional cases, such as non-responsiveness from other
Maintainers or the core team for an extended period (more than 2 weeks).

Reviewer’s comments should not be based on personal preferences.

Review non-source changes, such as documentation, in the same way as source code changes.

==== Branch Management

The plugin has a master branch that always holds the latest in-progress version and should always build. Topic branches
should be kept to a minimum.

=== Versioning Plugins

Logstash core and its plugins have separate product development lifecycles. Hence the versioning and release strategy
for the core and plugins do not have to be aligned. In fact, this was one of our goals during the great separation of
plugins work in Logstash 1.5.

At times, there will be changes in the core Logstash API which require a mass update of plugins to reflect the changes
in core. However, this does not happen frequently.

For plugins, we would like to adhere to a versioning and release strategy that can better inform our users about any
breaking changes to the Logstash configuration formats and functionality.

Plugin releases follow a three-place numbering scheme, X.Y.Z, where X denotes a major release version which may break
compatibility with existing configuration or functionality, Y denotes releases which add backward-compatible features,
and Z denotes releases which contain bug fixes and patches.

==== Changing the version

The version can be changed in the Gemspec, which needs to be associated with a changelog entry. Following this, we can
publish the gem to RubyGems.org manually. At this point only the core developers can publish a gem.
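
As an illustration, the manual flow looks roughly like this (a sketch; the plugin and file names below are hypothetical
placeholders):

[source,shell]
----------------------------------
# Bump `s.version` in the gemspec (for example to "1.0.1") and add a matching
# changelog entry, then build and publish the gem:
gem build logstash-output-example.gemspec
gem push logstash-output-example-1.0.1.gem
----------------------------------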

==== Labeling

Labeling is a critical aspect of maintaining plugins. All issues in GitHub should be labeled correctly so that labels
can:

* Provide good feedback to users/developers
* Help prioritize changes
* Be used in release notes

Most labels are self-explanatory, but here’s a quick recap of a few important labels:

* `bug`: Labels an issue as an unintentional defect
* `needs details`: If the issue reporter has incomplete details, please ask them for more info and label the issue as
needs details
* `missing cla`: Contributor License Agreement is missing and patch cannot be accepted without it
* `adopt me`: Ask for help from the community to take over this issue

=== Logging

Although it’s important not to bog down performance with excessive logging, debug level logs can be immensely helpful
when diagnosing and troubleshooting issues with Logstash. Please remember to liberally add debug logs wherever it makes
sense, as users will be forever gracious.
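
For example, a plugin might guard a debug statement like this (a sketch of the common convention; the message and
fields here are hypothetical):

[source,ruby]
----------------------------------
# Guard the call so the metadata hash is only built when debug logging is enabled.
@logger.debug? && @logger.debug("Filtered events", :count => events.count)
----------------------------------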

[qanda]
Why is a https://www.elastic.co/contributor-agreement[CLA] required?::
We ask this of all Contributors in order to assure our users of the origin and continuing existence of the code. We
are not asking Contributors to assign copyright to us, but to give us the right to distribute a Contributor’s code
without restriction.

Please make sure the CLA is signed by every Contributor prior to reviewing PRs and commits.::
Contributors only need to sign the CLA once and should sign with the same email as used in GitHub. If a Contributor
signs the CLA after a PR is submitted, they can refresh the automated CLA checker by pushing another comment on the PR
5 minutes or more after signing.

=== Community Administration

The core team is there to support the plugin Maintainers and overall ecosystem.

Maintainers should propose Contributors to become a Maintainer.

Contributors and Maintainers should follow the Elastic Community https://www.elastic.co/community/codeofconduct[Code of
Conduct]. The core team should block or ban "bad actors".

[[multiline]]
=== Managing Multiline Events

Several use cases generate events that span multiple lines of text. In order to correctly handle these multiline
events, Logstash needs to know how to tell which lines are part of a single event.

Multiline event processing is complex and relies on proper event ordering. The best way to guarantee ordered log
processing is to implement the processing as early in the pipeline as possible. The preferred tool in the Logstash
pipeline is the {logstash}plugins-codecs-multiline.html[multiline codec], which merges lines from a single input using
a simple set of rules.

The most important aspects of configuring either multiline plugin are the following:

* The `pattern` option specifies a regular expression. Lines that match the specified regular expression are considered
either continuations of a previous line or the start of a new multiline event. You can use
{logstash}plugins-filters-grok.html[grok] regular expression templates with this configuration option.
* The `what` option takes two values: `previous` or `next`. The `previous` value specifies that lines that match the
value in the `pattern` option are part of the previous line. The `next` value specifies that lines that match the value
in the `pattern` option are part of the following line.
* The `negate` option applies the multiline codec to lines that _do not_ match the regular expression specified in the
`pattern` option.

See the full documentation for the {logstash}plugins-codecs-multiline.html[multiline codec] or the
{logstash}plugins-filters-multiline.html[multiline filter] plugin for more information on configuration options.

NOTE: For more complex needs, the {logstash}plugins-filters-multiline.html[multiline filter] performs a similar task at
the filter stage of processing, where the Logstash instance aggregates multiple inputs.
The multiline filter plugin is not thread-safe. Avoid using multiple filter workers with the multiline filter. You can
track the progress of upgrades to the functionality of the multiline codec at
https://github.com/logstash-plugins/logstash-codec-multiline/issues/10[this GitHub issue].

==== Examples of Multiline Plugin Configuration

The examples in this section cover the following use cases:

* Combining a Java stack trace into a single event
* Combining C-style line continuations into a single event
* Combining multiple lines from time-stamped events

===== Java Stack Traces

Java stack traces consist of multiple lines, with each line after the initial line beginning with whitespace, as in
this (illustrative) example:

[source,java]
----
Exception in thread "main" java.lang.NullPointerException
        at com.example.myproject.Book.getTitle(Book.java:16)
        at com.example.myproject.Author.getBookTitles(Author.java:25)
        at com.example.myproject.Bootstrap.main(Bootstrap.java:14)
----
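
To combine these lines into a single event, a multiline codec along these lines can be used (a minimal sketch that
reads from stdin; adapt the input to your log source):

[source,ruby]
----
input {
  stdin {
    codec => multiline {
      # Lines beginning with whitespace belong to the previous line.
      pattern => "^\s"
      what => "previous"
    }
  }
}
----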

This configuration merges any line that begins with whitespace up to the previous line.

===== Line Continuations

Several programming languages use the `\` character at the end of a line to denote that the line continues, as in this
(illustrative) example:

[source,c]
----
printf ("%10.10ld  \t %10.10ld \t %s\
  %f", w, x, y, z );
----
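
To join the continuation onto its line, the codec can match the trailing backslash and attach the matching line to the
line that follows it (again a minimal stdin sketch):

[source,ruby]
----
input {
  stdin {
    codec => multiline {
      # A trailing backslash means the line continues on the next line.
      pattern => "\\$"
      what => "next"
    }
  }
}
----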

This configuration merges any line that ends with the `\` character with the following line.

===== Timestamps

Activity logs from services such as Elasticsearch typically begin with a timestamp, followed by information on the
specific activity, as in this example:

[source,shell]
----
[2015-08-24 11:49:14,389][INFO ][env                      ] [Letha] using [1] data paths, mounts [[/
(/dev/disk1)]], net usable_space [34.5gb], net total_space [118.9gb], types [hfs]
----

To consolidate these lines into a single event in Logstash, use a configuration like the following for the multiline
codec (the input path and exact timestamp pattern are illustrative):

[source,ruby]
----
input {
  file {
    # Match your own log location and timestamp format. Lines that do NOT
    # start with a timestamp (negate) are appended to the previous line.
    path => "/var/log/elasticsearch/example.log"
    codec => multiline {
      pattern => "^\[%{TIMESTAMP_ISO8601}\]"
      negate => true
      what => "previous"
    }
  }
}
----

This configuration uses the `negate` option to specify that any line that does not begin with a timestamp belongs to
the previous line.

The Logstash <<working-with-plugins,plugin manager>> was introduced in the 1.5 release. This section discusses setting
up local repositories of plugins for use on systems without access to the Internet.

The procedures in this section require a staging machine running Logstash that has access to a public or private
Rubygems server. This staging machine downloads and packages the files used for offline installation.

See the <<private-rubygem,Private Gem Repositories>> section for information on setting up your own private
Rubygems server.

Users who can work with a larger Logstash artifact size can use the *Logstash (All Plugins)* download link from the
Elastic downloads page, which bundles all available plugins. You can distribute this bundle to all nodes without
further downloads.

[float]
=== Building the Offline Package

Working with offline plugins requires you to create an _offline package_, which is a compressed file that contains all
of the plugins your offline Logstash installation requires, along with the dependencies for those plugins.

. Create the offline package with the `bin/plugin pack` subcommand.
+
When you run the `bin/plugin pack` subcommand, Logstash creates a compressed bundle that contains all of the currently
installed plugins and the dependencies for those plugins. By default, the compressed bundle is a GZipped TAR file when
you run the `bin/plugin pack` subcommand on a UNIX machine. By default, when you run the `bin/plugin pack` subcommand
on a Windows machine, the compressed bundle is a ZIP file. See <<managing-packs,Managing Plugin Packs>> for details on
changing these default behaviors.
+
NOTE: Downloading all dependencies for the specified plugins may take some time, depending on the plugins listed.
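
For example, a minimal invocation on the staging machine packs everything currently installed (a sketch; the name of
the resulting bundle depends on your platform defaults):

[source,shell]
----------------------------------
bin/plugin pack
----------------------------------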

[float]
=== Install or Update a local plugin

To install or update a local plugin, use the `--local` option with the install and update commands, as in the
following examples (the plugin name below is illustrative):

.Installing a local plugin
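[source,shell]
----------------------------------
bin/plugin install --local logstash-output-kafka
----------------------------------

.Updating a local plugin
[source,shell]
----------------------------------
bin/plugin update --local logstash-output-kafka
----------------------------------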

[[working-with-plugins]]
== Working with plugins

Logstash has a rich collection of input, filter, codec and output plugins. Plugins are available as self-contained
packages called gems and hosted on RubyGems.org. The plugin manager, accessed via the `bin/plugin` script, is used to
manage the lifecycle of plugins in your Logstash deployment. You can install, uninstall and upgrade plugins using the
Command Line Interface (CLI) invocations described below.

NOTE: Some sections here are for advanced users.

[float]
[[listing-plugins]]
=== Listing plugins

Logstash release packages bundle common plugins so you can use them out of the box. To list the plugins currently
available in your deployment:

[source,shell]
----------------------------------
bin/plugin list                  # lists all installed plugins
bin/plugin list --group output   # lists installed plugins for a particular group
----------------------------------

[[installing-plugins]]
=== Adding plugins to your deployment

The most common situation when dealing with plugin installation is when you have access to the internet. Using this
method, you will be able to retrieve plugins hosted on the public repository (RubyGems.org) and install them on top of
your Logstash installation, as in this example (the plugin name is illustrative):

[source,shell]
----------------------------------
bin/plugin install logstash-output-kafka
----------------------------------

Once the plugin is successfully installed, you can start using it in your config.

[float]
==== Advanced: Adding a locally built plugin

In some cases, you want to install plugins which have not yet been released and are not hosted on RubyGems.org.
Logstash provides the option to install a locally built plugin which is packaged as a ruby gem, using a file location:

[source,shell]
----------------------------------
bin/plugin install /path/to/logstash-output-kafka-1.0.0.gem
----------------------------------

[float]
==== Advanced: Using `--pluginpath`

Using the `--pluginpath` flag, you can load plugin source code located on your file system. Typically this is used by
developers who are iterating on a custom plugin and want to test it before creating a ruby gem:

[source,shell]
----------------------------------
bin/logstash --pluginpath /opt/shared/lib/logstash/input/my-custom-plugin-code.rb
----------------------------------

[float]
=== Updating plugins

Plugins have their own release cycle and are often released independently of Logstash’s core release cycle. Using the
update subcommand you can get the latest version of a plugin or update to a particular version (a sketch; the plugin
name is illustrative):

[source,shell]
----------------------------------
# Update all installed plugins:
bin/plugin update
# Update only the named plugin:
bin/plugin update logstash-output-kafka
----------------------------------

[float]
=== Removing plugins

To remove a plugin you no longer need, uninstall it by name:

[source,shell]
----------------------------------
bin/plugin uninstall logstash-output-kafka
----------------------------------

[float]
=== Proxy Support

The previous sections relied on Logstash being able to communicate with RubyGems.org. In certain environments, a
forwarding proxy is used to handle HTTP requests. Logstash plugins can be installed and updated through a proxy by
setting the `HTTP_PROXY` environment variable, as in this example (the proxy host and port are placeholders):

[source,shell]
----------------------------------
export HTTP_PROXY=http://127.0.0.1:3128
bin/plugin install logstash-output-kafka
----------------------------------

[[private-rubygem]]
=== Private Gem Repositories

The Logstash plugin manager connects to a Ruby gems repository to install and update Logstash plugins. By default, this
repository is http://rubygems.org.

Some use cases are unable to use the default repository, as in the following examples:

* A firewall blocks access to the default repository.
* You are developing your own plugins locally.
* The local system has airgap requirements.

When you use a custom gem repository, be sure to make plugin dependencies available.

Several open source projects enable you to run your own plugin server, among them:

* https://github.com/geminabox/geminabox[Geminabox]
* https://github.com/PierreRambaud/gemirro[Gemirro]
* https://gemfury.com/[Gemfury]
* http://www.jfrog.com/open-source/[Artifactory]

==== Editing the Gemfile

The Gemfile is a configuration file that specifies information required for plugin management. Each Gemfile has a
`source` line that specifies a location for plugin content.

By default, the Gemfile's `source` line reads:

[source,shell]
----------
# This is a Logstash generated Gemfile.
# If you modify this file manually all comments and formatting will be lost.

source "https://rubygems.org"
----------

To change the source, edit the `source` line to contain your preferred source, as in the following example:

[source,shell]
----------
# This is a Logstash generated Gemfile.
# If you modify this file manually all comments and formatting will be lost.

source "https://my.private.repository"
----------

After saving the new version of the Gemfile, use <<working-with-plugins,plugin management commands>> normally.

The following links contain further material on setting up some commonly used repositories:

* https://github.com/geminabox/geminabox/blob/master/README.markdown[Geminabox]
* https://www.jfrog.com/confluence/display/RTF/RubyGems+Repositories[Artifactory]
* Running a http://guides.rubygems.org/run-your-own-gem-server/[rubygems mirror]

[float]
== General

* {lsissue}2376[Issue 2376]: Added the ability to install and upgrade Logstash plugins without requiring internet
connectivity.
* {lsissue}3576[Issue 3576]: Support alternate or private Ruby gems server to install and update plugins.
* {lsissue}3451[Issue 3451]: Added the ability to reliably shut down Logstash when there is a stall in event
processing. This option can be enabled by passing the `--allow-unsafe-shutdown` flag while starting Logstash. Please be
aware that any in-flight events will be lost when shutdown happens.
* {lsissue}4222[Issue 4222]: Fixed a memory leak which could be triggered when events having a date were serialized to
string.
* Added JDBC input to the default package.
* {lsissue}3243[Issue 3243]: Adding `--debug` to `--configtest` now shows the configuration in blocks annotated by
source config file. Very useful when using multiple config files in a directory.
* {lsissue}4130[Issue 4130]: Reset default worker threads to 1 when using non-thread-safe filters like multiline.
* Fixed file permissions for the `logrotate` configuration file.

[float]
=== Twitter
* https://github.com/logstash-plugins/logstash-input-twitter/issues/21[Issue 21]: Added an option to fetch data from
the sample Twitter streaming endpoint.
* https://github.com/logstash-plugins/logstash-input-twitter/issues/22[Issue 22]: Added hashtags, symbols and
user_mentions as data for the non-extended tweet event.
* https://github.com/logstash-plugins/logstash-input-twitter/issues/20[Issue 20]: Added an option to filter per
location and language.
* https://github.com/logstash-plugins/logstash-input-twitter/issues/11[Issue 11]: Added an option to stream data from a
list of users.

[float]
=== Beats
* https://github.com/logstash-plugins/logstash-input-beats/issues/10[Issue 10]: Properly handle multiline events from
multiple sources, originating from Filebeat.

[float]
=== File
* https://github.com/logstash-plugins/logstash-input-file/issues/44[Issue 44]: Properly handle multiline events from
multiple sources.

[float]
=== Eventlog
* https://github.com/logstash-plugins/logstash-input-eventlog/issues/11[Issue 11]: Changed the underlying library to
capture Event Logs from Windows more reliably.

[float]
== Output

=== Elasticsearch
* Improved the default template to use doc_values wherever possible.
* Improved the default template to disable fielddata on analyzed string fields.
* https://github.com/logstash-plugins/logstash-output-elasticsearch/issues/260[Issue 260]: Added a new setting,
timeout. This lets you control the behavior of a slow/stuck request to Elasticsearch that could be caused by, for
example, network, firewall, or load balancer issues.

We also have repositories available for APT and YUM based distributions. Note
that we only provide binary packages, but no source packages, as the packages
are created as part of the Logstash build.

We have split the Logstash package repositories by version into separate URLs
to avoid accidental upgrades across major or minor versions. For all 1.5.x
releases use 1.5 as the version number, for 1.4.x use 1.4, etc.

We use the PGP key

You can adjust the number of workers by passing a command line flag such as:

[source,shell]
bin/logstash -w 1

input {
  tcp {
    type => "apache"
    port => 3333
  }
}

filter {
  if [type] == "apache" {
    grok {
      # See the following URL for a complete list of named patterns
      # logstash/grok ships with by default:
      # https://github.com/logstash/logstash/tree/master/patterns
      #
      # The grok filter will use the below pattern and on successful match use
      # any captured values as new fields in the event.
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }

    date {
      # Try to pull the timestamp from the 'timestamp' field (parsed above with
      # grok). The apache time format looks like: "18/Aug/2011:05:44:34 -0700"
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}

output {
  elasticsearch {
    # Setting 'embedded' will run a real elasticsearch server inside logstash.
    # This option below saves you from having to run a separate process just
    # for ElasticSearch, so you can get started quicker!
    embedded => true
  }
}

input {
  tcp {
    type => "apache"
    port => 3333
  }
}

filter {
  if [type] == "apache" {
    grok {
      # See the following URL for a complete list of named patterns
      # logstash/grok ships with by default:
      # https://github.com/logstash/logstash/tree/master/patterns
      #
      # The grok filter will use the below pattern and on successful match use
      # any captured values as new fields in the event.
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }

    date {
      # Try to pull the timestamp from the 'timestamp' field (parsed above with
      # grok). The apache time format looks like: "18/Aug/2011:05:44:34 -0700"
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}

output {
  # Use stdout in debug mode again to see what logstash makes of the event.
  stdout {
    codec => rubydebug
  }
}

129.92.249.70 - - [18/Aug/2011:06:00:14 -0700] "GET /style2.css HTTP/1.1" 200 1820 "http://www.semicomplete.com/blog/geekery/bypassing-captive-portals.html" "Mozilla/5.0 (iPad; U; CPU OS 4_3_5 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5"

input {
  stdin {
    # A type is a label applied to an event. It is used later with filters
    # to restrict what filters are run against each event.
    type => "human"
  }
}

output {
  # Print each event to stdout.
  stdout {
    # Enabling 'rubydebug' codec on the stdout output will make logstash
    # pretty-print the entire event as something similar to a JSON representation.
    codec => rubydebug
  }

  # You can have multiple outputs. All events generally go to all outputs.
  # Output events to elasticsearch
  elasticsearch {
    # Setting 'embedded' will run a real elasticsearch server inside logstash.
    # This option below saves you from having to run a separate process just
    # for ElasticSearch, so you can get started quicker!
    embedded => true
  }
}

input {
  stdin {
    # A type is a label applied to an event. It is used later with filters
    # to restrict what filters are run against each event.
    type => "human"
  }
}

output {
  # Print each event to stdout.
  stdout {
    # Enabling 'rubydebug' codec on the stdout output will make logstash
    # pretty-print the entire event as something similar to a JSON representation.
    codec => rubydebug
  }
}

---
title: Logstash 10-Minute Tutorial
layout: content_right
---
# Logstash 10-minute Tutorial

## Step 1 - Download

### Download logstash:

* [logstash-%VERSION%.tar.gz](https://download.elasticsearch.org/logstash/logstash/logstash-%VERSION%.tar.gz)

    curl -O https://download.elasticsearch.org/logstash/logstash/logstash-%VERSION%.tar.gz

### Unpack it

    tar -xzf logstash-%VERSION%.tar.gz
    cd logstash-%VERSION%

### Requirements:

* Java

### The Secret:

Logstash is written in JRuby, but I release standalone jar files for easy
deployment, so you don't need to download JRuby or most any other dependencies.

I bake as much as possible into the single release file.

## Step 2 - A hello world.

### Download this config file:

* [hello.conf](hello.conf)

### Run it:

    bin/logstash agent -f hello.conf

Type stuff on standard input. Press enter. Watch what event Logstash sees.
Press ^C to kill it.

## Step 3 - Add ElasticSearch

### Download this config file:

* [hello-search.conf](hello-search.conf)

### Run it:

    bin/logstash agent -f hello-search.conf

Same config as step 2, but now we are also writing events to ElasticSearch. Do
a search for `*` (all):

    curl 'http://localhost:9200/_search?pretty=1&q=*'

## Step 5 - Parse apache logs

### Download

* [apache-parse.conf](apache-parse.conf)
* [apache_log.1](apache_log.1) (a single apache log line)

### Run it

    bin/logstash agent -f apache-parse.conf

Logstash will now be listening on TCP port 3333. Send an Apache log message at it:

    nc localhost 3333 < apache_log.1

The expected output can be viewed here: [step-5-output.txt](step-5-output.txt)

## Step 6 - real world example + search

Same as the previous step, but we'll output to ElasticSearch now.

### Download

* [apache-elasticsearch.conf](apache-elasticsearch.conf)
* [apache_log.2.bz2](apache_log.2.bz2) (2 days of apache logs)

### Run it

    bin/logstash agent -f apache-elasticsearch.conf

Logstash should be all set for you now. Start feeding it logs:

    bzip2 -d apache_log.2.bz2

    nc localhost 3333 < apache_log.2

## Want more?

For further learning, try these:

* [Watch a presentation on logstash](http://www.youtube.com/embed/RuUFnog29M4)
* [Getting started 'standalone' guide](http://logstash.net/docs/%VERSION%/tutorials/getting-started-simple)
* [Getting started 'centralized' guide](http://logstash.net/docs/%VERSION%/tutorials/getting-started-centralized) -
learn how to build out your logstash infrastructure and centralize your logs.
* [Dive into the docs](http://logstash.net/docs/%VERSION%/)

{
           "type" => "apache",
       "clientip" => "129.92.249.70",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "18/Aug/2011:06:00:14 -0700",
           "verb" => "GET",
        "request" => "/style2.css",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "1820",
       "referrer" => "http://www.semicomplete.com/blog/geekery/bypassing-captive-portals.html",
          "agent" => "\"Mozilla/5.0 (iPad; U; CPU OS 4_3_5 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5\"",
     "@timestamp" => "2011-08-18T13:00:14.000Z",
           "host" => "127.0.0.1",
        "message" => "129.92.249.70 - - [18/Aug/2011:06:00:14 -0700] \"GET /style2.css HTTP/1.1\" 200 1820 \"http://www.semicomplete.com/blog/geekery/bypassing-captive-portals.html\" \"Mozilla/5.0 (iPad; U; CPU OS 4_3_5 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5\"\n"
}

= Getting Started with Logstash

== Introduction
Logstash is a tool for receiving, processing and outputting logs. All kinds of logs. System logs, webserver logs, error logs, application logs and just about anything you can throw at it. Sounds great, eh?

Using Elasticsearch as a backend datastore, and Kibana as a frontend reporting tool, Logstash acts as the workhorse, creating a powerful pipeline for storing, querying and analyzing your logs. With an arsenal of built-in inputs, filters, codecs and outputs, you can harness some powerful functionality with a small amount of effort. So, let's get started!

=== Prerequisite: Java
The only prerequisite required by Logstash is a Java runtime. You can check that you have it installed by running the command `java -version` in your shell. Here's something similar to what you might see:
----
> java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
----
It is recommended to run a recent version of Java in order to ensure the greatest success in running Logstash.

It's fine to run an open-source version such as OpenJDK: +
http://openjdk.java.net/

Or you can use the official Oracle version: +
http://www.oracle.com/technetwork/java/index.html

Once you have verified the existence of Java on your system, we can move on!

== Up and Running!

=== Logstash in two commands
First, we're going to download the 'logstash' binary and run it with a very simple configuration.
----
curl -O https://download.elasticsearch.org/logstash/logstash/logstash-%VERSION%.tar.gz
----
Now you should have the file named 'logstash-%VERSION%.tar.gz' on your local filesystem. Let's unpack it:
----
tar zxvf logstash-%VERSION%.tar.gz
cd logstash-%VERSION%
----
Here, we are telling the *tar* command that we are sending it a gzipped file (*z* flag), that we would like to extract the file (*x* flag), that we would like to do so verbosely (*v* flag), and that we will provide a filename for *tar* (*f* flag).

Now let's run it:
----
bin/logstash -e 'input { stdin { } } output { stdout {} }'
----

Now type something into your command prompt, and you will see it output by Logstash:
----
hello world
2013-11-21T01:22:14.405+0000 0.0.0.0 hello world
----

OK, that's interesting... We ran Logstash with an input called "stdin", and an output named "stdout", and Logstash basically echoed back whatever we typed in some sort of structured format. Note that specifying the *-e* command line flag allows Logstash to accept a configuration directly from the command line. This is especially useful for quickly testing configurations without having to edit a file between iterations.

Let's try a slightly fancier example. First, you should exit Logstash by issuing a 'CTRL-D' command (or 'CTRL-C Enter') in the shell in which it is running. Now run Logstash again with the following command:
----
bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'
----

And then try another test input, typing the text "goodnight moon":
----
goodnight moon
{
  "message" => "goodnight moon",
  "@timestamp" => "2013-11-20T23:48:05.335Z",
  "@version" => "1",
  "host" => "my-laptop"
}
----

So, by re-configuring the "stdout" output (adding a "codec"), we can change the output of Logstash. By adding inputs, outputs and filters to your configuration, it's possible to massage the log data in many ways, in order to maximize flexibility of the stored data when you are querying it.

== Storing logs with Elasticsearch
Now, you're probably saying, "that's all fine and dandy, but typing all my logs into Logstash isn't really an option, and merely seeing them spit to STDOUT isn't very useful." Good point. First, let's set up Elasticsearch to store the messages we send into Logstash. If you don't have Elasticsearch already installed, you can http://www.elasticsearch.org/download/[download the RPM or DEB package], or install manually by downloading the current release tarball, by issuing the following four commands:
----
curl -O https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-%ELASTICSEARCH_VERSION%.tar.gz
tar zxvf elasticsearch-%ELASTICSEARCH_VERSION%.tar.gz
cd elasticsearch-%ELASTICSEARCH_VERSION%/
./bin/elasticsearch
----

NOTE: This tutorial specifies running Logstash %VERSION% with Elasticsearch %ELASTICSEARCH_VERSION%. Each release of Logstash has a *recommended* version of Elasticsearch to pair with. Make sure the versions match based on the http://www.elasticsearch.org/overview/logstash[Logstash version] you're running!

More detailed information on installing and configuring Elasticsearch can be found on http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index.html[The Elasticsearch reference pages]. However, for the purposes of Getting Started with Logstash, the default installation and configuration of Elasticsearch should be sufficient.

Now that we have Elasticsearch running on port 9200 (we do, right?), Logstash can be simply configured to use Elasticsearch as its backend. The defaults for both Logstash and Elasticsearch are fairly sane and well thought out, so we can omit the optional configurations within the elasticsearch output:
----
bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } }'
----

Type something, and Logstash will process it as before (this time you won't see any output, since we don't have the stdout output configured):
----
you know, for logs
----

You can confirm that ES actually received the data by making a curl request and inspecting the return:
----
curl 'http://localhost:9200/_search?pretty'
----

which should return something like this:
----
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2013.11.21",
      "_type" : "logs",
      "_id" : "2ijaoKqARqGvbMgP3BspJA",
      "_score" : 1.0, "_source" : {"message":"you know, for logs","@timestamp":"2013-11-21T18:45:09.862Z","@version":"1","host":"my-laptop"}
    } ]
  }
}
----

Congratulations! You've successfully stashed logs in Elasticsearch via Logstash.

=== Elasticsearch Plugins (an aside)
Another very useful tool for querying your Logstash data (and Elasticsearch in general) is the Elasticsearch-kopf plugin. Here is more information on http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-plugins.html[Elasticsearch plugins]. To install elasticsearch-kopf, simply issue the following command in your Elasticsearch directory (the same one in which you ran Elasticsearch earlier):
----
bin/plugin -install lmenezes/elasticsearch-kopf
----
Now you can browse to http://localhost:9200/_plugin/kopf[http://localhost:9200/_plugin/kopf] to browse your Elasticsearch data, settings and mappings!

=== Multiple Outputs
As a quick exercise in configuring multiple Logstash outputs, let's invoke Logstash again, using both the 'stdout' as well as the 'elasticsearch' output:
----
bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } stdout { } }'
----
Typing a phrase will now echo back to your terminal, as well as save in Elasticsearch! (Feel free to verify this using curl, kibana or elasticsearch-kopf.)

=== Default - Daily Indices
You might notice that Logstash was smart enough to create a new index in Elasticsearch... The default index name is in the form of 'logstash-YYYY.MM.DD', which essentially creates one index per day. At midnight (GMT?), Logstash will automagically rotate the index to a fresh new one, with the new current day's timestamp. This allows you to keep windows of data, based on how far retroactively you'd like to query your log data. Of course, you can always archive (or re-index) your data to an alternate location, where you are able to query further into the past. If you'd like to simply delete old indices after a certain time period, you can use the https://github.com/elasticsearch/curator[Elasticsearch Curator tool].

== Moving On
Now you're ready for more advanced configurations. At this point, it makes sense for a quick discussion of some of the core features of Logstash, and how they interact with the Logstash engine.

=== The Life of an Event

Inputs, Outputs, Codecs and Filters are at the heart of the Logstash configuration. By creating a pipeline of event processing, Logstash is able to extract the relevant data from your logs and make it available to elasticsearch, in order to efficiently query your data. To get you thinking about the various options available in Logstash, let's discuss some of the more common configurations currently in use. For more details, read about http://logstash.net/docs/latest/life-of-an-event[the Logstash event pipeline].

==== Inputs
Inputs are the mechanism for passing log data to Logstash. Some of the more useful, commonly-used ones are:

* *file*: reads from a file on the filesystem, much like the UNIX command "tail -0F"
* *syslog*: listens on the well-known port 514 for syslog messages and parses according to RFC3164 format
* *redis*: reads from a redis server, using both redis channels and also redis lists. Redis is often used as a "broker" in a centralized Logstash installation, which queues Logstash events from remote Logstash "shippers".
* *lumberjack*: processes events sent in the lumberjack protocol. Now called https://github.com/elasticsearch/logstash-forwarder[logstash-forwarder].

==== Filters
Filters are used as intermediary processing devices in the Logstash chain. They are often combined with conditionals in order to perform a certain action on an event, if it matches particular criteria. Some useful filters:

* *grok*: parses arbitrary text and structures it. Grok is currently the best way in Logstash to parse unstructured log data into something structured and queryable. With 120 patterns shipped built-in to Logstash, it's more than likely you'll find one that meets your needs!
* *mutate*: The mutate filter allows you to do general mutations to fields. You can rename, remove, replace, and modify fields in your events.
* *drop*: drop an event completely, for example, 'debug' events.
* *clone*: make a copy of an event, possibly adding or removing fields.
* *geoip*: adds information about geographical location of IP addresses (and displays amazing charts in kibana)

==== Outputs
Outputs are the final phase of the Logstash pipeline. An event may pass through multiple outputs during processing, but once all outputs are complete, the event has finished its execution. Some commonly used outputs include:

* *elasticsearch*: If you're planning to save your data in an efficient, convenient and easily queryable format... Elasticsearch is the way to go. Period. Yes, we're biased :)
* *file*: writes event data to a file on disk.
* *graphite*: sends event data to graphite, a popular open source tool for storing and graphing metrics. http://graphite.wikidot.com/
* *statsd*: a service which "listens for statistics, like counters and timers, sent over UDP and sends aggregates to one or more pluggable backend services". If you're already using statsd, this could be useful for you!

==== Codecs
Codecs are basically stream filters which can operate as part of an input or an output. Codecs allow you to easily separate the transport of your messages from the serialization process. Popular codecs include 'json', 'msgpack' and 'plain' (text).

* *json*: encode / decode data in JSON format
* *multiline*: takes multiple-line text events and merges them into a single event, e.g. java exception and stacktrace messages

For the complete list of (current) configurations, visit the Logstash "plugin configuration" section of the http://www.elasticsearch.org/overview/logstash[Logstash documentation page].

== More fun with Logstash
=== Persistent Configuration files

Specifying configurations on the command line using '-e' is only so helpful, and more advanced setups will require more lengthy, long-lived configurations. First, let's create a simple configuration file, and invoke Logstash using it. Create a file named "logstash-simple.conf" and save it in the same directory as Logstash.

----
input { stdin { } }
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
----

Then, run this command:

----
bin/logstash -f logstash-simple.conf
----

Et voilà! Logstash will read in the configuration file you just created and run as in the example we saw earlier. Note that we used the '-f' flag to read in the file, rather than the '-e' flag to read the configuration from the command line. This is a very simple case, of course, so let's move on to some more complex examples.

=== Testing Your Configuration Files

After creating a new or complex configuration file, it can be helpful to quickly test that the file is formatted correctly. We can verify that our configuration file is formatted correctly by using the *--configtest* flag:

----
bin/logstash -f logstash-simple.conf --configtest
----

=== Filters
Filters are an in-line processing mechanism which provides the flexibility to slice and dice your data to fit your needs. Let's see one in action, namely the *grok filter*. Save the following configuration as 'logstash-filter.conf':

----
input { stdin { } }

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
----
Run Logstash with this configuration:

----
bin/logstash -f logstash-filter.conf
----

Now paste this line into the terminal (so it will be processed by the stdin input):
----
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
----
You should see something returned to STDOUT which looks like this:
----
{
  "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
  "@timestamp" => "2013-12-11T08:01:45.000Z",
  "@version" => "1",
  "host" => "cadenza",
  "clientip" => "127.0.0.1",
  "ident" => "-",
  "auth" => "-",
  "timestamp" => "11/Dec/2013:00:01:45 -0800",
  "verb" => "GET",
  "request" => "/xampp/status.php",
  "httpversion" => "1.1",
  "response" => "200",
  "bytes" => "3891",
  "referrer" => "\"http://cadenza/xampp/navi.php\"",
  "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\""
}
----

As you can see, Logstash (with help from the *grok* filter) was able to parse the log line (which happens to be in Apache "combined log" format) and break it up into many different discrete bits of information. This will be extremely useful later when we start querying and analyzing our log data... for example, we'll be able to run reports on HTTP response codes, IP addresses, referrers, etc. very easily. There are quite a few grok patterns included with Logstash out-of-the-box, so it's quite likely that if you're attempting to parse a fairly common log format, someone has already done the work for you. For more details, see the list of https://github.com/logstash/logstash/blob/master/patterns/grok-patterns[logstash grok patterns] on GitHub.

The other filter used in this example is the *date* filter. This filter parses out a timestamp and uses it as the timestamp for the event (regardless of when you're ingesting the log data). You'll notice that the @timestamp field in this example is set to December 11, 2013, even though Logstash is ingesting the event at some point afterwards. This is handy when backfilling logs, for example... the ability to tell Logstash "use this value as the timestamp for this event". For non-English installations, you may need to specify the locale in the date filter (`locale => en`).
== Useful Examples
|
||||
|
||||
=== Apache logs (from files)
|
||||
Now, let's configure something actually *useful*... apache2 access log files! We are going to read the input from a file on the localhost, and use a *conditional* to process the event according to our needs. First, create a file called something like 'logstash-apache.conf' with the following contents (you'll need to change the log's file path to suit your needs):
|
||||
|
||||
----
|
||||
input {
|
||||
file {
|
||||
path => "/tmp/access_log"
|
||||
start_position => "beginning"
|
||||
}
|
||||
}
|
||||
|
||||
filter {
|
||||
if [path] =~ "access" {
|
||||
mutate { replace => { "type" => "apache_access" } }
|
||||
grok {
|
||||
match => { "message" => "%{COMBINEDAPACHELOG}" }
|
||||
}
|
||||
}
|
||||
date {
|
||||
match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
|
||||
}
|
||||
}
|
||||
|
||||
output {
|
||||
elasticsearch {
|
||||
host => localhost
|
||||
}
|
||||
stdout { codec => rubydebug }
|
||||
}
|
||||
|
||||
----

Then, create the file you configured above (in this example, "/tmp/access_log") with the following log lines as contents (or use some from your own webserver):

----
71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"
98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
----

Now run it with the -f flag as in the last example:

----
bin/logstash -f logstash-apache.conf
----

You should be able to see your apache log data in Elasticsearch now! You'll notice that Logstash opened the file you configured, and read through it, processing any events it encountered. Any additional lines logged to this file will also be captured, processed by Logstash as events and stored in Elasticsearch. As an added bonus, they will be stashed with the field "type" set to "apache_access" (this is done by the mutate filter that replaces the "type" field in the filter section of the configuration above).
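
Alternatively, you can tag events at the source: the file input accepts a type setting directly, as in this sketch:

----
input {
  file {
    path => "/tmp/access_log"
    # Tag events as they are read, instead of via a mutate filter later.
    type => "apache_access"
  }
}
----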

In this configuration, Logstash is only watching the apache access_log, but it's easy enough to watch both the access_log and the error_log (actually, any file matching '*log'), by changing one line in the above configuration, like this:

----
input {
  file {
    path => "/tmp/*_log"
...
----

Now, rerun Logstash, and you will see both the error and access logs processed via Logstash. However, if you inspect your data (using elasticsearch-kopf, perhaps), you will see that the access_log was broken up into discrete fields, but not the error_log. That's because we used a "grok" filter to match the standard combined apache log format and automatically split the data into separate fields. Wouldn't it be nice *if* we could control how a line was parsed, based on its format? Well, we can...

Also, you might have noticed that Logstash did not reprocess the events which were already seen in the access_log file. Logstash is able to save its position in files, only processing new lines as they are added to the file. Neat!
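
That position is recorded in a "sincedb" file (by default under $HOME). If you want to control where it lives, or delete it to force a full re-read, the file input lets you set the location; a sketch with a hypothetical path:

----
input {
  file {
    path => "/tmp/access_log"
    sincedb_path => "/tmp/access_log.sincedb" # hypothetical location
  }
}
----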

=== Conditionals

Now we can build on the previous example, where we introduced the concept of a *conditional*. Conditionals should feel familiar in the general sense: you may use 'if', 'else if' and 'else' statements, as in many other programming languages. Let's label each event according to which file it appeared in (access_log, error_log, and other random files which end with "log").

----
input {
  file {
    path => "/tmp/*_log"
  }
}

filter {
  if [path] =~ "access" {
    mutate { replace => { "type" => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  } else if [path] =~ "error" {
    mutate { replace => { "type" => "apache_error" } }
  } else {
    mutate { replace => { "type" => "random_logs" } }
  }
}

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
----

You'll notice we've labeled all events using the "type" field, but we didn't actually parse the "error" or "random" files... there are so many kinds of error logs that parsing them is better left as an exercise for you, depending on the logs you're seeing.
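
As a starting point only, here is a rough sketch of what parsing a stock Apache 2.2-style error_log line might look like; the pattern and field names are illustrative and will need adjusting for your own format:

----
filter {
  if [type] == "apache_error" {
    grok {
      # Matches lines like:
      # [Wed Oct 11 14:32:52 2000] [error] [client 127.0.0.1] client denied ...
      match => { "message" => "\[%{DAY:day} %{MONTH:month} %{MONTHDAY} %{TIME} %{YEAR}\] \[%{LOGLEVEL:loglevel}\] %{GREEDYDATA:errormessage}" }
    }
  }
}
----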

=== Syslog

OK, now we can move on to another incredibly useful example: *syslog*. Syslog is one of the most common use cases for Logstash, and one it handles exceedingly well (as long as the log lines conform roughly to RFC3164 :). Syslog is the de facto UNIX networked logging standard, sending messages from client machines to a local file, or to a centralized log server via rsyslog. For this example, you won't need a functioning syslog instance; we'll fake it from the command line, so you can get a feel for what happens.

First, let's make a simple configuration file for Logstash + syslog, called 'logstash-syslog.conf'.

----
input {
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
----

Run it as normal:

----
bin/logstash -f logstash-syslog.conf
----

Normally, a client machine would connect to the Logstash instance on port 5000 and send its message. In this simplified case, we're simply going to telnet to Logstash and enter a log line (similar to how we entered log lines into STDIN earlier). First, open another shell window to interact with the Logstash syslog input and type the following command:

----
telnet localhost 5000
----

You can copy and paste the following lines as samples (feel free to try some of your own, but keep in mind they might not parse if the grok filter is not correct for your data):

----
Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]
Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied
Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
Dec 22 18:28:06 louis rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="2253" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
----

Now you should see the output of Logstash in your original shell as it processes and parses messages!

----
{
                 "message" => "Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)",
              "@timestamp" => "2013-12-23T22:30:01.000Z",
                "@version" => "1",
                    "type" => "syslog",
                    "host" => "0:0:0:0:0:0:0:1:52617",
        "syslog_timestamp" => "Dec 23 14:30:01",
         "syslog_hostname" => "louis",
          "syslog_program" => "CRON",
              "syslog_pid" => "619",
          "syslog_message" => "(www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)",
             "received_at" => "2013-12-23 22:49:22 UTC",
           "received_from" => "0:0:0:0:0:0:0:1:52617",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
----

Congratulations! You're well on your way to being a real Logstash power user. You should be comfortable configuring, running and sending events to Logstash, but there's much more to explore.

@ -1,201 +0,0 @@

---
title: Just Enough RabbitMQ - logstash
layout: content_right
---

While configuring your RabbitMQ broker is out of scope for logstash, it's important
to understand how logstash uses RabbitMQ. To do that, we need to understand a
little about AMQP.

You should also consider reading
[this](http://www.rabbitmq.com/tutorials/amqp-concepts.html) at the RabbitMQ
website.

# Exchanges, queues and bindings; OH MY!

You can get a long way by understanding a few key terms.

## Exchanges

Exchanges are for message **producers**. In Logstash, we map these to
**outputs**. Logstash puts messages on exchanges. There are many types of
exchanges; they are discussed below.

## Queues

Queues are for message **consumers**. In Logstash, we map these to inputs.
Logstash reads messages from queues. Optionally, queues can consume only a
subset of messages. This is done with "routing keys".

## Bindings

Just having a producer and a consumer is not enough. We must `bind` a queue to
an exchange. When we bind a queue to an exchange, we can optionally provide a
routing key. Routing keys are discussed below.

## Broker

A broker is simply the AMQP server software. There are several brokers, but this
tutorial will cover the most common (and arguably the most popular), [RabbitMQ](http://www.rabbitmq.com).

# Routing Keys

Simply put, routing keys are somewhat like tags for messages. In practice, they
are hierarchical in nature, with each level separated by a dot:

- `messages.servers.production`
- `sports.atlanta.baseball`
- `company.myorg.mydepartment`

Routing keys are really handy with a tool like logstash, where you
can programmatically define the routing key for a given event using the metadata that logstash provides:

- `logs.servers.production.host1`
- `logs.servers.development.host1.syslog`
- `logs.servers.application_foo.critical`

From a consumer/queue perspective, routing keys also support two types of wildcards - `#` and `*`:

- `*` (asterisk) matches any single word.
- `#` (hash) matches any number of words and behaves like a traditional wildcard.

Using the above examples, if you wanted to bind to an exchange and see messages
for just production, you would use the routing key `logs.servers.production.*`.
If you wanted to see messages for host1, regardless of environment, you could
use `logs.servers.*.host1.#`.

Wildcards can be a bit confusing, but a good general rule to follow is to use
`*` in places where you need a wildcard for a known element. Use `#` when you
need to match any remaining placeholders. Note that wildcards in routing keys
only make sense on the consumer/queue binding, not on the publishing/exchange
side.
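
As a sketch of what such a binding looks like from the logstash side (the hostname, exchange and queue names here are placeholders):

    input {
      rabbitmq {
        host => "my_rabbitmq_server"        # placeholder hostname
        exchange => "logs_exchange"         # hypothetical exchange name
        queue => "production_logs"          # hypothetical queue name
        key => "logs.servers.production.*"  # bind with a wildcard routing key
      }
    }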

We'll get into some of that neat stuff below. For now, it's enough to
understand the general idea behind routing keys.

# Exchange types

There are three primary types of exchanges that you'll see.

## Direct

A direct exchange is the one that is probably most familiar to people. A message
comes in and, assuming there is a queue bound, the message is picked up. You
can have multiple queues bound to the same direct exchange. The best way to
understand this pattern is a pool of workers (queues) that read from a direct
exchange to get units of work. Only one consumer will see a given message in a
direct exchange.

You can set routing keys on messages published to a direct exchange. This
allows you to have workers that do different tasks read from the same global
pool of messages yet consume only the ones they know how to handle.

The RabbitMQ concepts guide (linked above) does a good job of describing this
visually
[here](http://www.rabbitmq.com/img/tutorials/intro/exchange-direct.png).
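
A sketch of the worker-pool pattern from the logstash side (all names are placeholders): because every agent consumes from the same named queue, each message is delivered to exactly one of them.

    # Run this same input on several agents to form a worker pool.
    input {
      rabbitmq {
        host => "my_rabbitmq_server"  # placeholder hostname
        exchange => "work_exchange"   # hypothetical direct exchange
        queue => "workers"            # shared queue name
        key => "units_of_work"        # routing key these workers handle
      }
    }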

## Fanout

Fanouts are another type of exchange. Unlike direct exchanges, every queue
bound to a fanout exchange will see the same messages. This is best described
as a PUB/SUB pattern. It is helpful when you need to broadcast messages to
multiple interested parties.

Fanout exchanges do NOT support routing keys. All bound queues see all
messages.

## Topic

Topic exchanges are a special type of fanout exchange. Unlike plain fanout
exchanges, topic exchanges do support routing keys. Just like a fanout
exchange, all bound queues see all messages, with the additional filter of the
routing key.

# RabbitMQ in logstash

As stated earlier, in Logstash, outputs publish to exchanges and inputs read from
queues that are bound to exchanges. Logstash uses the `bunny` RabbitMQ library for
interaction with a broker. Logstash endeavors to expose as much of the
configuration as possible for both exchanges and queues. There are many different tunables
that you might be concerned with setting - including things like message
durability or persistence of declared queues/exchanges. See the relevant input
and output documentation for RabbitMQ for a full list of tunables.

# Sample configurations, tips, tricks and gotchas

There are several examples of RabbitMQ usage in the logstash source directory;
however, a few general rules might help eliminate any issues.

## Check your bindings

If logstash is publishing the messages and logstash is consuming the messages,
the `exchange` value for the input should match the `name` in the output.

sender agent

    input { stdin { type => "test" } }
    output {
      rabbitmq {
        exchange => "test_exchange"
        host => "my_rabbitmq_server"
        exchange_type => "fanout"
      }
    }

receiver agent

    input {
      rabbitmq {
        queue => "test_queue"
        host => "my_rabbitmq_server"
        exchange => "test_exchange" # This matches the exchange declared above
      }
    }
    output { stdout { debug => true } }

## Message persistence

By default, logstash will attempt to ensure that you don't lose any messages.
This is reflected in the RabbitMQ default settings as well. However, there are
cases where you might not want this. A good example is where RabbitMQ is not your
primary method of shipping.

In the following example, we use RabbitMQ as a sniffing interface. Our primary
destination is the embedded ElasticSearch instance. We have a secondary RabbitMQ
output that we use for duplicating messages. However, we disable persistence and
durability on this interface so that messages don't pile up waiting for
delivery. We only use RabbitMQ when we want to watch messages in realtime.
Additionally, we're going to leverage routing keys so that we can optionally
filter incoming messages to subsets of hosts. The exercise of getting messages
to this logstash agent is left up to the user.

    input {
      # some input definition here
    }

    output {
      elasticsearch { embedded => true }
      rabbitmq {
        exchange => "logtail"
        host => "my_rabbitmq_server"
        exchange_type => "topic" # We use topic here to enable pub/sub with routing keys
        key => "logs.%{host}"
        durable => false     # If rabbitmq restarts, the exchange disappears.
        auto_delete => true  # If logstash disconnects, the exchange goes away
        persistent => false  # Messages are not persisted to disk
      }
    }

Now if you want to stream logs in realtime, you can use the programming
language of your choice to bind a queue to the `logtail` exchange. If you do
not specify a routing key, you will see every message that comes in to
logstash. However, you can specify a routing key like `logs.apache1` and see
only messages from host `apache1`.
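
Another logstash agent works for this as well; a minimal sketch of a consumer that only watches one host (the queue name is hypothetical):

    input {
      rabbitmq {
        host => "my_rabbitmq_server"  # placeholder hostname
        exchange => "logtail"         # matches the exchange declared above
        queue => "logtail_watcher"    # hypothetical queue name
        key => "logs.apache1"         # only see messages from host apache1
        durable => false              # mirror the sniffing exchange's settings
        auto_delete => true
      }
    }
    output { stdout { debug => true } }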

Note that any logstash variable is valid in the key definition. This allows you
to create really complex routing key hierarchies for advanced filtering.

Note that RabbitMQ has specific rules about durability and persistence matching
on both the queue and the exchange. You should read the RabbitMQ documentation to
make sure you don't crash your RabbitMQ server with messages awaiting someone
to pick them up.

@ -1,84 +0,0 @@

---
title: Metrics from Logs - logstash
layout: content_right
---
# Pull metrics from logs

Logs are more than just text. How many customers signed up today? How many HTTP
errors happened this week? When was your last puppet run?

Apache logs give you the http response code and bytes sent - that's useful in a
graph. Metrics occur in logs so frequently that there are piles of tools available to
help process them.

Logstash can help (and even replace some tools you might already be using).

## Example: Replacing Etsy's Logster

[Etsy](https://github.com/etsy) has some excellent open source tools. One of
them, [logster](https://github.com/etsy/logster), is meant to help you pull
metrics from logs and ship them to [graphite](http://graphite.wikidot.com/) so
you can make pretty graphs of those metrics.

One sample logster parser is one that pulls http response codes out of your
apache logs: [SampleLogster.py](https://github.com/etsy/logster/blob/master/logster/parsers/SampleLogster.py)

The above code is roughly 50 lines of python and only solves one specific
problem in only apache logs: count http response codes by major number (1xx,
2xx, 3xx, etc). To be completely fair, you could shrink the code required for
a Logster parser, but size is not strictly the point here.

## Keep it simple

Logstash can do more than the above, more simply, and without requiring much coding skill:

    input {
      file {
        path => "/var/log/apache/access.log"
        type => "apache-access"
      }
    }

    filter {
      grok {
        type => "apache-access"
        pattern => "%{COMBINEDAPACHELOG}"
      }
    }

    output {
      statsd {
        # Count one hit every event by response
        increment => "apache.response.%{response}"
      }
    }

The above uses grok to parse fields out of apache logs and uses the statsd
output to increment counters based on the response code. Of course, now that we
are parsing apache logs fully, we can trivially add additional metrics:

    output {
      statsd {
        # Count one hit every event by response
        increment => "apache.response.%{response}"

        # Use the 'bytes' field from the apache log as the count value.
        count => [ "apache.bytes", "%{bytes}" ]
      }
    }

Now adding additional metrics is just one more line in your logstash config
file. BTW, the 'statsd' output writes to another Etsy tool,
[statsd](https://github.com/etsy/statsd), which helps build counter/latency
data and ship it to graphite for graphing.

Using the logstash config above and a bunch of apache access requests, you might end up
with a graph that looks like this:



The point made above is not "logstash is better than Logster" - the point is
that logstash is a general-purpose log management and pipelining tool and that,
while you can centralize logs with logstash, you can read, modify, and write
them to and from just about anywhere.

@ -1,118 +0,0 @@

---
title: ZeroMQ - logstash
layout: content_right
---

*ZeroMQ support in Logstash is currently in an experimental phase. As such, parts of this document are subject to change.*

# ZeroMQ
Simply put, ZeroMQ (0mq) is a socket on steroids, which makes it a perfect complement to Logstash - a pipe on steroids.

ZeroMQ allows you to easily create sockets of various types for moving data around. These sockets are referred to in ZeroMQ by the behavior of each side of the socket pair:

* PUSH/PULL
* REQ/REP
* PUB/SUB
* ROUTER/DEALER

There is also a `PAIR` socket type.

Additionally, the socket type is independent of the connection method. A PUB/SUB socket pair could have the SUB side of the socket be a listener and the PUB side a connecting client. This makes it very easy to fit ZeroMQ into various firewalled architectures.

Note that this is not a full-fledged tutorial on ZeroMQ. It is a tutorial on how Logstash uses ZeroMQ.

# ZeroMQ and logstash
In the spirit of ZeroMQ, Logstash takes these socket type pairs and uses them to create topologies with some very simple rules that make usage easy to understand:

* The receiving end of a socket pair is always a logstash input
* The sending end of a socket pair is always a logstash output
* By default, inputs `bind`/listen and outputs `connect`
* Logstash refers to the socket pairs as topologies and mirrors the naming scheme from ZeroMQ
* By default, ZeroMQ inputs listen on all interfaces on port 2120, ZeroMQ outputs connect to `localhost` on port 2120

The currently understood Logstash topologies for ZeroMQ inputs and outputs are:

* `pushpull`
* `pubsub`
* `pair`

We have found from various discussions that these three topologies will cover most users' needs. We hope to expose the full span of ZeroMQ socket types as time goes on.

By keeping the options simple, you can get started VERY easily with what are normally complex message flows. No more confusion over `exchanges` and `queues` and `brokers`. If you need to add fanout capability to your flow, you can simply use the following configs:

* _node agent lives at 192.168.1.2_
* _indexer agent lives at 192.168.1.1_

    # Node agent config
    input { stdin { type => "test-stdin-input" } }
    output { zeromq { topology => "pubsub" address => "tcp://192.168.1.1:2120" } }

    # Indexer agent config
    input { zeromq { topology => "pubsub" } }
    output { stdout { debug => true }}

If for some reason you need connections to initiate from the indexer because of firewall rules:

    # Node agent config - now listening on all interfaces port 2120
    input { stdin { type => "test-stdin-input" } }
    output { zeromq { topology => "pubsub" address => "tcp://*:2120" mode => "server" } }

    # Indexer agent config
    input { zeromq { topology => "pubsub" address => "tcp://192.168.1.2:2120" mode => "client" } }
    output { stdout { debug => true }}

As stated above, by default `inputs` always start as listeners and `outputs` always start as initiators. Please don't confuse what happens once the socket is connected with the direction of the connection: ZeroMQ separates connection from topology. In the second of the above configs, once the two sockets are connected, regardless of who initiated the connection, the message flow itself is fixed: the indexer reads events from the node.

# Which topology to use
The choice of topology can be broken down very easily based on need.

## one to one
Use the `pair` topology. On the output side, specify the IP address and port of the input side.
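
A sketch of a `pair` pipe between two agents (the IP address is a placeholder):

    # Sending agent: connect to the receiving agent's input
    output { zeromq { topology => "pair" address => "tcp://192.168.1.1:2120" } }

    # Receiving agent: listens on port 2120 by default
    input { zeromq { topology => "pair" } }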

## broadcast
Use `pubsub`.

If you need to broadcast ALL messages to multiple hosts that each need to see all events, use `pubsub`. Note that all events are broadcast to all subscribers. When using `pubsub`, you might also want to investigate the `topic` configuration option, which allows subscribers to see only a subset of messages.

## Filter workers
Use `pushpull`.

In `pushpull`, ZeroMQ automatically load balances to all connected peers. This means that no peer sees the same message as any other peer.
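
A sketch of fanning work out to a pool of workers with `pushpull` (the address is a placeholder; each connected worker receives a disjoint share of events):

    # Shipper: bind once and let workers connect; ZeroMQ hands each
    # event to exactly one connected worker.
    output { zeromq { topology => "pushpull" address => "tcp://*:2120" mode => "server" } }

    # Each worker agent: connect to the shipper
    input { zeromq { topology => "pushpull" address => "tcp://192.168.1.1:2120" mode => "client" } }
    output { stdout { debug => true } }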

# What's with the address format?
ZeroMQ supports multiple types of transports:

* inproc:// (unsupported by logstash due to threading)
* tcp:// (exactly what it sounds like)
* ipc:// (probably useless in logstash)
* pgm:// and epgm:// (a multicast format - only usable with PUB and SUB socket types)

For pretty much all cases, you'll be using `tcp://` transports with Logstash.

## Topic - applies to `pubsub`
This option mimics the routing key functionality in AMQP. Imagine you have a network of receivers, but only a subset of the messages need to be seen by a subset of the hosts. You can use this option as a routing key to facilitate that:

    # This output is a PUB
    output {
      zeromq { topology => "pubsub" topic => "logs.production.%{host}" }
    }

    # This input is a SUB
    # I only care about db1 logs
    input { zeromq { type => "db1logs" address => "tcp://<ipaddress>:2120" topic => "logs.production.db1"}}

One important thing to note about 0mq PUBSUB and topics is that all filtering is done on the subscriber side. The subscriber will get ALL messages but discard any that don't match the topic.

Also important to note is that 0mq doesn't do topics in the same sense as an AMQP broker might. When a SUB socket gets a message, it compares the first bytes of the message against the topic. However, this isn't always flexible, depending on the format of your message. The common practice, then, is to send a 0mq multipart message and make the first part the topic. The next parts become the actual message body.

This approach is how logstash handles it. When using PUBSUB, Logstash will send a multipart message where the first part is the name of the topic and the second part is the event. This is important to know if you are sending to a SUB input from sources other than Logstash.

# sockopts
Sockopts is not you choosing between blue or black socks. ZeroMQ supports setting various flags or options on sockets. In the interest of minimizing configuration syntax, these are _hidden_ behind a logstash configuration element called `sockopts`. You probably won't need to tune these for most cases. If you do need to tune them, you'll probably set the following:

## ZMQ::HWM - sets the high water mark
The high water mark is the maximum number of messages a given socket pair can have in its internal queue. Essentially, you can use this to throttle the flow.
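
A sketch of what setting it might look like via the `sockopts` element described above; the exact syntax here is an assumption, so check the plugin documentation for your version:

    output {
      zeromq {
        topology => "pushpull"
        # Assumed syntax: cap the internal queue at 1000 messages.
        sockopts => { "ZMQ::HWM" => 1000 }
      }
    }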

## ZMQ::SWAP_SIZE

TODO

## ZMQ::IDENTITY

TODO