Cleanup docs directory
Remove old, unused markdown docs. Bring the directory structure in line with the logstash-docs repo.
.gitignore
@@ -26,3 +26,6 @@ spec/reports
rspec.xml
.install-done
.vendor
integration_run
.mvn/
@@ -1,322 +0,0 @@
---
title: Configuration Language - Logstash
layout: content_right
---
# Logstash Config Language

The Logstash config language aims to be simple.

There are 3 main sections: inputs, filters, outputs. Each section has
configurations for each plugin available in that section.

Example:

    # This is a comment. You should use comments to describe
    # parts of your configuration.
    input {
      ...
    }

    filter {
      ...
    }

    output {
      ...
    }

## Filters and Ordering

For a given event, filters are applied in the order of their appearance in the
configuration file.

## Comments

Comments are the same as in ruby, perl, and python. A comment starts with a '#' character.
Example:

    # this is a comment

    input { # comments can appear at the end of a line, too
      # ...
    }

## Plugins

The input, filter and output sections all let you configure plugins. Plugin
configuration consists of the plugin name followed by a block of settings for
that plugin. For example, how about two file inputs:

    input {
      file {
        path => "/var/log/messages"
        type => "syslog"
      }

      file {
        path => "/var/log/apache/access.log"
        type => "apache"
      }
    }

The above configures two separate file inputs. Both set two
configuration settings each: 'path' and 'type'. Each plugin has different
settings for configuring it; seek the documentation for your plugin to
learn what settings are available and what they mean. For example, the
[file input][fileinput] documentation will explain the meanings of the
path and type settings.

[fileinput]: inputs/file

## Value Types

The documentation for a plugin may enforce a configuration field having a
certain type. Examples include boolean, string, array, number, hash,
etc.

### <a name="boolean"></a>Boolean

A boolean must be either `true` or `false`. Note the lack of quotes around
`true` and `false`.

Example:

    debug => true

### <a name="string"></a>String

A string must be a single value.

Example:

    name => "Hello world"

Single, unquoted words are valid as strings, too, but you should use quotes.

### <a name="number"></a>Number

Numbers must be valid numerics (floating point or integer are OK).

Example:

    port => 33

### <a name="array"></a>Array

An array can hold a single string value or multiple values. If you specify the
same field multiple times, the values append to the array.

Examples:

    path => [ "/var/log/messages", "/var/log/*.log" ]
    path => "/data/mysql/mysql.log"

The above makes 'path' a 3-element array containing all 3 strings.

### <a name="hash"></a>Hash

A hash uses basically the same syntax as Ruby hashes.
The keys and values are simply pairs, such as:

    match => {
      "field1" => "value1"
      "field2" => "value2"
      ...
    }

## <a name="eventdependent"></a>Event Dependent Configuration

The logstash agent is a processing pipeline with 3 stages: inputs -> filters ->
outputs. Inputs generate events, filters modify them, outputs ship them
elsewhere.

All events have properties. For example, an apache access log would have things
like status code (200, 404), request path ("/", "index.html"), HTTP verb
(GET, POST), client IP address, etc. Logstash calls these properties "fields."

Some of the configuration options in Logstash require the existence of fields in
order to function. Because inputs generate events, there are no fields to
evaluate within the input block--they do not exist yet!

Because of their dependency on events and fields, the following configuration
options will only work within filter and output blocks.

**IMPORTANT: Field references, sprintf format and conditionals, described below,
will not work in an input block.**

### <a name="fieldreferences"></a>Field References

In many cases, it is useful to be able to refer to a field by name. To do this,
you can use the Logstash field reference syntax.

By way of example, let us suppose we have this event:

    {
      "agent": "Mozilla/5.0 (compatible; MSIE 9.0)",
      "ip": "192.168.24.44",
      "request": "/index.html",
      "response": {
        "status": 200,
        "bytes": 52353
      },
      "ua": {
        "os": "Windows 7"
      }
    }

- the syntax to access fields is `[fieldname]`.
- if you are only referring to a **top-level field**, you can omit the `[]` and
  simply say `fieldname`.
- in the case of **nested fields**, like the "os" field above, you need
  the full path to that field: `[ua][os]`.
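For illustration, a sketch of a filter setting that takes a field name (the
`mutate` filter's exact settings are described in its own documentation) could
refer to the nested field from the event above as `[ua][os]`:

    filter {
      mutate {
        rename => [ "[ua][os]", "os" ]
      }
    }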
### <a name="sprintf"></a>sprintf format

The field reference syntax is also used in what Logstash calls 'sprintf format'. This format
allows you to refer to field values from within other strings. For example, the
statsd output has an 'increment' setting, to allow you to keep a count of
apache logs by status code:

    output {
      statsd {
        increment => "apache.%{[response][status]}"
      }
    }

You can also do time formatting in this sprintf format. Instead of specifying a
field name, use the `+FORMAT` syntax where `FORMAT` is a
[time format](http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html).

For example, if you want to use the file output to write logs based on the
hour and the 'type' field:

    output {
      file {
        path => "/var/log/%{type}.%{+yyyy.MM.dd.HH}"
      }
    }

### <a name="conditionals"></a>Conditionals

Sometimes you only want a filter or output to process an event under
certain conditions. For that, you'll want to use a conditional!

Conditionals in Logstash look and act the same way they do in programming
languages. You have `if`, `else if` and `else` statements. Conditionals may be
nested if you need that.

The syntax is as follows:

    if EXPRESSION {
      ...
    } else if EXPRESSION {
      ...
    } else {
      ...
    }

What's an expression? Comparison tests, boolean logic, and so on!

The following comparison operators are supported:

* equality, etc: ==, !=, <, >, <=, >=
* regexp: =~, !~
* inclusion: in, not in

The following boolean operators are supported:

* and, or, nand, xor

The following unary operators are supported:

* !

Expressions may contain expressions. Expressions may be negated with `!`.
Expressions may be grouped with parentheses `(...)`. Expressions can be long
and complex.

For example, if we want to remove the field `secret` if the field
`action` has a value of `login`:

    filter {
      if [action] == "login" {
        mutate { remove => "secret" }
      }
    }

The above uses the field reference syntax to get the value of the
`action` field. It is compared against the text `login` and, if equal,
allows the mutate filter to delete the field named `secret`.

How about a more complex example?

* alert nagios of any apache events with status 5xx
* record any 4xx status to elasticsearch
* record all status code hits via statsd

How about telling nagios of any http event that has a status code of 5xx?

    output {
      if [type] == "apache" {
        if [status] =~ /^5\d\d/ {
          nagios { ... }
        } else if [status] =~ /^4\d\d/ {
          elasticsearch { ... }
        }

        statsd { increment => "apache.%{status}" }
      }
    }

You can also combine multiple expressions in a single condition:

    output {
      # Send production errors to pagerduty
      if [loglevel] == "ERROR" and [deployment] == "production" {
        pagerduty {
          ...
        }
      }
    }

You can test whether a field is present, regardless of its value:

    if [exception_message] {
      # If the event has an exception_message field, set the level
      mutate { add_field => { "level" => "ERROR" } }
    }

Here are some examples of testing with the in conditional:

    filter {
      if [foo] in [foobar] {
        mutate { add_tag => "field in field" }
      }
      if [foo] in "foo" {
        mutate { add_tag => "field in string" }
      }
      if "hello" in [greeting] {
        mutate { add_tag => "string in field" }
      }
      if [foo] in ["hello", "world", "foo"] {
        mutate { add_tag => "field in list" }
      }
      if [missing] in [alsomissing] {
        mutate { add_tag => "shouldnotexist" }
      }
      if !("foo" in ["hello", "world"]) {
        mutate { add_tag => "shouldexist" }
      }
    }

Or, to test if grok was successful:

    output {
      if "_grokparsefailure" not in [tags] {
        elasticsearch { ... }
      }
    }

## Further Reading

For more information, see [the plugin docs index](index).
@@ -1,59 +0,0 @@
---
title: Logstash Contrib plugins
layout: content_right
---

# contrib plugins

As logstash has grown, we've accumulated a massive repository of plugins. With
well over 100 plugins, it became difficult for the project maintainers to
support everything effectively.

In order to improve the quality of popular plugins, we've moved the
less-commonly-used plugins to a separate repository we're calling "contrib".
Concentrating common plugin usage into core solves a few problems, most notably
user complaints about the size of logstash releases, support/maintenance costs,
etc.

It is our intent that this separation will improve life for users. If it
doesn't, please file a bug so we can work to address it!

If a plugin is available in the 'contrib' package, the documentation for that
plugin will note this boldly at the top of that plugin's documentation.

Contrib plugins reside in a [separate github project](https://github.com/elasticsearch/logstash-contrib).

# Packaging

At present, the contrib modules are available as a tarball.

# Automated Installation

The `bin/plugin` script will handle the installation for you:

    cd /path/to/logstash
    bin/plugin install contrib

# Manual Installation

The contrib plugins can be extracted on top of an existing Logstash installation.

For example, if I've extracted `logstash-%VERSION%.tar.gz` into `/path`:

    cd /path
    tar zxf ~/logstash-%VERSION%.tar.gz

This leaves a `/path/logstash-%VERSION%` directory:

    $ ls
    logstash-%VERSION%

The method to install the contrib tarball is identical:

    cd /path
    wget http://download.elasticsearch.org/logstash/logstash/logstash-contrib-%VERSION%.tar.gz
    tar zxf ~/logstash-contrib-%VERSION%.tar.gz

This will install the contrib plugins in the same directory as the core
install. These plugins will be available to logstash the next time it starts.
docs/docgen.rb
@@ -1,250 +0,0 @@
require "rubygems"
require "erb"
require "optparse"
require "kramdown" # markdown parser

$: << Dir.pwd
$: << File.join(File.dirname(__FILE__), "..", "lib")

require "logstash/config/mixin"
require "logstash/inputs/base"
require "logstash/codecs/base"
require "logstash/filters/base"
require "logstash/outputs/base"
require "logstash/version"

class LogStashConfigDocGenerator
  COMMENT_RE = /^ *#(?: (.*)| *$)/

  def initialize
    @rules = {
      COMMENT_RE => lambda { |m| add_comment(m[1]) },
      /^ *class.*< *LogStash::(Outputs|Filters|Inputs|Codecs)::(Base|Threadable)/ => \
        lambda { |m| set_class_description },
      /^ *config +[^=].*/ => lambda { |m| add_config(m[0]) },
      /^ *milestone .*/ => lambda { |m| set_milestone(m[0]) },
      /^ *config_name .*/ => lambda { |m| set_config_name(m[0]) },
      /^ *flag[( ].*/ => lambda { |m| add_flag(m[0]) },
      /^ *(class|def|module) / => lambda { |m| clear_comments },
    }

    if File.exists?("build/contrib_plugins")
      @contrib_list = File.read("build/contrib_plugins").split("\n")
    else
      @contrib_list = []
    end
  end

  def parse(string)
    clear_comments
    buffer = ""
    string.split(/\r\n|\n/).each do |line|
      # Join long lines
      if line =~ COMMENT_RE
        # nothing
      else
        # Join extended lines
        if line =~ /(, *$)|(\\$)|(\[ *$)/
          buffer += line.gsub(/\\$/, "")
          next
        end
      end

      line = buffer + line
      buffer = ""

      @rules.each do |re, action|
        m = re.match(line)
        if m
          action.call(m)
        end
      end # @rules.each
    end # string.split("\n").each
  end # def parse

  def set_class_description
    @class_description = @comments.join("\n")
    clear_comments
  end # def set_class_description

  def add_comment(comment)
    return if comment == "encoding: utf-8"
    @comments << comment
  end # def add_comment

  def add_config(code)
    # I just care about the 'config :name' part
    code = code.sub(/,.*/, "")

    # call the code, which calls 'config' in this class.
    # This will let us align comments with config options.
    name, opts = eval(code)

    # TODO(sissel): This hack is only required until regexp configs
    # are gone from logstash.
    name = name.to_s unless name.is_a?(Regexp)

    description = Kramdown::Document.new(@comments.join("\n")).to_html
    @attributes[name][:description] = description
    clear_comments
  end # def add_config

  def add_flag(code)
    # call the code, which calls 'flag' in this class.
    # This will let us align comments with flag options.
    #p :code => code
    fixed_code = code.gsub(/ do .*/, "")
    #p :fixedcode => fixed_code
    name, description = eval(fixed_code)
    @flags[name] = description
    clear_comments
  end # def add_flag

  def set_config_name(code)
    name = eval(code)
    @name = name
  end # def set_config_name

  def set_milestone(code)
    @milestone = eval(code)
  end

  # pretend to be the config DSL and just get the name
  def config(name, opts={})
    return name, opts
  end # def config

  # Pretend to support the flag DSL
  def flag(*args, &block)
    name = args.first
    description = args.last
    return name, description
  end # def flag

  # pretend to be the config dsl's 'config_name' method
  def config_name(name)
    return name
  end # def config_name

  # pretend to be the config dsl's 'milestone' method
  def milestone(m)
    return m
  end # def milestone

  def clear_comments
    @comments.clear
  end # def clear_comments

  def generate(file, settings)
    @class_description = ""
    @milestone = ""
    @comments = []
    @attributes = Hash.new { |h,k| h[k] = {} }
    @flags = {}

    # local scoping for the monkeypatch below
    attributes = @attributes
    # Monkeypatch the 'config' method to capture
    # Note, this monkeypatch requires us to do the config processing
    # one at a time.
    #LogStash::Config::Mixin::DSL.instance_eval do
      #define_method(:config) do |name, opts={}|
        #p name => opts
        #attributes[name].merge!(opts)
      #end
    #end

    # Loading the file will trigger the config dsl which should
    # collect all the config settings.
    load file

    # parse base first
    parse(File.new(File.join(File.dirname(file), "base.rb"), "r").read)

    # Now parse the real library
    code = File.new(file).read

    # inputs either inherit from Base or Threadable.
    if code =~ /\< LogStash::Inputs::Threadable/
      parse(File.new(File.join(File.dirname(file), "threadable.rb"), "r").read)
    end

    if code =~ /include LogStash::PluginMixins/
      mixin = code.gsub(/.*include LogStash::PluginMixins::(\w+)\s.*/m, '\1')
      mixin.gsub!(/(.)([A-Z])/, '\1_\2')
      mixin.downcase!
      parse(File.new(File.join(File.dirname(file), "..", "plugin_mixins", "#{mixin}.rb")).read)
    end

    parse(code)

    puts "Generating docs for #{file}"

    if @name.nil?
      $stderr.puts "Missing 'config_name' setting in #{file}?"
      return nil
    end

    klass = LogStash::Config::Registry.registry[@name]
    if klass.ancestors.include?(LogStash::Inputs::Base)
      section = "input"
    elsif klass.ancestors.include?(LogStash::Filters::Base)
      section = "filter"
    elsif klass.ancestors.include?(LogStash::Outputs::Base)
      section = "output"
    elsif klass.ancestors.include?(LogStash::Codecs::Base)
      section = "codec"
    end

    template_file = File.join(File.dirname(__FILE__), "plugin-doc.html.erb")
    template = ERB.new(File.new(template_file).read, nil, "-")

    is_contrib_plugin = @contrib_list.include?(file)

    # descriptions are assumed to be markdown
    description = Kramdown::Document.new(@class_description).to_html

    klass.get_config.each do |name, settings|
      @attributes[name].merge!(settings)
    end
    sorted_attributes = @attributes.sort { |a,b| a.first.to_s <=> b.first.to_s }
    klassname = LogStash::Config::Registry.registry[@name].to_s
    name = @name

    synopsis_file = File.join(File.dirname(__FILE__), "plugin-synopsis.html.erb")
    synopsis = ERB.new(File.new(synopsis_file).read, nil, "-").result(binding)

    if settings[:output]
      dir = File.join(settings[:output], section + "s")
      path = File.join(dir, "#{name}.html")
      Dir.mkdir(settings[:output]) if !File.directory?(settings[:output])
      Dir.mkdir(dir) if !File.directory?(dir)
      File.open(path, "w") do |out|
        html = template.result(binding)
        html.gsub!("%VERSION%", LOGSTASH_VERSION)
        html.gsub!("%PLUGIN%", @name)
        out.puts(html)
      end
    else
      puts template.result(binding)
    end
  end # def generate

end # class LogStashConfigDocGenerator

if __FILE__ == $0
  opts = OptionParser.new
  settings = {}
  opts.on("-o DIR", "--output DIR",
          "Directory to output to; optional. If not specified, "\
          "we write to stdout.") do |val|
    settings[:output] = val
  end

  args = opts.parse(ARGV)

  args.each do |arg|
    gen = LogStashConfigDocGenerator.new
    gen.generate(arg, settings)
  end
end
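For context, the generator was invoked once per plugin file. A hypothetical invocation from the repository root (the output directory name is illustrative) would look like:

    # Render the docs for one plugin to build/docs/filters/grok.html
    ruby docs/docgen.rb -o build/docs lib/logstash/filters/grok.rb

    # Or print the rendered HTML for a plugin to stdout
    ruby docs/docgen.rb lib/logstash/inputs/file.rb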
@@ -1,108 +0,0 @@
---
title: How to extend - logstash
layout: content_right
---
# Add a new filter

This document shows you how to add a new filter to logstash.

For a general overview of how to add a new plugin, see [the extending
logstash](.) overview.

## Write code.

Let's write a 'hello world' filter. This filter will replace the 'message' in
the event with "Hello world!"

First, logstash expects plugins in a certain directory structure: `logstash/TYPE/PLUGIN_NAME.rb`

Since we're creating a filter, let's mkdir this:

    mkdir -p logstash/filters/
    cd logstash/filters

Now add the code:

    # Call this file 'foo.rb' (in logstash/filters, as above)
    require "logstash/filters/base"
    require "logstash/namespace"

    class LogStash::Filters::Foo < LogStash::Filters::Base

      # Setting the config_name here is required. This is how you
      # configure this filter from your logstash config.
      #
      # filter {
      #   foo { ... }
      # }
      config_name "foo"

      # New plugins should start life at milestone 1.
      milestone 1

      # Replace the message with this value.
      config :message, :validate => :string

      public
      def register
        # nothing to do
      end # def register

      public
      def filter(event)
        # return nothing unless there's an actual filter event
        return unless filter?(event)
        if @message
          # Replace the event message with our message as configured in the
          # config file.
          event["message"] = @message
        end
        # filter_matched should go in the last line of our successful code
        filter_matched(event)
      end # def filter
    end # class LogStash::Filters::Foo

## Add it to your configuration

For this simple example, let's just use stdin input and stdout output.
The config file looks like this:

    input {
      stdin { type => "foo" }
    }
    filter {
      if [type] == "foo" {
        foo {
          message => "Hello world!"
        }
      }
    }
    output {
      stdout { }
    }

Call this file 'example.conf'.

## Tell logstash about it.

Depending on how you installed logstash, you have a few ways of including this
plugin.

You can use the `--pluginpath` flag to specify where the root of your
plugin tree is. In our case, it's the current directory.

    % bin/logstash --pluginpath your/plugin/root -f example.conf

## Example running

In the example below, I typed in "the quick brown fox" after running the
command.

    % bin/logstash -f example.conf
    the quick brown fox
    2011-05-12T01:05:09.495000Z stdin://snack.home/: Hello world!

The output is the standard logstash stdout output, but in this case our "the
quick brown fox" message was replaced with "Hello world!"

All done! :)
@@ -1,91 +0,0 @@
---
title: How to extend - logstash
layout: content_right
---
# Extending logstash

You can add your own input, output, or filter plugins to logstash.

If you're looking to extend logstash today, please look at the existing plugins.

## Good examples of plugins

* [inputs/tcp](https://github.com/logstash/logstash/blob/master/lib/logstash/inputs/tcp.rb)
* [filters/multiline](https://github.com/logstash/logstash/blob/master/lib/logstash/filters/multiline.rb)
* [outputs/mongodb](https://github.com/logstash/logstash/blob/master/lib/logstash/outputs/mongodb.rb)

## Common concepts

* The `config_name` sets the name used in the config file.
* The `milestone` sets the milestone number of the plugin. See <../plugin-milestones> for more info.
* The `config` lines define config options.
* The `register` method is called per plugin instantiation. Do any of your initialization here.

### Required modules

All plugins should require the Logstash module.

    require 'logstash/namespace'

### Plugin name

Every plugin must have a name set with the `config_name` method. If this
is not specified, plugins will fail to load with an error.

### Milestones

Every plugin needs a milestone set using `milestone`. See
<../plugin-milestones> for more info.

### Config lines

The `config` lines define configuration options and are constructed like
so:

    config :host, :validate => :string, :default => "0.0.0.0"

The name of the option is specified, here `:host`, followed by the
attributes of the option. They can include `:validate`, `:default`,
`:required` (a Boolean `true` or `false`), `:deprecated` (also a
Boolean), and `:obsolete` (a String value).
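To illustrate (a sketch using the attributes listed above; the option names here
are invented for this example):

    # A required numeric option, and an optional one with a default
    config :port, :validate => :number, :required => true
    config :timeout, :validate => :number, :default => 30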
## Inputs

All inputs require the LogStash::Inputs::Base class:

    require 'logstash/inputs/base'

Inputs have two methods: `register` and `run`.

* Each input runs as its own thread.
* The `run` method is expected to run forever.
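A minimal sketch (assuming the convention, used by the bundled inputs of this
era, that `run` receives the pipeline queue and pushes events onto it; the
"heartbeat" name and `interval` option are invented for this example):

    require "logstash/inputs/base"
    require "logstash/namespace"

    class LogStash::Inputs::Heartbeat < LogStash::Inputs::Base
      config_name "heartbeat"
      milestone 1

      # Seconds to sleep between events.
      config :interval, :validate => :number, :default => 10

      public
      def register
        # nothing to set up
      end # def register

      public
      def run(queue)
        loop do
          # Emit one event per interval onto the pipeline queue.
          queue << LogStash::Event.new("message" => "ok")
          sleep(@interval)
        end
      end # def run
    end # class LogStash::Inputs::Heartbeat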
## Filters

All filters require the LogStash::Filters::Base class:

    require 'logstash/filters/base'

Filters have two methods: `register` and `filter`.

* The `filter` method gets an event.
* Call `event.cancel` to drop the event.
* To modify an event, simply make changes to the event you are given.
* The return value is ignored.
## Outputs

All outputs require the LogStash::Outputs::Base class:

    require 'logstash/outputs/base'

Outputs have two methods: `register` and `receive`.

* The `register` method is called per plugin instantiation. Do any of your initialization here.
* The `receive` method is called when an event gets pushed to your output.
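A minimal sketch following the same pattern (the "print" name is invented
here):

    require "logstash/outputs/base"
    require "logstash/namespace"

    class LogStash::Outputs::Print < LogStash::Outputs::Base
      config_name "print"
      milestone 1

      public
      def register
        # nothing to set up
      end # def register

      public
      def receive(event)
        # Write each event to stdout; a real output would ship it elsewhere.
        puts event.to_json
      end # def receive
    end # class LogStash::Outputs::Print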
## Example: a new filter

Learn by example how to [add a new filter to logstash](example-add-a-new-filter).
@@ -1,45 +0,0 @@
---
title: Command-line flags - logstash
layout: content_right
---
# Command-line flags

## Agent

The logstash agent has the following flags (also try using the '--help' flag).

<dl>
<dt> -f, --config CONFIGFILE </dt>
<dd> Load the logstash config from a specific file, directory, or a
wildcard. If given a directory or wildcard, config files will be read
from the directory in alphabetical order. </dd>
<dt> -e CONFIGSTRING </dt>
<dd> Use the given string as the configuration data. Same syntax as the
config file. If no input is specified, 'stdin { type => stdin }' is the
default. If no output is specified, 'stdout { debug => true }' is the
default. </dd>
<dt> -w, --filterworkers COUNT </dt>
<dd> Run COUNT filter workers (default: 1) </dd>
<dt> -l, --log FILE </dt>
<dd> Log to the given path. Default is to log to stdout. </dd>
<dt> --verbose </dt>
<dd> Increase verbosity to the first level, less verbose. </dd>
<dt> --debug </dt>
<dd> Increase verbosity to the last level, more verbose. </dd>
<dt> -v </dt>
<dd> *DEPRECATED: see --verbose/--debug* Increase verbosity. There are multiple levels of verbosity available, with
'-vv' currently being the highest. </dd>
<dt> --pluginpath PLUGIN_PATH </dt>
<dd> A colon-delimited path in which to find other logstash plugins. </dd>
</dl>
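Combining a few of these flags, a typical invocation (the file paths are
illustrative) might be:

    bin/logstash agent -f /etc/logstash/logstash.conf -l /var/log/logstash.log -w 2 --verbose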
## Web

<dl>
<dt> -a, --address ADDRESS </dt>
<dd>Address on which to start webserver. Default is 0.0.0.0.</dd>
<dt> -p, --port PORT</dt>
<dd>Port on which to start webserver. Default is 9292.</dd>
</dl>
@@ -1,28 +0,0 @@
#!/usr/bin/env ruby

require "erb"

if ARGV.size != 1
  $stderr.puts "No path given to search for plugin docs"
  $stderr.puts "Usage: #{$0} plugin_doc_dir"
  exit 1
end

def plugins(glob)
  files = Dir.glob(glob)
  names = files.collect { |f| File.basename(f).gsub(".html", "") }
  return names.sort
end # def plugins

basedir = ARGV[0]
docs = {
  "inputs" => plugins(File.join(basedir, "inputs/*.html")),
  "codecs" => plugins(File.join(basedir, "codecs/*.html")),
  "filters" => plugins(File.join(basedir, "filters/*.html")),
  "outputs" => plugins(File.join(basedir, "outputs/*.html")),
}

template_path = File.join(File.dirname(__FILE__), "index.html.erb")
template = File.new(template_path).read
erb = ERB.new(template, nil, "-")
puts erb.result(binding)
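This script's filename is not shown in the diff; assuming it were saved as
`docs/generate_index.rb`, it would be run against the directory the doc
generator wrote to, redirecting stdout to the index page:

    ruby docs/generate_index.rb build/docs > build/docs/index.html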
@@ -1,46 +0,0 @@
---
title: Learn - logstash
layout: content_right
---
# What is Logstash?

Logstash is a tool for managing your logs.

It helps you take logs and other event data from your systems and move them into
a central place. Logstash is open source and completely free. You can find
support on the discussion forum and on IRC.

For an overview of Logstash and why you would use it, you should watch the
presentation I gave at CarolinaCon 2011:
[video here](http://carolinacon.blip.tv/file/5105901/). This presentation covers
Logstash, how you can use it, some alternatives, logging best practices,
parsing tools, etc. The video is also below:

<!--
<embed src="http://blip.tv/play/gvE9grjcdQI" type="application/x-shockwave-flash" width="480" height="296" allowscriptaccess="always" allowfullscreen="true"></embed>

The slides are available online here: [slides](http://goo.gl/68c62). The slides
include speaker notes (click 'actions' then 'speaker notes').
-->
<iframe width="480" height="296" src="http://www.youtube.com/embed/RuUFnog29M4" frameborder="0" allowfullscreen="allowfullscreen"></iframe>

The slides are available online here: [slides](http://semicomplete.com/presentations/logstash-puppetconf-2012/).

## Getting Help

There's [documentation](.) here on this site. If that isn't sufficient, you can
use the discussion [forum](https://discuss.elastic.co/c/logstash). Further, there is also
an IRC channel - #logstash on irc.freenode.org.

If you find a bug or have a feature request, file them
on [github](https://github.com/elasticsearch/logstash/issues). (Honestly though, if you prefer email or irc
for such things, that works for me, too.)

## Download It

[Download logstash-%VERSION%](https://download.elastic.co/logstash/logstash/logstash-%VERSION%.tar.gz)

## What's next?

Try this [guide](tutorials/getting-started-with-logstash) for a simple
real-world example of getting started with Logstash.
@@ -1,109 +0,0 @@
---
title: the life of an event - logstash
layout: content_right
---
# the life of an event

The logstash agent is an event pipeline.

## The Pipeline

The logstash agent is a processing pipeline with 3 stages: inputs -> filters ->
outputs. Inputs generate events, filters modify them, outputs ship them
elsewhere.

Internal to logstash, events are passed between stages using internal queues,
implemented with a 'SizedQueue' in Ruby. A SizedQueue allows a bounded
maximum of items in the queue, such that any writes to the queue will block if
the queue is full at maximum capacity.

Logstash sets each queue size to 20. This means only 20 events can be pending
for the next stage - this helps reduce data loss and in general keeps
logstash from trying to act as a data storage system. These internal queues are
not for storing messages long-term.

## Fault Tolerance

Starting at outputs, here's what happens when things break.

An output can fail or have problems because of some downstream cause, such as a
full disk, permissions problems, temporary network failures, or service
outages. Most outputs should keep retrying to ship any events that were
involved in the failure.

If an output is failing, the output thread will wait until this output is
healthy again and able to successfully send the message. Therefore, the output
queue will stop being read from by this output and will eventually fill up with
events and block new events from being written to this queue.

A full output queue means filters will block trying to write to the output
queue. Because filters will be stuck, blocked writing to the output queue, they
will stop reading from the filter queue, which will eventually cause the filter
queue (input -> filter) to fill up.

A full filter queue will cause inputs to block when writing to the filters.
This will cause each input to block, causing each input to stop processing new
data from wherever that input is getting new events.

In ideal circumstances, this behaves similarly to when the tcp window
closes to 0: no new data is sent because the receiver hasn't finished
processing the current queue of data, but as soon as the downstream (output)
problem is resolved, messages will begin flowing again.

## Thread Model

The thread model in logstash is currently:

    input threads | filter worker threads | output worker

Filters are optional, so you will have this model if you have no filters
defined:

    input threads | output worker

Each input runs in a thread by itself. This allows busier inputs to not be
blocked by slower ones, etc. It also allows for easier containment of scope
because each input has a thread.

The filter thread model is a 'worker' model where each worker receives an event
and applies all filters, in order, before emitting it to the output queue.
This allows scalability across CPUs because many filters are CPU intensive
(provided that we have thread safety).

The default number of filter workers is 1, but you can increase this number
with the '-w' flag on the agent.
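For example, to run four filter workers (the config file name is illustrative):

    bin/logstash agent -f logstash.conf -w 4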
The output worker model is currently a single thread. Outputs will receive
events in the order they are defined in the config file.

Outputs may decide to buffer events temporarily before publishing them,
possibly in a separate thread. One example of this is the elasticsearch output,
which will buffer events and flush them all at once, in a separate thread. This
mechanism (buffering many events + writing in a separate thread) can improve
performance so the logstash pipeline isn't stalled waiting for a response from
elasticsearch.

## Consequences and Expectations

Small queue sizes mean that logstash simply blocks and stalls safely during
times of load or other temporary pipeline problems. There are two alternatives
to this - unlimited queue length and dropping messages. Unlimited queues grow
unbounded and eventually exceed memory, causing a crash which loses all of
those messages. Dropping messages is also an undesirable behavior in most cases.

At a minimum, logstash will have probably 3 threads (2 if you have no filters):
one input, one filter worker, and one output thread.

If you see logstash using multiple CPUs, this is likely why. If you want to
know more about what each thread is doing, you should read this:
<http://www.semicomplete.com/blog/geekery/debugging-java-performance.html>.

Threads in java have names, and you can use jstack and top to figure out who is
using what resources. The URL above will help you learn how to do this.

On Linux platforms, logstash will label all the threads it can with something
descriptive. Inputs will show up as "<inputname", filter workers as
"|worker", and outputs as ">outputworker" (or something similar). Other threads
may be labeled as well, and are intended to help you identify their purpose
should you wonder why they are consuming resources!
@@ -1,60 +0,0 @@
---
title: Logging tools comparisons - logstash
layout: content_right
---
# Logging tools comparison

The information below is provided as "best effort" and is not strictly intended
as a complete source of truth. If the information below is unclear or incorrect, please
email the logstash-users list (or send a pull request with the fix) :)

Where feasible, this document will also provide information on how you can use
logstash with these other projects.

# logstash

Primary goal: Make log/event data and analytics accessible.

Overview: Where your logs come from, how you store them, or what you do with
them is up to you. Logstash exists to help make such actions easier and faster.

It provides you a simple event pipeline for taking events and logs from any
input, manipulating them with filters, and sending them to any output. Inputs
can be files, network, message brokers, etc. Filters are date and string
parsers, grep-like, etc. Outputs are data stores (elasticsearch, mongodb, etc),
message systems (rabbitmq, stomp, etc), network (tcp, syslog), etc.

It also provides a web interface for doing search and analytics on your
logs.

# graylog2

[http://graylog2.org/](http://graylog2.org/)

_Overview to be written_

You can use graylog2 with logstash by using the 'gelf' output to send logstash
events to a graylog2 server. This gives you logstash's excellent input and
filter features while still being able to use the graylog2 web interface.

# whoops

[whoops site](http://www.whoopsapp.com/)

_Overview to be written_

A logstash output to whoops is coming soon - <https://logstash.jira.com/browse/LOGSTASH-133>

# flume

[flume site](https://github.com/cloudera/flume/wiki)

Flume is primarily a transport system aimed at reliably copying logs from
application servers to HDFS.

You can use it with logstash by having a syslog sink configured to shoot logs
at a logstash syslog input.

# scribe

_Overview to be written_
@@ -1,41 +0,0 @@
---
title: Plugin Milestones - logstash
layout: content_right
---
# Plugin Milestones

Plugins (inputs/outputs/filters/codecs) have a milestone label in logstash.
This is to provide an indicator to the end-user as to the kinds of changes
a given plugin could have between logstash releases.

The desire here is to allow plugin developers to quickly iterate on possible
new plugins while conveying to the end-user a set of expectations about that
plugin.

## Milestone 1

Plugins at this milestone need your feedback to improve! Plugins at this
milestone may change between releases as the community figures out the best way
for the plugin to behave and be configured.

## Milestone 2

Plugins at this milestone are more likely to have backwards-compatibility with
previous releases than Milestone 1 plugins. This milestone also indicates
a greater level of in-the-wild usage by the community than the previous
milestone.

## Milestone 3

Plugins at this milestone have strong promises towards backwards-compatibility.
This is enforced with automated tests to ensure behavior and configuration are
consistent across releases.

## Milestone 0

This milestone appears at the bottom of the page because it is very
infrequently used.

This milestone marker is used to generally indicate that a plugin has no
active code maintainer nor does it have support from the community in terms
of getting help.
@@ -1,64 +0,0 @@
---
title: release notes for %VERSION%
layout: content_right
---

# %VERSION% - Release Notes

This document is targeted at existing users of Logstash who are upgrading from
an older version to version %VERSION%. It is intended to supplement
the [changelog
file](https://github.com/elasticsearch/logstash/blob/v%VERSION%/CHANGELOG) by
providing more details on certain changes.

### tarball

With Logstash 1.4.0, we stopped shipping the jar file and started shipping a
tarball instead.

Past releases have been a single jar file which included all Ruby and Java
library dependencies to eliminate deployment pains. We still ship all
the dependencies for you! The jar file served us well, but over time we found
Java's default heap size, garbage collector, and other settings weren't well
suited to Logstash.

In order to provide better Java defaults, we've changed to releasing a tarball
(.tar.gz) that includes all the same dependencies. What does this mean to you?
Instead of running `java -jar logstash.jar ...` you run `bin/logstash ...` (for
Windows users, `bin/logstash.bat`).

One pleasant side effect of using a tarball is that the Logstash code itself is
much more accessible and able to satisfy any curiosity you may have.

The new way to do things is:

* Download the logstash tarball
* Unpack it (`tar -zxf logstash-%VERSION%.tar.gz`)
* `cd logstash-%VERSION%`
* Run it: `bin/logstash ...`

The old way of running logstash, `java -jar logstash.jar`, is now replaced with
`bin/logstash`. The command line arguments are exactly the same after that.
For example:

    # Old way:
    % java -jar logstash-1.3.3-flatjar.jar agent -f logstash.conf

    # New way:
    % bin/logstash agent -f logstash.conf

### plugins

Logstash has grown brilliantly over the past few years with great contributions
from the community. Now having 165 plugins, it became hard for us (the Logstash
engineering team) to reliably support all the wonderful technologies in each
contributed plugin. We combed through all the plugins and picked the ones we
felt strongly we could support, and those now ship by default with Logstash.

All the other plugins are now available in a contrib package. All plugins
continue to be open source and free, of course! Installing plugins is very easy:

    % cd /path/to/logstash-%VERSION%/
    % bin/plugin install [PLUGIN_NAME]
@@ -1,35 +0,0 @@
---
title: repositories - logstash
layout: content_right
---
# Logstash repositories

We also have Logstash available as APT and YUM repositories.

Our public signing key can be found on the [Elasticsearch packages apt GPG signing key page](https://packages.elasticsearch.org/GPG-KEY-elasticsearch).

## Apt based distributions

Add the key:

    wget -O - https://packages.elasticsearch.org/GPG-KEY-elasticsearch | apt-key add -

Add the repo to /etc/apt/sources.list:

    deb http://packages.elasticsearch.org/logstash/1.4/debian stable main
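After adding the repository, refresh the package index and install as usual (a
standard apt step, included here for completeness):

    apt-get update && apt-get install logstash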
## YUM based distributions

Add the key:

    rpm --import https://packages.elasticsearch.org/GPG-KEY-elasticsearch

Add the repo to the /etc/yum.repos.d/ directory:

    [logstash-1.4]
    name=logstash repository for 1.4.x packages
    baseurl=https://packages.elasticsearch.org/logstash/1.4/centos
    gpgcheck=1
    gpgkey=https://packages.elasticsearch.org/GPG-KEY-elasticsearch
    enabled=1
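With the repo file in place (e.g. saved as /etc/yum.repos.d/logstash.repo -
the filename is illustrative), install with:

    yum install logstash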
@@ -1,15 +1,27 @@
[[working-with-plugins]]
== Working with plugins

Logstash has a rich collection of input, filter, codec and output plugins. Plugins are available as self-contained
packages called gems and hosted on RubyGems.org. The plugin manager, accessed via the `bin/plugin` script, is used to
manage the lifecycle of plugins in your Logstash deployment. You can install, uninstall and upgrade plugins using the
Command Line Interface (CLI) commands described below.

NOTE: Some sections here are for advanced users.

[float]
[[listing-plugins]]
=== Listing plugins

Logstash release packages bundle common plugins so you can use them out of the box. To list the plugins currently
available in your deployment:

[source,shell]
----------------------------------

@@ -30,7 +42,13 @@ bin/plugin list --group output <4>
[[installing-plugins]]
=== Adding plugins to your deployment

The most common situation when dealing with plugin installation is when you have access to the internet. Using this
method, you can retrieve plugins hosted on the public repository (RubyGems.org) and install them on top of your Logstash
installation.

[source,shell]
----------------------------------

@@ -43,7 +61,12 @@ Once the plugin is successfully installed, you can start using it in your config
[float]
==== Advanced: Adding a locally built plugin

In some cases, you want to install plugins which have not yet been released and are not hosted on RubyGems.org. Logstash
gives you the option to install a locally built plugin which is packaged as a ruby gem. Using a file location:

[source,shell]
----------------------------------

@@ -54,7 +77,12 @@ bin/plugin install /path/to/logstash-output-kafka-1.0.0.gem
[float]
==== Advanced: Using `--pluginpath`

Using the `--pluginpath` flag, you can load plugin source code located on your file system. Typically this is used by
developers who are iterating on a custom plugin and want to test it before creating a ruby gem.

[source,shell]
----------------------------------

@@ -65,7 +93,12 @@ bin/logstash --pluginpath /opt/shared/lib/logstash/input/my-custom-plugin-code.r
[float]
=== Updating plugins

Plugins have their own release cycle and are often released independently of Logstash's core release cycle. Using the
update subcommand you can get the latest version of, or update to a particular version of, a plugin.

[source,shell]
----------------------------------

@@ -91,7 +124,13 @@ bin/plugin uninstall logstash-output-kafka
[float]
=== Proxy Support

The previous sections relied on Logstash being able to communicate with RubyGems.org. In certain environments, a
forwarding proxy is used to handle HTTP requests. Logstash plugins can be installed and updated through a proxy by
setting the `HTTP_PROXY` environment variable:

[source,shell]
----------------------------------
docs/static/private-gem-repo.asciidoc
@@ -0,0 +1,53 @@
[[private-rubygem]]
=== Private Gem Repositories

The Logstash plugin manager connects to a Ruby gems repository to install and update Logstash plugins. By default, this
repository is https://rubygems.org.

Some use cases are unable to use the default repository, as in the following examples:

* A firewall blocks access to the default repository.
* You are developing your own plugins locally.
* The local system has air-gap requirements.

When you use a custom gem repository, be sure to make plugin dependencies available.

Several open source projects enable you to run your own plugin server, among them:

* https://github.com/geminabox/geminabox[Geminabox]
* https://github.com/PierreRambaud/gemirro[Gemirro]
* https://gemfury.com/[Gemfury]
* http://www.jfrog.com/open-source/[Artifactory]

==== Editing the Gemfile

The Gemfile is a configuration file that specifies information required for plugin management. Each Gemfile has a
`source` line that specifies a location for plugin content.

By default, the Gemfile's `source` line reads:

[source,shell]
----------
# This is a Logstash generated Gemfile.
# If you modify this file manually all comments and formatting will be lost.

source "https://rubygems.org"
----------

To change the source, edit the `source` line to contain your preferred source, as in the following example:

[source,shell]
----------
# This is a Logstash generated Gemfile.
# If you modify this file manually all comments and formatting will be lost.

source "https://my.private.repository"
----------

After saving the new version of the Gemfile, use <<working-with-plugins,plugin management commands>> normally.

The following links contain further material on setting up some commonly used repositories:

* https://github.com/geminabox/geminabox/blob/master/README.markdown[Geminabox]
* https://www.jfrog.com/confluence/display/RTF/RubyGems+Repositories[Artifactory]
* Running a http://guides.rubygems.org/run-your-own-gem-server/[rubygems mirror]
@@ -77,4 +77,3 @@ of workers by passing a command line flag such as:

[source,shell]
bin/logstash -w 1
@@ -1,35 +0,0 @@
input {
  tcp {
    type => "apache"
    port => 3333
  }
}

filter {
  if [type] == "apache" {
    grok {
      # See the following URL for a complete list of named patterns
      # logstash/grok ships with by default:
      # https://github.com/logstash/logstash/tree/master/patterns
      #
      # The grok filter will use the below pattern and on successful match use
      # any captured values as new fields in the event.
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }

    date {
      # Try to pull the timestamp from the 'timestamp' field (parsed above with
      # grok). The apache time format looks like: "18/Aug/2011:05:44:34 -0700"
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}

output {
  elasticsearch {
    # Setting 'embedded' will run a real elasticsearch server inside logstash.
    # This option below saves you from having to run a separate process just
    # for ElasticSearch, so you can get started quicker!
    embedded => true
  }
}
@@ -1,33 +0,0 @@
input {
  tcp {
    type => "apache"
    port => 3333
  }
}

filter {
  if [type] == "apache" {
    grok {
      # See the following URL for a complete list of named patterns
      # logstash/grok ships with by default:
      # https://github.com/logstash/logstash/tree/master/patterns
      #
      # The grok filter will use the below pattern and on successful match use
      # any captured values as new fields in the event.
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }

    date {
      # Try to pull the timestamp from the 'timestamp' field (parsed above with
      # grok). The apache time format looks like: "18/Aug/2011:05:44:34 -0700"
      match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  }
}

output {
  # Use stdout in debug mode again to see what logstash makes of the event.
  stdout {
    codec => rubydebug
  }
}
@ -1 +0,0 @@
129.92.249.70 - - [18/Aug/2011:06:00:14 -0700] "GET /style2.css HTTP/1.1" 200 1820 "http://www.semicomplete.com/blog/geekery/bypassing-captive-portals.html" "Mozilla/5.0 (iPad; U; CPU OS 4_3_5 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5"

@ -1,25 +0,0 @@
input {
  stdin {
    # A type is a label applied to an event. It is used later with filters
    # to restrict what filters are run against each event.
    type => "human"
  }
}

output {
  # Print each event to stdout.
  stdout {
    # Enabling 'rubydebug' codec on the stdout output will make logstash
    # pretty-print the entire event as something similar to a JSON representation.
    codec => rubydebug
  }

  # You can have multiple outputs. All events generally go to all outputs.
  # Output events to elasticsearch
  elasticsearch {
    # Setting 'embedded' will run a real elasticsearch server inside logstash.
    # This option saves you from having to run a separate process just
    # for Elasticsearch, so you can get started quicker!
    embedded => true
  }
}

@ -1,16 +0,0 @@
input {
  stdin {
    # A type is a label applied to an event. It is used later with filters
    # to restrict what filters are run against each event.
    type => "human"
  }
}

output {
  # Print each event to stdout.
  stdout {
    # Enabling 'rubydebug' codec on the stdout output will make logstash
    # pretty-print the entire event as something similar to a JSON representation.
    codec => rubydebug
  }
}

@ -1,101 +0,0 @@
---
title: Logstash 10-Minute Tutorial
layout: content_right
---
# Logstash 10-minute Tutorial

## Step 1 - Download

### Download logstash:

* [logstash-%VERSION%.tar.gz](https://download.elasticsearch.org/logstash/logstash/logstash-%VERSION%.tar.gz)

    curl -O https://download.elasticsearch.org/logstash/logstash/logstash-%VERSION%.tar.gz

### Unpack it

    tar -xzf logstash-%VERSION%.tar.gz
    cd logstash-%VERSION%

### Requirements:

* Java

### The Secret:

Logstash is written in JRuby, but I release standalone jar files for easy
deployment, so you don't need to download JRuby or any other dependencies.

I bake as much as possible into the single release file.

## Step 2 - A hello world.

### Download this config file:

* [hello.conf](hello.conf)

### Run it:

    bin/logstash agent -f hello.conf

Type stuff on standard input. Press enter. Watch what event Logstash sees.
Press ^C to kill it.

## Step 3 - Add Elasticsearch

### Download this config file:

* [hello-search.conf](hello-search.conf)

### Run it:

    bin/logstash agent -f hello-search.conf

Same config as step 2, but now we are also writing events to Elasticsearch. Do
a search for `*` (all):

    curl 'http://localhost:9200/_search?pretty=1&q=*'

## Step 5 - Parse Apache logs

### Download

* [apache-parse.conf](apache-parse.conf)
* [apache_log.1](apache_log.1) (a single apache log line)

### Run it

    bin/logstash agent -f apache-parse.conf

Logstash will now be listening on TCP port 3333. Send an Apache log message at it:

    nc localhost 3333 < apache_log.1

The expected output can be viewed here: [step-5-output.txt](step-5-output.txt)

## Step 6 - real world example + search

Same as the previous step, but we'll output to Elasticsearch now.

### Download

* [apache-elasticsearch.conf](apache-elasticsearch.conf)
* [apache_log.2.bz2](apache_log.2.bz2) (2 days of apache logs)

### Run it

    bin/logstash agent -f apache-elasticsearch.conf

Logstash should be all set for you now. Start feeding it logs:

    bzip2 -d apache_log.2.bz2

    nc localhost 3333 < apache_log.2

## Want more?

For further learning, try these:

* [Watch a presentation on logstash](http://www.youtube.com/embed/RuUFnog29M4)
* [Getting started 'standalone' guide](http://logstash.net/docs/%VERSION%/tutorials/getting-started-simple)
* [Getting started 'centralized' guide](http://logstash.net/docs/%VERSION%/tutorials/getting-started-centralized) -
  learn how to build out your logstash infrastructure and centralize your logs.
* [Dive into the docs](http://logstash.net/docs/%VERSION%/)

@ -1,17 +0,0 @@
{
           "type" => "apache",
       "clientip" => "129.92.249.70",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "18/Aug/2011:06:00:14 -0700",
           "verb" => "GET",
        "request" => "/style2.css",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "1820",
       "referrer" => "http://www.semicomplete.com/blog/geekery/bypassing-captive-portals.html",
          "agent" => "\"Mozilla/5.0 (iPad; U; CPU OS 4_3_5 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5\"",
     "@timestamp" => "2011-08-18T13:00:14.000Z",
           "host" => "127.0.0.1",
        "message" => "129.92.249.70 - - [18/Aug/2011:06:00:14 -0700] \"GET /style2.css HTTP/1.1\" 200 1820 \"http://www.semicomplete.com/blog/geekery/bypassing-captive-portals.html\" \"Mozilla/5.0 (iPad; U; CPU OS 4_3_5 like Mac OS X; en-us) AppleWebKit/533.17.9 (KHTML, like Gecko) Version/5.0.2 Mobile/8L1 Safari/6533.18.5\"\n"
}

@ -1,436 +0,0 @@
= Getting Started with Logstash

== Introduction
Logstash is a tool for receiving, processing and outputting logs. All kinds of logs. System logs, webserver logs, error logs, application logs and just about anything you can throw at it. Sounds great, eh?

Using Elasticsearch as a backend datastore, and kibana as a frontend reporting tool, Logstash acts as the workhorse, creating a powerful pipeline for storing, querying and analyzing your logs. With an arsenal of built-in inputs, filters, codecs and outputs, you can harness some powerful functionality with a small amount of effort. So, let's get started!

=== Prerequisite: Java
The only prerequisite required by Logstash is a Java runtime. You can check that you have it installed by running the command `java -version` in your shell. Here's something similar to what you might see:
----
> java -version
java version "1.7.0_45"
Java(TM) SE Runtime Environment (build 1.7.0_45-b18)
Java HotSpot(TM) 64-Bit Server VM (build 24.45-b08, mixed mode)
----
We recommend running a recent version of Java in order to ensure the greatest success in running Logstash.

It's fine to run an open-source version such as OpenJDK: +
http://openjdk.java.net/

Or you can use the official Oracle version: +
http://www.oracle.com/technetwork/java/index.html

Once you have verified the existence of Java on your system, we can move on!

== Up and Running!

=== Logstash in two commands
First, we're going to download the 'logstash' tarball and run it with a very simple configuration.
----
curl -O https://download.elasticsearch.org/logstash/logstash/logstash-%VERSION%.tar.gz
----
Now you should have the file named 'logstash-%VERSION%.tar.gz' on your local filesystem. Let's unpack it:
----
tar zxvf logstash-%VERSION%.tar.gz
cd logstash-%VERSION%
----
Here, we are telling the *tar* command that we are sending it a gzipped file (*z* flag), that we would like to extract the file (*x* flag), that we would like to do so verbosely (*v* flag), and that we will provide a filename for *tar* (*f* flag).

Now let's run it:
----
bin/logstash -e 'input { stdin { } } output { stdout {} }'
----

Now type something into your command prompt, and you will see it output by Logstash:
----
hello world
2013-11-21T01:22:14.405+0000 0.0.0.0 hello world
----

OK, that's interesting... We ran Logstash with an input called "stdin", and an output named "stdout", and Logstash basically echoed back whatever we typed in some sort of structured format. Note that specifying the *-e* command line flag allows Logstash to accept a configuration directly from the command line. This is especially useful for quickly testing configurations without having to edit a file between iterations.

Let's try a slightly fancier example. First, you should exit Logstash by issuing a 'CTRL-D' command (or 'CTRL-C Enter') in the shell in which it is running. Now run Logstash again with the following command:
----
bin/logstash -e 'input { stdin { } } output { stdout { codec => rubydebug } }'
----

And then try another test input, typing the text "goodnight moon":
----
goodnight moon
{
       "message" => "goodnight moon",
    "@timestamp" => "2013-11-20T23:48:05.335Z",
      "@version" => "1",
          "host" => "my-laptop"
}
----

So, by re-configuring the "stdout" output (adding a "codec"), we can change the output of Logstash. By adding inputs, outputs and filters to your configuration, it's possible to massage the log data in many ways, in order to maximize the flexibility of the stored data when you are querying it.

== Storing logs with Elasticsearch
Now, you're probably saying, "that's all fine and dandy, but typing all my logs into Logstash isn't really an option, and merely seeing them spit to STDOUT isn't very useful." Good point. First, let's set up Elasticsearch to store the messages we send into Logstash. If you don't have Elasticsearch already installed, you can http://www.elasticsearch.org/download/[download the RPM or DEB package], or install manually by downloading the current release tarball, by issuing the following four commands:
----
curl -O https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-%ELASTICSEARCH_VERSION%.tar.gz
tar zxvf elasticsearch-%ELASTICSEARCH_VERSION%.tar.gz
cd elasticsearch-%ELASTICSEARCH_VERSION%/
./bin/elasticsearch
----

NOTE: This tutorial specifies running Logstash %VERSION% with Elasticsearch %ELASTICSEARCH_VERSION%. Each release of Logstash has a *recommended* version of Elasticsearch to pair with. Make sure the versions match based on the http://www.elasticsearch.org/overview/logstash[Logstash version] you're running!

More detailed information on installing and configuring Elasticsearch can be found on http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index.html[the Elasticsearch reference pages]. However, for the purposes of Getting Started with Logstash, the default installation and configuration of Elasticsearch should be sufficient.

Now that we have Elasticsearch running on port 9200 (we do, right?), Logstash can be simply configured to use Elasticsearch as its backend. The defaults for both Logstash and Elasticsearch are fairly sane and well thought out, so we can omit the optional configurations within the elasticsearch output:
----
bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } }'
----

Type something, and Logstash will process it as before (this time you won't see any output, since we don't have the stdout output configured):
----
you know, for logs
----

You can confirm that Elasticsearch actually received the data by making a curl request and inspecting the return:
----
curl 'http://localhost:9200/_search?pretty'
----

which should return something like this:
----
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 1.0,
    "hits" : [ {
      "_index" : "logstash-2013.11.21",
      "_type" : "logs",
      "_id" : "2ijaoKqARqGvbMgP3BspJA",
      "_score" : 1.0, "_source" : {"message":"you know, for logs","@timestamp":"2013-11-21T18:45:09.862Z","@version":"1","host":"my-laptop"}
    } ]
  }
}
----

Congratulations! You've successfully stashed logs in Elasticsearch via Logstash.

=== Elasticsearch Plugins (an aside)
Another very useful tool for querying your Logstash data (and Elasticsearch in general) is the elasticsearch-kopf plugin. Here is more information on http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-plugins.html[Elasticsearch plugins]. To install elasticsearch-kopf, simply issue the following command in your Elasticsearch directory (the same one in which you ran Elasticsearch earlier):
----
bin/plugin -install lmenezes/elasticsearch-kopf
----
Now you can browse to http://localhost:9200/_plugin/kopf[http://localhost:9200/_plugin/kopf] to browse your Elasticsearch data, settings and mappings!

=== Multiple Outputs
As a quick exercise in configuring multiple Logstash outputs, let's invoke Logstash again, using both the 'stdout' as well as the 'elasticsearch' output:
----
bin/logstash -e 'input { stdin { } } output { elasticsearch { host => localhost } stdout { } }'
----
Typing a phrase will now echo it back to your terminal, as well as save it in Elasticsearch! (Feel free to verify this using curl, kibana or elasticsearch-kopf.)

=== Default - Daily Indices
You might notice that Logstash was smart enough to create a new index in Elasticsearch... The default index name is in the form of 'logstash-YYYY.MM.DD', which essentially creates one index per day. At midnight UTC, Logstash will automagically rotate the index to a fresh new one, with the new current day's timestamp. This allows you to keep windows of data, based on how far retroactively you'd like to query your log data. Of course, you can always archive (or re-index) your data to an alternate location, where you are able to query further into the past. If you'd like to simply delete old indices after a certain time period, you can use the https://github.com/elasticsearch/curator[Elasticsearch Curator tool].
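
For example, once you no longer need a given day's index, it can be removed with a single (irreversible!) call to the Elasticsearch delete-index API; the index name below is just an illustration:
----
curl -XDELETE 'http://localhost:9200/logstash-2013.10.01'
----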

== Moving On
Now you're ready for more advanced configurations. At this point, it makes sense for a quick discussion of some of the core features of Logstash, and how they interact with the Logstash engine.

=== The Life of an Event

Inputs, Outputs, Codecs and Filters are at the heart of the Logstash configuration. By creating a pipeline of event processing, Logstash is able to extract the relevant data from your logs and make it available to Elasticsearch, in order to efficiently query your data. To get you thinking about the various options available in Logstash, let's discuss some of the more common configurations currently in use. For more details, read about http://logstash.net/docs/latest/life-of-an-event[the Logstash event pipeline].

==== Inputs
Inputs are the mechanism for passing log data to Logstash. Some of the more useful, commonly-used ones are:

* *file*: reads from a file on the filesystem, much like the UNIX command "tail -0F"
* *syslog*: listens on the well-known port 514 for syslog messages and parses according to the RFC3164 format
* *redis*: reads from a redis server, using both redis channels and also redis lists. Redis is often used as a "broker" in a centralized Logstash installation, which queues Logstash events from remote Logstash "shippers".
* *lumberjack*: processes events sent in the lumberjack protocol. Now called https://github.com/elasticsearch/logstash-forwarder[logstash-forwarder].

==== Filters
Filters are used as intermediary processing devices in the Logstash chain. They are often combined with conditionals in order to perform a certain action on an event, if it matches particular criteria. Some useful filters:

* *grok*: parses arbitrary text and structures it. Grok is currently the best way in Logstash to parse unstructured log data into something structured and queryable. With 120 patterns shipped built-in to Logstash, it's more than likely you'll find one that meets your needs!
* *mutate*: The mutate filter allows you to do general mutations to fields. You can rename, remove, replace, and modify fields in your events.
* *drop*: drop an event completely, for example, 'debug' events (see the sketch after this list).
* *clone*: make a copy of an event, possibly adding or removing fields.
* *geoip*: adds information about the geographical location of IP addresses (and displays amazing charts in kibana)
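
As a minimal sketch of that drop example (the `loglevel` field name is hypothetical here and would come from your own grok patterns), a conditional wrapped around the drop filter discards matching events:
----
filter {
  if [loglevel] == "debug" {
    drop { }
  }
}
----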

==== Outputs
Outputs are the final phase of the Logstash pipeline. An event may pass through multiple outputs during processing, but once all outputs are complete, the event has finished its execution. Some commonly used outputs include:

* *elasticsearch*: If you're planning to save your data in an efficient, convenient and easily queryable format... Elasticsearch is the way to go. Period. Yes, we're biased :)
* *file*: writes event data to a file on disk.
* *graphite*: sends event data to graphite, a popular open source tool for storing and graphing metrics. http://graphite.wikidot.com/
* *statsd*: a service which "listens for statistics, like counters and timers, sent over UDP and sends aggregates to one or more pluggable backend services". If you're already using statsd, this could be useful for you!

==== Codecs
Codecs are basically stream filters which can operate as part of an input, or an output. Codecs allow you to easily separate the transport of your messages from the serialization process. Popular codecs include 'json', 'msgpack' and 'plain' (text); a minimal example follows the list below.

* *json*: encode / decode data in JSON format
* *multiline*: takes multiple-line text events and merges them into a single event, e.g. java exception and stacktrace messages
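
For instance, a sketch showing a codec attached inside an input (stdin with the json codec is just one combination; any input or output that supports codecs works the same way):
----
input {
  stdin { codec => json }
}
----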

For the complete list of (current) configurations, visit the Logstash "plugin configuration" section of the http://www.elasticsearch.org/overview/logstash[Logstash documentation page].

== More fun with Logstash
=== Persistent Configuration files

Specifying configurations on the command line using '-e' is only so helpful, and more advanced setups will require more lengthy, long-lived configurations. First, let's create a simple configuration file, and invoke Logstash using it. Create a file named "logstash-simple.conf" and save it in the same directory as Logstash.

----
input { stdin { } }
output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
----

Then, run this command:

----
bin/logstash -f logstash-simple.conf
----

Et voilà! Logstash will read in the configuration file you just created and run as in the example we saw earlier. Note that we used the '-f' flag to read in the file, rather than the '-e' flag to read the configuration from the command line. This is a very simple case, of course, so let's move on to some more complex examples.

=== Testing Your Configuration Files

After creating a new or complex configuration file, it can be helpful to quickly test that the file is formatted correctly. We can verify that our configuration file is formatted correctly by using the *--configtest* flag.

----
bin/logstash -f logstash-simple.conf --configtest
----

=== Filters
Filters are an in-line processing mechanism which provide the flexibility to slice and dice your data to fit your needs. Let's see one in action, namely the *grok filter*. Save the following as "logstash-filter.conf":

----
input { stdin { } }

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
----
Run Logstash with this configuration:

----
bin/logstash -f logstash-filter.conf
----

Now paste this line into the terminal (so it will be processed by the stdin input):
----
127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] "GET /xampp/status.php HTTP/1.1" 200 3891 "http://cadenza/xampp/navi.php" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0"
----
You should see something returned to STDOUT which looks like this:
----
{
        "message" => "127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] \"GET /xampp/status.php HTTP/1.1\" 200 3891 \"http://cadenza/xampp/navi.php\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\"",
     "@timestamp" => "2013-12-11T08:01:45.000Z",
       "@version" => "1",
           "host" => "cadenza",
       "clientip" => "127.0.0.1",
          "ident" => "-",
           "auth" => "-",
      "timestamp" => "11/Dec/2013:00:01:45 -0800",
           "verb" => "GET",
        "request" => "/xampp/status.php",
    "httpversion" => "1.1",
       "response" => "200",
          "bytes" => "3891",
       "referrer" => "\"http://cadenza/xampp/navi.php\"",
          "agent" => "\"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:25.0) Gecko/20100101 Firefox/25.0\""
}
----
As you can see, Logstash (with help from the *grok* filter) was able to parse the log line (which happens to be in Apache "combined log" format) and break it up into many different discrete bits of information. This will be extremely useful later when we start querying and analyzing our log data... for example, we'll be able to run reports on HTTP response codes, IP addresses, referrers, etc. very easily. There are quite a few grok patterns included with Logstash out-of-the-box, so it's quite likely that if you're attempting to parse a fairly common log format, someone has already done the work for you. For more details, see the list of https://github.com/logstash/logstash/blob/master/patterns/grok-patterns[logstash grok patterns] on github.

The other filter used in this example is the *date* filter. This filter parses out a timestamp and uses it as the timestamp for the event (regardless of when you're ingesting the log data). You'll notice that the @timestamp field in this example is set to December 11, 2013, even though Logstash is ingesting the event at some point afterwards. This is handy when backfilling logs, for example... the ability to tell Logstash "use this value as the timestamp for this event". For non-English installations, you may have to specify the locale in the date filter (`locale => en`).
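
A sketch of that date filter with the locale pinned (useful when the month abbreviations in your logs are English but your system locale is not):
----
date {
  match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  locale => "en"
}
----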

== Useful Examples

=== Apache logs (from files)
Now, let's configure something actually *useful*... apache2 access log files! We are going to read the input from a file on the localhost, and use a *conditional* to process the event according to our needs. First, create a file called something like 'logstash-apache.conf' with the following contents (you'll need to change the log's file path to suit your needs):

----
input {
  file {
    path => "/tmp/access_log"
    start_position => "beginning"
  }
}

filter {
  if [path] =~ "access" {
    mutate { replace => { "type" => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
  }
  date {
    match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

output {
  elasticsearch {
    host => localhost
  }
  stdout { codec => rubydebug }
}

----
Then, create the file you configured above (in this example, "/tmp/access_log") with the following log lines as contents (or use some from your own webserver):

----
71.141.244.242 - kurt [18/May/2011:01:48:10 -0700] "GET /admin HTTP/1.1" 301 566 "-" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3"
134.39.72.245 - - [18/May/2011:12:40:18 -0700] "GET /favicon.ico HTTP/1.1" 200 1189 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0C; .NET4.0E)"
98.83.179.51 - - [18/May/2011:19:35:08 -0700] "GET /css/main.css HTTP/1.1" 200 1837 "http://www.safesand.com/information.htm" "Mozilla/5.0 (Windows NT 6.0; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"
----

Now run it with the -f flag as in the last example:
----
bin/logstash -f logstash-apache.conf
----
You should be able to see your apache log data in Elasticsearch now! You'll notice that Logstash opened the file you configured, and read through it, processing any events it encountered. Any additional lines logged to this file will also be captured, processed by Logstash as events and stored in Elasticsearch. As an added bonus, they will be stashed with the field "type" set to "apache_access" (this is done by the mutate filter's `replace => { "type" => "apache_access" }` line in the filter configuration).

In this configuration, Logstash is only watching the apache access_log, but it's easy enough to watch both the access_log and the error_log (actually, any file matching '*log'), by changing one line in the above configuration, like this:

----
input {
  file {
    path => "/tmp/*_log"
...
----
Now, rerun Logstash, and you will see both the error and access logs processed via Logstash. However, if you inspect your data (using elasticsearch-kopf, perhaps), you will see that the access_log was broken up into discrete fields, but not the error_log. That's because we used a "grok" filter to match the standard combined apache log format and automatically split the data into separate fields. Wouldn't it be nice *if* we could control how a line was parsed, based on its format? Well, we can...

Also, you might have noticed that Logstash did not reprocess the events which were already seen in the access_log file. Logstash is able to save its position in files, only processing new lines as they are added to the file. Neat!
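
A sketch of pinning that position file somewhere explicit (the path below is arbitrary; by default the file input keeps this "sincedb" state in the home directory of the user running Logstash):
----
input {
  file {
    path => "/tmp/access_log"
    sincedb_path => "/var/lib/logstash/sincedb"
  }
}
----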

=== Conditionals
Now we can build on the previous example, where we introduced the concept of a *conditional*. Conditionals should be familiar to most Logstash users, in the general sense: you may use 'if', 'else if' and 'else' statements, as in many other programming languages. Let's label each event according to which file it appeared in (access_log, error_log and other random files which end with "log").

----
input {
  file {
    path => "/tmp/*_log"
  }
}

filter {
  if [path] =~ "access" {
    mutate { replace => { type => "apache_access" } }
    grok {
      match => { "message" => "%{COMBINEDAPACHELOG}" }
    }
    date {
      match => [ "timestamp" , "dd/MMM/yyyy:HH:mm:ss Z" ]
    }
  } else if [path] =~ "error" {
    mutate { replace => { type => "apache_error" } }
  } else {
    mutate { replace => { type => "random_logs" } }
  }
}

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
----

You'll notice we've labeled all events using the "type" field, but we didn't actually parse the "error" or "random" files... There are so many types of error logs that it's better left as an exercise for you, depending on the logs you're seeing.

=== Syslog
OK, now we can move on to another incredibly useful example: *syslog*. Syslog is one of the most common use cases for Logstash, and one it handles exceedingly well (as long as the log lines conform roughly to RFC3164 :). Syslog is the de facto UNIX networked logging standard, sending messages from client machines to a local file, or to a centralized log server via rsyslog. For this example, you won't need a functioning syslog instance; we'll fake it from the command line, so you can get a feel for what happens.

First, let's make a simple configuration file for Logstash + syslog, called 'logstash-syslog.conf'.

----
input {
  tcp {
    port => 5000
    type => syslog
  }
  udp {
    port => 5000
    type => syslog
  }
}

filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
      add_field => [ "received_at", "%{@timestamp}" ]
      add_field => [ "received_from", "%{host}" ]
    }
    syslog_pri { }
    date {
      match => [ "syslog_timestamp", "MMM  d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}

output {
  elasticsearch { host => localhost }
  stdout { codec => rubydebug }
}
----
Run it as normal:
----
bin/logstash -f logstash-syslog.conf
----
Normally, a client machine would connect to the Logstash instance on port 5000 and send its message. In this simplified case, we're simply going to telnet to Logstash and enter a log line (similar to how we entered log lines into STDIN earlier). First, open another shell window to interact with the Logstash syslog input and type the following command:

----
telnet localhost 5000
----

You can copy and paste the following lines as samples (feel free to try some of your own, but keep in mind they might not parse if the grok filter is not correct for your data):

----
Dec 23 12:11:43 louis postfix/smtpd[31499]: connect from unknown[95.75.93.154]
Dec 23 14:42:56 louis named[16000]: client 199.48.164.7#64817: query (cache) 'amsterdamboothuren.com/MX/IN' denied
Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)
Dec 22 18:28:06 louis rsyslogd: [origin software="rsyslogd" swVersion="4.2.0" x-pid="2253" x-info="http://www.rsyslog.com"] rsyslogd was HUPed, type 'lightweight'.
----

Now you should see the output of Logstash in your original shell as it processes and parses messages!

----
{
                 "message" => "Dec 23 14:30:01 louis CRON[619]: (www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)",
              "@timestamp" => "2013-12-23T22:30:01.000Z",
                "@version" => "1",
                    "type" => "syslog",
                    "host" => "0:0:0:0:0:0:0:1:52617",
        "syslog_timestamp" => "Dec 23 14:30:01",
         "syslog_hostname" => "louis",
          "syslog_program" => "CRON",
              "syslog_pid" => "619",
          "syslog_message" => "(www-data) CMD (php /usr/share/cacti/site/poller.php >/dev/null 2>/var/log/cacti/poller-error.log)",
             "received_at" => "2013-12-23 22:49:22 UTC",
           "received_from" => "0:0:0:0:0:0:0:1:52617",
    "syslog_severity_code" => 5,
    "syslog_facility_code" => 1,
         "syslog_facility" => "user-level",
         "syslog_severity" => "notice"
}
----

Congratulations! You're well on your way to being a real Logstash power user. You should be comfortable configuring, running and sending events to Logstash, but there's much more to explore.

@ -1,201 +0,0 @@
---
title: Just Enough RabbitMQ - logstash
layout: content_right
---

While configuring your RabbitMQ broker is out of scope for logstash, it's important
to understand how logstash uses RabbitMQ. To do that, we need to understand a
little about AMQP.

You should also consider reading
[this](http://www.rabbitmq.com/tutorials/amqp-concepts.html) at the RabbitMQ
website.

# Exchanges, queues and bindings; OH MY!

You can get a long way by understanding a few key terms.

## Exchanges

Exchanges are for message **producers**. In Logstash, we map these to
**outputs**. Logstash puts messages on exchanges. There are many types of
exchanges and they are discussed below.

## Queues

Queues are for message **consumers**. In Logstash, we map these to inputs.
Logstash reads messages from queues. Optionally, queues can consume only a
subset of messages. This is done with "routing keys".

## Bindings

Just having a producer and a consumer is not enough. We must `bind` a queue to
an exchange. When we bind a queue to an exchange, we can optionally provide a
routing key. Routing keys are discussed below.

## Broker

A broker is simply the AMQP server software. There are several brokers, but this
tutorial will cover the most common (and arguably most popular), [RabbitMQ](http://www.rabbitmq.com).

# Routing Keys

Simply put, routing keys are somewhat like tags for messages. In practice, they
are hierarchical in nature, with each level separated by a dot:

- `messages.servers.production`
- `sports.atlanta.baseball`
- `company.myorg.mydepartment`

Routing keys are really handy with a tool like logstash, where you
can programmatically define the routing key for a given event using the metadata that logstash provides:

- `logs.servers.production.host1`
- `logs.servers.development.host1.syslog`
- `logs.servers.application_foo.critical`

From a consumer/queue perspective, routing keys also support two types of wildcards - `#` and `*`.

- `*` (asterisk) matches any single word.
- `#` (hash) matches any number of words and behaves like a traditional wildcard.

Using the above examples, if you wanted to bind to an exchange and see messages
for just production, you would use the routing key `logs.servers.production.*`.
If you wanted to see messages for host1, regardless of environment, you could
use `logs.servers.*.host1.#`.

Wildcards can be a bit confusing, but a good general rule to follow is to use
`*` in places where you need wildcards for a known element. Use `#` when you
need to match any remaining placeholders. Note that wildcards in routing keys
only make sense on the consumer/queue binding, not on the publishing/exchange
side.

We'll get into some of that neat stuff below. For now, it's enough to
understand the general idea behind routing keys.

# Exchange types

There are three primary types of exchanges that you'll see.

## Direct

A direct exchange is one that is probably most familiar to people. A message
comes in and, assuming there is a queue bound, the message is picked up. You
can have multiple queues bound to the same direct exchange. The best way to
understand this pattern is a pool of workers (queues) that read from a direct
exchange to get units of work. Only one consumer will see a given message in a
direct exchange.

You can set routing keys on messages published to a direct exchange. This
allows you to have workers that do different tasks read from the same global
pool of messages yet consume only the ones they know how to handle.

The RabbitMQ concepts guide (linked below) does a good job of describing this
visually
[here](http://www.rabbitmq.com/img/tutorials/intro/exchange-direct.png).

## Fanout

Fanouts are another type of exchange. Unlike direct exchanges, every queue
bound to a fanout exchange will see the same messages. This is best described
as a PUB/SUB pattern. This is helpful when you need to broadcast messages to
multiple interested parties.

Fanout exchanges do NOT support routing keys. All bound queues see all
messages.

## Topic

Topic exchanges are a special type of fanout exchange. Fanout exchanges don't
support routing keys. Topic exchanges do support them. Just like a fanout
exchange, all bound queues see all messages, with the additional filter of the
routing key.

# RabbitMQ in logstash

As stated earlier, in Logstash, Outputs publish to Exchanges. Inputs read from
Queues that are bound to Exchanges. Logstash uses the `bunny` RabbitMQ library for
interaction with a broker. Logstash endeavors to expose as much of the
configuration for both exchanges and queues as possible. There are many different tunables
that you might be concerned with setting - including things like message
durability or persistence of declared queues/exchanges. See the relevant input
and output documentation for RabbitMQ for a full list of tunables.

# Sample configurations, tips, tricks and gotchas

There are several examples in the logstash source directory of RabbitMQ usage,
however a few general rules might help eliminate any issues.

## Check your bindings

If logstash is publishing the messages and logstash is consuming the messages,
the `exchange` value for the input should match the `exchange` in the output.

sender agent

    input { stdin { type => "test" } }
    output {
      rabbitmq {
        exchange => "test_exchange"
        host => "my_rabbitmq_server"
        exchange_type => "fanout"
      }
    }

receiver agent

    input {
      rabbitmq {
        queue => "test_queue"
        host => "my_rabbitmq_server"
        exchange => "test_exchange" # This matches the exchange declared above
      }
    }
    output { stdout { debug => true }}

## Message persistence

By default, logstash will attempt to ensure that you don't lose any messages.
This is reflected in the RabbitMQ default settings as well. However, there are
cases where you might not want this. A good example is where RabbitMQ is not your
primary method of shipping.

In the following example, we use RabbitMQ as a sniffing interface. Our primary
destination is the embedded Elasticsearch instance. We have a secondary RabbitMQ
output that we use for duplicating messages. However, we disable persistence and
durability on this interface so that messages don't pile up waiting for
delivery. We only use RabbitMQ when we want to watch messages in realtime.
Additionally, we're going to leverage routing keys so that we can optionally
filter incoming messages to subsets of hosts. The exercise of getting messages
to this logstash agent is left up to the user.

    input {
      # some input definition here
    }

    output {
      elasticsearch { embedded => true }
      rabbitmq {
        exchange => "logtail"
        host => "my_rabbitmq_server"
        exchange_type => "topic" # We use topic here to enable pub/sub with routing keys
        key => "logs.%{host}"
        durable => false # If rabbitmq restarts, the exchange disappears.
        auto_delete => true # If logstash disconnects, the exchange goes away
        persistent => false # Messages are not persisted to disk
      }
    }

Now if you want to stream logs in realtime, you can use the programming
language of your choice to bind a queue to the `logtail` exchange. If you do
not specify a routing key, you will see every message that comes in to
logstash. However, you can specify a routing key like `logs.apache1` and see
only messages from host `apache1`.
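
As a sketch of that consumer, you could even use another logstash agent in place of custom code; the settings below assume the rabbitmq input's `exchange` and `key` options as used elsewhere on this page:

    input {
      rabbitmq {
        host => "my_rabbitmq_server"
        exchange => "logtail"
        key => "logs.apache1"
      }
    }
    output { stdout { debug => true }}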

Note that any logstash variable is valid in the key definition. This allows you
to create really complex routing key hierarchies for advanced filtering.

Note that RabbitMQ has specific rules about durability and persistence matching
on both the queue and exchange. You should read the RabbitMQ documentation to
make sure you don't crash your RabbitMQ server with messages awaiting someone
to pick them up.
@ -1,84 +0,0 @@
---
title: Metrics from Logs - logstash
layout: content_right
---
# Pull metrics from logs

Logs are more than just text. How many customers signed up today? How many HTTP
errors happened this week? When was your last puppet run?

Apache logs give you the http response code and bytes sent - that's useful in a
graph. Metrics occur in logs so frequently there are piles of tools available to
help process them.

Logstash can help (and even replace some tools you might already be using).

## Example: Replacing Etsy's Logster

[Etsy](https://github.com/etsy) has some excellent open source tools. One of
them, [logster](https://github.com/etsy/logster), is meant to help you pull
metrics from logs and ship them to [graphite](http://graphite.wikidot.com/) so
you can make pretty graphs of those metrics.

One sample logster parser is one that pulls http response codes out of your
apache logs: [SampleLogster.py](https://github.com/etsy/logster/blob/master/logster/parsers/SampleLogster.py)

The above code is roughly 50 lines of python and solves only one specific
problem, in apache logs only: count http response codes by major number (1xx,
2xx, 3xx, etc). To be completely fair, you could shrink the code required for
a Logster parser, but size is not strictly the point, here.

## Keep it simple

Logstash can do more than the above, more simply, and without much coding skill:

    input {
      file {
        path => "/var/log/apache/access.log"
        type => "apache-access"
      }
    }

    filter {
      grok {
        type => "apache-access"
        pattern => "%{COMBINEDAPACHELOG}"
      }
    }

    output {
      statsd {
        # Count one hit every event by response
        increment => "apache.response.%{response}"
      }
    }

The above uses grok to parse fields out of apache logs and uses the statsd
output to increment counters based on the response code. Of course, now that we
are parsing apache logs fully, we can trivially add additional metrics:

    output {
      statsd {
        # Count one hit every event by response
        increment => "apache.response.%{response}"

        # Use the 'bytes' field from the apache log as the count value.
        count => [ "apache.bytes", "%{bytes}" ]
      }
    }

Now adding additional metrics is just one more line in your logstash config
file. BTW, the 'statsd' output writes to another Etsy tool,
[statsd](https://github.com/etsy/statsd), which helps build counters/latency
data and ship it to graphite for graphing.

Using the logstash config above and a bunch of apache access requests, you might end up
with a graph that looks like this:

![apache response codes graphed with graphite](media/frontend-response-codes.png)

The point made above is not "logstash is better than Logster" - the point is
that logstash is a general-purpose log management and pipelining tool and that
while you can centralize logs with logstash, you can read, modify, and write
them to and from just about anywhere.

@ -1,118 +0,0 @@
---
title: ZeroMQ - logstash
layout: content_right
---

*ZeroMQ support in Logstash is currently in an experimental phase. As such, parts of this document are subject to change.*

# ZeroMQ
Simply put, ZeroMQ (0mq) is a socket on steroids. This makes it a perfect complement to Logstash - a pipe on steroids.

ZeroMQ allows you to easily create sockets of various types for moving data around. These sockets are referred to in ZeroMQ by the behavior of each side of the socket pair:

* PUSH/PULL
* REQ/REP
* PUB/SUB
* ROUTER/DEALER

There is also a `PAIR` socket type.

Additionally, the socket type is independent of the connection method. A PUB/SUB socket pair could have the SUB side of the socket be a listener and the PUB side a connecting client. This makes it very easy to fit ZeroMQ into various firewalled architectures.

Note that this is not a full-fledged tutorial on ZeroMQ. It is a tutorial on how Logstash uses ZeroMQ.

# ZeroMQ and logstash
In the spirit of ZeroMQ, Logstash takes these socket type pairs and uses them to create topologies with some very simple rules that make usage very easy to understand:

* The receiving end of a socket pair is always a logstash input
* The sending end of a socket pair is always a logstash output
* By default, inputs `bind`/listen and outputs `connect`
* Logstash refers to the socket pairs as topologies and mirrors the naming scheme from ZeroMQ
* By default, ZeroMQ inputs listen on all interfaces on port 2120, ZeroMQ outputs connect to `localhost` on port 2120

The currently understood Logstash topologies for ZeroMQ inputs and outputs are:

* `pushpull`
* `pubsub`
* `pair`

We have found from various discussions that these three topologies will cover most users' needs. We hope to expose the full span of ZeroMQ socket types as time goes on.

By keeping the options simple, this allows you to get started VERY easily with what are normally complex message flows. No more confusion over `exchanges` and `queues` and `brokers`. If you need to add fanout capability to your flow, you can simply use the following configs:

* _node agent lives at 192.168.1.2_
* _indexer agent lives at 192.168.1.1_

    # Node agent config
    input { stdin { type => "test-stdin-input" } }
    output { zeromq { topology => "pubsub" address => "tcp://192.168.1.1:2120" } }

    # Indexer agent config
    input { zeromq { topology => "pubsub" } }
    output { stdout { debug => true }}

If for some reason you need connections to initiate from the indexer because of firewall rules:

    # Node agent config - now listening on all interfaces port 2120
    input { stdin { type => "test-stdin-input" } }
    output { zeromq { topology => "pubsub" address => "tcp://*:2120" mode => "server" } }

    # Indexer agent config
    input { zeromq { topology => "pubsub" address => "tcp://192.168.1.2:2120" mode => "client" } }
    output { stdout { debug => true }}

As stated above, by default `inputs` always start as listeners and `outputs` always start as initiators. Please don't confuse what happens once the socket is connected with the direction of the connection. ZeroMQ separates connection from topology. In the second case of the above configs, once the two sockets are connected, regardless of who initiated the connection, the message flow itself is absolute: the indexer is reading events from the node.

# Which topology to use
The choice of topology can be broken down very easily based on need.

## one to one
Use `pair` topology. On the output side, specify the IP address and port of the input side.

## broadcast
Use `pubsub`.
If you need to broadcast ALL messages to multiple hosts that each need to see all events, use `pubsub`. Note that all events are broadcast to all subscribers. When using `pubsub` you might also want to investigate the `topic` configuration option which allows subscribers to see only a subset of messages.

## Filter workers
Use `pushpull`.
In `pushpull`, ZeroMQ automatically load balances to all connected peers. This means that no peer sees the same message as any other peer.
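
A sketch of that worker pattern, reusing the `mode => "server"` / `mode => "client"` options shown earlier (the shipper's address is hypothetical; every worker runs the same config and connects back to it):

    # Shipper config - one pushpull output, acting as the listening side
    output { zeromq { topology => "pushpull" address => "tcp://*:2120" mode => "server" } }

    # Worker config (identical on every worker) - connect back to the shipper
    input { zeromq { topology => "pushpull" address => "tcp://192.168.1.2:2120" mode => "client" } }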

# What's with the address format?
ZeroMQ supports multiple types of transports:

* inproc:// (unsupported by logstash due to threading)
* tcp:// (exactly what it sounds like)
* ipc:// (probably useless in logstash)
* pgm:// and epgm:// (a multicast format - only usable with PUB and SUB socket types)

For pretty much all cases, you'll be using `tcp://` transports with Logstash.

## Topic - applies to `pubsub`
This option mimics the routing keys functionality in AMQP. Imagine you have a network of receivers but only a subset of the messages need to be seen by a subset of the hosts. You can use this option as a routing key to facilitate that:

    # This output is a PUB
    output {
      zeromq { topology => "pubsub" topic => "logs.production.%{host}" }
    }

    # This input is a SUB
    # I only care about db1 logs
    input { zeromq { topology => "pubsub" type => "db1logs" address => "tcp://<ipaddress>:2120" topic => "logs.production.db1"}}

One important thing to note about 0mq PUBSUB and topics is that all filtering is done on the subscriber side. The subscriber will get ALL messages but discard any that don't match the topic.

Also important to note is that 0mq doesn't do topics in the same sense as an AMQP broker might. When a SUB socket gets a message, it compares the first bytes of the message against the topic. However, this isn't always flexible depending on the format of your message. The common practice, then, is to send a 0mq multipart message and make the first part the topic. The next parts become the actual message body.

This approach is how logstash handles topics. When using PUBSUB, Logstash will send a multipart message where the first part is the name of the topic and the second part is the event. This is important to know if you are sending to a SUB input from sources other than Logstash.

# sockopts
Sockopts is not you choosing between blue or black socks. ZeroMQ supports setting various flags or options on sockets. In the interest of minimizing configuration syntax, these are _hidden_ behind a logstash configuration element called `sockopts`. You probably won't need to tune these for most cases. If you do need to tune them, you'll probably set the following:

## ZMQ::HWM - sets the high water mark
The high water mark is the maximum number of messages a given socket pair can have in its internal queue. Essentially, use this to throttle.
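
A rough sketch of setting it; the hash form of `sockopts` and the exact key spelling are assumptions here, so check the zeromq input/output docs for your version before relying on this:

    output {
      zeromq {
        topology => "pushpull"
        sockopts => { "ZMQ::HWM" => 1000 } # cap the internal queue at 1000 messages
      }
    }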

## ZMQ::SWAP_SIZE
TODO

## ZMQ::IDENTITY
TODO