[[dead-letter-queues]]
=== Dead Letter Queues

NOTE: The dead letter queue feature is currently supported for the
<<plugins-outputs-elasticsearch>> output only. Support for additional outputs
will be available in future releases of the Logstash plugins. Before configuring
Logstash to use this feature, refer to the output plugin documentation to
verify that the plugin supports the dead letter queue feature.

By default, when Logstash encounters an event that it cannot process because the
data contains a mapping error or some other issue, the Logstash pipeline
either hangs or drops the unsuccessful event. To protect against data loss in
this situation, you can <<configuring-dlq,configure Logstash>> to write
unsuccessful events to a dead letter queue instead of dropping them.

Each event written to the dead letter queue includes the original event, along
with metadata that describes the reason the event could not be processed,
information about the plugin that wrote the event, and the timestamp for when
the event entered the dead letter queue.

To process events in the dead letter queue, you create a Logstash pipeline
configuration that uses the
<<plugins-inputs-dead_letter_queue,`dead_letter_queue` input plugin>> to read
from the queue.

image::static/images/dead_letter_queue.png[Diagram showing a pipeline reading from the dead letter queue]

See <<processing-dlq-events>> for more information.

[[configuring-dlq]]
==== Configuring Logstash to Use Dead Letter Queues

Dead letter queues are disabled by default. To enable dead letter queues, set
the `dead_letter_queue.enable` option in the `logstash.yml`
<<logstash-settings-file,settings file>>:

[source,yaml]
-------------------------------------------------------------------------------
dead_letter_queue.enable: true
-------------------------------------------------------------------------------

Dead letter queues are stored as files in the local directory of the Logstash
instance. By default, the dead letter queue files are stored in
`path.data/dead_letter_queue`. Each pipeline has a separate queue. For example,
the dead letter queue for the `main` pipeline is stored in
`LOGSTASH_HOME/data/dead_letter_queue/main` by default. The queue files are
numbered sequentially: `1.log`, `2.log`, and so on.
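
For example, with the `main` pipeline and a second, hypothetical pipeline named
`backup` both writing to the dead letter queue, the layout under `path.data`
might look like this:

[source,txt]
-------------------------------------------------------------------------------
data/dead_letter_queue/
├── main/
│   ├── 1.log
│   └── 2.log
└── backup/
    └── 1.log
-------------------------------------------------------------------------------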

You can set `path.dead_letter_queue` in the `logstash.yml` file to
specify a different path for the files:

[source,yaml]
-------------------------------------------------------------------------------
path.dead_letter_queue: "path/to/data/dead_letter_queue"
-------------------------------------------------------------------------------

NOTE: You may not use the same `dead_letter_queue` path for two different
Logstash instances.

===== File Rotation

Dead letter queues have a built-in file rotation policy that manages the file
size of the queue. When the file size reaches a preconfigured threshold, a new
file is created automatically.

By default, the maximum size of each dead letter queue is set to 1024mb. To
change this setting, use the `dead_letter_queue.max_bytes` option, as shown in
the sketch below. Entries are dropped if they would increase the size of the
dead letter queue beyond this setting.
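
For example, to cap each dead letter queue at 2gb (an illustrative value), you
might add the following to the `logstash.yml` settings file:

[source,yaml]
-------------------------------------------------------------------------------
dead_letter_queue.max_bytes: 2gb
-------------------------------------------------------------------------------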

[[processing-dlq-events]]
==== Processing Events in the Dead Letter Queue

When you are ready to process events in the dead letter queue, you create a
pipeline that uses the
<<plugins-inputs-dead_letter_queue,`dead_letter_queue` input plugin>> to read
from the dead letter queue. The pipeline configuration that you use depends on
what you need to do. For example, if the dead letter queue contains events that
resulted from a mapping error in Elasticsearch, you can create a pipeline that
reads the "dead" events, removes the field that caused the mapping issue, and
re-indexes the clean events into Elasticsearch.

The following example shows a simple pipeline that reads events from the dead
letter queue and writes the events, including metadata, to standard output:

[source,yaml]
--------------------------------------------------------------------------------
input {
  dead_letter_queue {
    path => "/path/to/data/dead_letter_queue" <1>
    commit_offsets => true <2>
    pipeline_id => "main" <3>
  }
}

output {
  stdout {
    codec => rubydebug { metadata => true }
  }
}
--------------------------------------------------------------------------------

<1> The path to the top-level directory containing the dead letter queue. This
directory contains a separate folder for each pipeline that writes to the dead
letter queue. To find the path to this directory, look at the `logstash.yml`
<<logstash-settings-file,settings file>>. By default, Logstash creates the
`dead_letter_queue` directory under the location used for persistent
storage (`path.data`), for example, `LOGSTASH_HOME/data/dead_letter_queue`.
However, if `path.dead_letter_queue` is set, it uses that location instead.
<2> When `true`, saves the offset. When the pipeline restarts, it continues
reading from the position where it left off rather than reprocessing all the
items in the queue. You can set `commit_offsets` to `false` when you are
exploring events in the dead letter queue and want to iterate over the events
multiple times, as shown in the sketch after this list.
<3> The ID of the pipeline that's writing to the dead letter queue. The default
is `"main"`.

For another example, see <<dlq-example>>.

When the pipeline has finished processing all the events in the dead letter
queue, it will continue to run and process new events as they stream into the
queue. This means that you do not need to stop your production system to handle
events in the dead letter queue.

NOTE: Events emitted from the
<<plugins-inputs-dead_letter_queue,`dead_letter_queue` input plugin>>
will not be resubmitted to the dead letter queue if they cannot be processed
correctly.

[[dlq-timestamp]]
==== Reading From a Timestamp

When you read from the dead letter queue, you might not want to process all the
events in the queue, especially if the queue contains a lot of old events. You
can start processing events at a specific point in the queue by using the
`start_timestamp` option. This option configures the pipeline to start
processing events based on the timestamp of when they entered the queue:

[source,yaml]
--------------------------------------------------------------------------------
input {
  dead_letter_queue {
    path => "/path/to/data/dead_letter_queue"
    start_timestamp => "2017-06-06T23:40:37"
    pipeline_id => "main"
  }
}
--------------------------------------------------------------------------------

For this example, the pipeline starts reading all events that were delivered to
the dead letter queue on or after June 6, 2017, at 23:40:37.

[[dlq-example]]
==== Example: Processing Data That Has Mapping Errors

In this example, the user attempts to index a document that includes geo_ip data,
but the data cannot be processed because it contains a mapping error:

[source,json]
--------------------------------------------------------------------------------
{"geoip":{"location":"home"}}
--------------------------------------------------------------------------------

Indexing fails because the Logstash output plugin expects a `geo_point` object in
the `location` field, but the value is a string. The failed event is written to
the dead letter queue, along with metadata about the error that caused the
failure:

[source,json]
--------------------------------------------------------------------------------
{
  "@metadata" => {
    "dead_letter_queue" => {
      "entry_time" => #<Java::OrgLogstash::Timestamp:0x5b5dacd5>,
      "plugin_id" => "fb80f1925088497215b8d037e622dec5819b503e-4",
      "plugin_type" => "elasticsearch",
      "reason" => "Could not index event to Elasticsearch. status: 400, action: [\"index\", {:_id=>nil, :_index=>\"logstash-2017.06.22\", :_type=>\"logs\", :_routing=>nil}, 2017-06-22T01:29:29.804Z Suyogs-MacBook-Pro-2.local {\"geoip\":{\"location\":\"home\"}}], response: {\"index\"=>{\"_index\"=>\"logstash-2017.06.22\", \"_type\"=>\"logs\", \"_id\"=>\"AVzNayPze1iR9yDdI2MD\", \"status\"=>400, \"error\"=>{\"type\"=>\"mapper_parsing_exception\", \"reason\"=>\"failed to parse\", \"caused_by\"=>{\"type\"=>\"illegal_argument_exception\", \"reason\"=>\"illegal latitude value [266.30859375] for geoip.location\"}}}}"
    }
  },
  "@timestamp" => 2017-06-22T01:29:29.804Z,
  "@version" => "1",
  "geoip" => {
    "location" => "home"
  },
  "host" => "Suyogs-MacBook-Pro-2.local",
  "message" => "{\"geoip\":{\"location\":\"home\"}}"
}
--------------------------------------------------------------------------------

To process the failed event, you create the following pipeline that reads from
the dead letter queue and removes the mapping problem:

[source,json]
--------------------------------------------------------------------------------
input {
  dead_letter_queue {
    path => "/path/to/data/dead_letter_queue/" <1>
  }
}
filter {
  mutate {
    remove_field => "[geoip][location]" <2>
  }
}
output {
  elasticsearch {
    hosts => [ "localhost:9200" ] <3>
  }
}
--------------------------------------------------------------------------------

<1> The <<plugins-inputs-dead_letter_queue,`dead_letter_queue` input>> reads from the dead letter queue.
<2> The `mutate` filter removes the problem field called `location`.
<3> The clean event is sent to Elasticsearch, where it can be indexed because
the mapping issue is resolved.
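
To run the cleanup pipeline, you might save the configuration to a file and
start Logstash with it (a sketch; `dlq-reprocess.conf` is a hypothetical file
name):

[source,shell]
--------------------------------------------------------------------------------
bin/logstash -f dlq-reprocess.conf
--------------------------------------------------------------------------------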