Doc:Clarify how Bulk API interacts with DLQ (#12209) (#12268)

Documents both HTTP success and HTTP failure scenarios.
Co-authored-by: Ry Biesemeyer <ry.biesemeyer@elastic.co>

Backports #12209
This commit is contained in:
Karen Metts 2020-09-23 14:10:18 -04:00 committed by GitHub
parent 9472d89363
commit b287ec72e2
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23

View file

@ -1,37 +1,52 @@
[[dead-letter-queues]]
=== Dead Letter Queues
=== Dead Letter Queues (DLQ)
NOTE: The dead letter queue feature is currently supported for the
<<plugins-outputs-elasticsearch>> output only. Additionally, The dead
letter queue is only used where the response code is either 400
or 404, both of which indicate an event that cannot be retried.
Support for additional outputs will be available in future releases of the
Logstash plugins. Before configuring Logstash to use this feature, refer to
the output plugin documentation to verify that the plugin supports the dead
letter queue feature.
The dead letter queue (DLQ) can provide another layer of data resilience.
By default, when Logstash encounters an event that it cannot process because the
data contains a mapping error or some other issue, the Logstash pipeline
either hangs or drops the unsuccessful event. In order to protect against data
loss in this situation, you can <<configuring-dlq,configure Logstash>> to write
unsuccessful events to a dead letter queue instead of dropping them.
unsuccessful events to a dead letter queue instead of dropping them.
Each event written to the dead letter queue includes the original event, along
with metadata that describes the reason the event could not be processed,
information about the plugin that wrote the event, and the timestamp for when
the event entered the dead letter queue.
NOTE: The dead letter queue is currently supported only for the
<<plugins-outputs-elasticsearch,{es} output>>. The dead letter queue is used for
documents with response codes of 400 or 404, both of which indicate an event
that cannot be retried.
To process events in the dead letter queue, you simply create a Logstash
pipeline configuration that uses the
Each event written to the dead letter queue includes the original event,
metadata that describes the reason the event could not be processed, information
about the plugin that wrote the event, and the timestamp when the event
entered the dead letter queue.
To process events in the dead letter queue, create a Logstash pipeline
configuration that uses the
<<plugins-inputs-dead_letter_queue,`dead_letter_queue` input plugin>> to read
from the queue.
from the queue. See <<processing-dlq-events>> for more information.
image::static/images/dead_letter_queue.png[Diagram showing pipeline reading from the dead letter queue]
See <<processing-dlq-events>> for more information.
[[es-proc-dlq]]
==== {es} processing and the dead letter queue
**HTTP request failure.** If the HTTP request fails (because {es} is unreachable
or because it returned an HTTP error code), the {es} output retries the entire
request indefinitely. In these scenarios, the dead letter queue has no
opportunity to intercept.
**HTTP request success.** The {ref}/docs-bulk.html[{es} Bulk API] can perform
multiple actions using the same request. If the Bulk API request is successful,
it returns `200 OK`, even if some documents in the batch have
{ref}/docs-bulk.html#bulk-failures-ex[failed]. In this situation, the `errors`
flag for the request will be `true`.
The response body can include metadata indicating that one or more specific
actions in the bulk request could not be performed, along with an HTTP-style
status code per entry to indicate why the action could not be performed.
If the DLQ is configured, individual indexing failures are routed there.
[[configuring-dlq]]
==== Configuring Logstash to Use Dead Letter Queues
==== Configuring {ls} to use dead letter queues
Dead letter queues are disabled by default. To enable dead letter queues, set
the `dead_letter_queue_enable` option in the `logstash.yml`
@ -61,7 +76,7 @@ path.dead_letter_queue: "path/to/data/dead_letter_queue"
NOTE: You may not use the same `dead_letter_queue` path for two different
Logstash instances.
===== File Rotation
===== File rotation
Dead letter queues have a built-in file rotation policy that manages the file
size of the queue. When the file size reaches a preconfigured threshold, a new
@ -73,7 +88,7 @@ will be dropped if they would increase the size of the dead letter queue beyond
this setting.
[[processing-dlq-events]]
==== Processing Events in the Dead Letter Queue
==== Processing events in the dead letter queue
When you are ready to process events in the dead letter queue, you create a
pipeline that uses the
@ -131,7 +146,7 @@ will not be resubmitted to the dead letter queue if they cannot be processed
correctly.
[[dlq-timestamp]]
==== Reading From a Timestamp
==== Reading from a timestamp
When you read from the dead letter queue, you might not want to process all the
events in the queue, especially if there are a lot of old events in the queue.
@ -154,7 +169,7 @@ For this example, the pipeline starts reading all events that were delivered to
the dead letter queue on or after June 6, 2017, at 23:40:37.
[[dlq-example]]
==== Example: Processing Data That Has Mapping Errors
==== Example: Processing data that has mapping errors
In this example, the user attempts to index a document that includes geo_ip data,
but the data cannot be processed because it contains a mapping error: