Encoding fallback
Note: This is only known to work in Logstash >= 1.4.
This scenario applies when you want to dynamically split the query parameters of a request into separate values.
However, the parameters are URI encoded, and because they come from the open internet you can't be sure whether they contain UTF-8 or ISO-8859-1 encoded strings.
The kv plugin wisely accepts only UTF-8 strings. If you are not certain that the urldecode filter gives you a valid UTF-8 string, you can use the ruby filter to conditionally convert it from ISO-8859-1 to UTF-8, which keeps the kv plugin happy.
# break down the query_string:
if [http_request_querystr] {
  # work on a copy so the original field stays untouched
  mutate {
    add_field => [ "http_request_querystr_decoded", "%{http_request_querystr}" ]
  }
  urldecode {
    field => "http_request_querystr_decoded"
  }
  # if it's not valid UTF-8, we force a conversion from ISO-8859-1 to UTF-8:
  # (event["..."] is the pre-5.0 event API; Logstash 5.0+ uses event.get / event.set)
  ruby {
    code => 'if ! event["http_request_querystr_decoded"].valid_encoding? ; event["http_request_querystr_decoded"] = event["http_request_querystr_decoded"].encode("UTF-8", "ISO-8859-1", :invalid => :replace, :undef => :replace ) ; end'
  }
  # split the decoded string into key/value pairs
  kv {
    field_split => "&?"
    target => "http_request_querystr_params"
    source => "http_request_querystr_decoded"
  }
  # drop the temporary copy
  mutate {
    remove_field => [ "http_request_querystr_decoded" ]
  }
}
Note2: If the original string is neither ISO-8859-1 nor UTF-8, you will still end up with a valid UTF-8 string, but it will likely contain garbage for the characters that could not be converted safely.
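To see what the ruby filter's fallback does, it can help to try the same logic in plain Ruby, outside Logstash. The sketch below is only an illustration: the helper name to_utf8_with_fallback and the sample byte strings are assumptions made up for this example, not part of Logstash or of the recipe above. It uses the same valid_encoding? check and the same encode("UTF-8", "ISO-8859-1", ...) call as the filter.

```ruby
# Hypothetical helper (not part of Logstash) mirroring the ruby filter's logic.
def to_utf8_with_fallback(raw)
  # Tag a copy as UTF-8 so valid_encoding? checks the bytes against UTF-8.
  candidate = raw.dup.force_encoding(Encoding::UTF_8)
  return candidate if candidate.valid_encoding?

  # Not valid UTF-8: treat the bytes as ISO-8859-1 and transcode them.
  candidate.encode("UTF-8", "ISO-8859-1", invalid: :replace, undef: :replace)
end

# Sample inputs, assumed for illustration only.
utf8_input   = "caf\xC3\xA9".b  # "café" as UTF-8 bytes
latin1_input = "caf\xE9".b      # "café" as ISO-8859-1 bytes
cp1252_input = "\x80".b         # euro sign in Windows-1252, a "neither" case

puts to_utf8_with_fallback(utf8_input)    # already valid, returned unchanged
puts to_utf8_with_fallback(latin1_input)  # converted from ISO-8859-1

# The Windows-1252 byte still yields a valid UTF-8 string,
# but not the euro sign the sender meant (the Note2 caveat).
converted = to_utf8_with_fallback(cp1252_input)
puts converted.valid_encoding?
puts converted == "\u20AC"
```

The first two inputs both come out as the same readable string, while the third shows why a string in some other encoding passes the kv plugin's UTF-8 requirement yet carries the wrong character.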