The multiline codec will collapse multiline messages and merge them into a single event.
The original goal of this codec was to allow joining of multiline messages from files into a single event. For example, joining Java exception and stacktrace messages into a single event.
The config looks like this:
    input {
      stdin {
        codec => multiline {
          pattern => "pattern, a regexp"
          negate => "true" or "false"
          what => "previous" or "next"
        }
      }
    }
The pattern should match what you believe to be an indicator that the line is part of a multi-line event.
The what must be “previous” or “next” and indicates whether a matching line belongs with the previous line or with the next one.
The negate can be “true” or “false” (defaults to “false”). If “true”, a line that does not match the pattern counts as a match for the multiline codec and the what is applied to it. Conversely, a line that does match the pattern is then treated as a non-match.
For example, Java stack traces are multiline and usually have the message starting at the far-left, with each subsequent line indented. Do this:
    input {
      stdin {
        codec => multiline {
          pattern => "^\s"
          what => "previous"
        }
      }
    }
This says that any line starting with whitespace belongs to the previous line.
Another example is to merge lines not starting with a date into the previous line:
    input {
      file {
        path => "/var/log/someapp.log"
        codec => multiline {
          # Grok pattern names are valid! :)
          pattern => "^%{TIMESTAMP_ISO8601} "
          negate => true
          what => "previous"
        }
      }
    }
This says that any line not starting with a timestamp should be merged with the previous line.
One more common example is C line continuations (backslash). Here’s how to do that:
    input {
      stdin {
        codec => multiline {
          pattern => "\\$"
          what => "next"
        }
      }
    }
This says that any line ending with a backslash should be combined with the following line.
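To make the interaction of pattern, negate, and what in the three examples above concrete, here is a minimal Python sketch of the grouping rules. It is a model for illustration only, not the codec's actual code: the function name merge_lines is hypothetical, Python regular expressions stand in for the ones Logstash uses, grok pattern names such as %{TIMESTAMP_ISO8601} are not supported (a simplified date regexp is used instead), and the lines of an event are assumed to be joined with a newline.

```python
import re


def merge_lines(lines, pattern, what="previous", negate=False):
    """Group raw lines into events using pattern/negate/what style rules."""
    regex = re.compile(pattern)
    events = []   # finished events, each a newline-joined string
    pending = []  # lines collected for the event currently being built

    def flush():
        # Emit the event under construction, if there is one.
        if pending:
            events.append("\n".join(pending))
            pending.clear()

    for line in lines:
        # negate inverts the outcome of the pattern test.
        matched = (regex.search(line) is not None) != negate
        if what == "previous":
            # A non-matching line closes the current event and starts a new one.
            if not matched:
                flush()
            pending.append(line)
        elif what == "next":
            # A matching line keeps the event open so the next line joins it.
            pending.append(line)
            if not matched:
                flush()
        else:
            raise ValueError('what must be "previous" or "next"')
    flush()  # emit whatever is still buffered at end of input
    return events


# Indented lines belong to the previous line (stack trace example).
trace = [
    "Exception in thread main",
    "    at Foo.bar(Foo.java:1)",
    "    at Foo.main(Foo.java:5)",
    "next message",
]
trace_events = merge_lines(trace, r"^\s", what="previous")

# Lines NOT starting with a date belong to the previous line.
dated = ["2024-01-01 start", "detail", "2024-01-02 end"]
dated_events = merge_lines(dated, r"^\d{4}-\d{2}-\d{2} ", what="previous", negate=True)

# Lines ending with a backslash are combined with the following line.
continued = ["a \\", "b", "c"]
continued_events = merge_lines(continued, r"\\$", what="next")
```

In this model the stack trace collapses into two events (the exception with its indented frames, then "next message"), and the same function covers the negate and "next" cases by changing only its arguments.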
# with an input plugin:
# you can also use this codec with an output.
input {
file {
codec => multiline {
charset => ... # string, one of ["ASCII-8BIT", "Big5", "Big5-HKSCS", "Big5-UAO", "CP949", "Emacs-Mule", "EUC-JP", "EUC-KR", "EUC-TW", "GB18030", "GBK", "ISO-8859-1", "ISO-8859-2", "ISO-8859-3", "ISO-8859-4", "ISO-8859-5", "ISO-8859-6", "ISO-8859-7", "ISO-8859-8", "ISO-8859-9", "ISO-8859-10", "ISO-8859-11", "ISO-8859-13", "ISO-8859-14", "ISO-8859-15", "ISO-8859-16", "KOI8-R", "KOI8-U", "Shift_JIS", "US-ASCII", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-32BE", "UTF-32LE", "Windows-1251", "GB2312", "IBM437", "IBM737", "IBM775", "CP850", "IBM852", "CP852", "IBM855", "CP855", "IBM857", "IBM860", "IBM861", "IBM862", "IBM863", "IBM864", "IBM865", "IBM866", "IBM869", "Windows-1258", "GB1988", "macCentEuro", "macCroatian", "macCyrillic", "macGreek", "macIceland", "macRoman", "macRomania", "macThai", "macTurkish", "macUkraine", "CP950", "CP951", "stateless-ISO-2022-JP", "eucJP-ms", "CP51932", "GB12345", "ISO-2022-JP", "ISO-2022-JP-2", "CP50220", "CP50221", "Windows-1252", "Windows-1250", "Windows-1256", "Windows-1253", "Windows-1255", "Windows-1254", "TIS-620", "Windows-874", "Windows-1257", "Windows-31J", "MacJapanese", "UTF-7", "UTF8-MAC", "UTF-16", "UTF-32", "UTF8-DoCoMo", "SJIS-DoCoMo", "UTF8-KDDI", "SJIS-KDDI", "ISO-2022-JP-KDDI", "stateless-ISO-2022-JP-KDDI", "UTF8-SoftBank", "SJIS-SoftBank", "BINARY", "CP437", "CP737", "CP775", "IBM850", "CP857", "CP860", "CP861", "CP862", "CP863", "CP864", "CP865", "CP866", "CP869", "CP1258", "Big5-HKSCS:2008", "eucJP", "euc-jp-ms", "eucKR", "eucTW", "EUC-CN", "eucCN", "CP936", "ISO2022-JP", "ISO2022-JP2", "ISO8859-1", "CP1252", "ISO8859-2", "CP1250", "ISO8859-3", "ISO8859-4", "ISO8859-5", "ISO8859-6", "CP1256", "ISO8859-7", "CP1253", "ISO8859-8", "CP1255", "ISO8859-9", "CP1254", "ISO8859-10", "ISO8859-11", "CP874", "ISO8859-13", "CP1257", "ISO8859-14", "ISO8859-15", "ISO8859-16", "CP878", "CP932", "csWindows31J", "SJIS", "PCK", "MacJapan", "ASCII", "ANSI_X3.4-1968", "646", "CP65000", "CP65001", "UTF-8-MAC", "UTF-8-HFS", "UCS-2BE", 
"UCS-4BE", "UCS-4LE", "CP1251", "external", "locale"] (optional), default: "UTF-8"
multiline_tag => ... # string (optional), default: "multiline"
negate => ... # boolean (optional), default: false
pattern => ... # string (required)
patterns_dir => ... # array (optional), default: []
what => ... # string, one of ["previous", "next"] (required)
}
}
}
The charset option sets the character encoding used in this input. Examples include “UTF-8” and “cp1252”.
This setting is useful if your log files are in Latin-1 (ISO-8859-1), cp1252, or any character set other than UTF-8.
This only affects “plain” format logs since JSON is UTF-8 already.
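As an illustration of why the declared character set matters, the following Python sketch decodes the same bytes under different encodings. It does not use Logstash at all; the helper name to_utf8 is hypothetical and only models the idea of converting input from a declared charset to UTF-8.

```python
def to_utf8(raw: bytes, charset: str) -> bytes:
    """Decode raw bytes using the declared charset and re-encode as UTF-8."""
    return raw.decode(charset).encode("utf-8")


# "café" as written by a program that uses cp1252.
raw = "café".encode("cp1252")

# Declaring the right charset yields valid UTF-8.
converted = to_utf8(raw, "cp1252")

# Treating the same bytes as UTF-8 fails, because 0xE9 on its own
# is not a complete UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# cp1252 and Latin-1 agree on most bytes but not all: 0x80 is the euro
# sign in cp1252 and a control character in Latin-1.
euro_cp1252 = b"\x80".decode("cp1252")
euro_latin1 = b"\x80".decode("latin-1")
```

The last two lines show that cp1252 and Latin-1 are close but not identical, so it is worth choosing the one that actually matches the files being read.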
The multiline_tag option tags multiline events with the given tag. The tag is only added to events that actually contain multiple lines.
The negate option negates the regexp pattern (‘if not matched’).
The pattern option is the regular expression to match.
The patterns_dir option names the locations of additional pattern files. Logstash ships with a set of patterns by default, so you only need to set this if you are adding patterns of your own.
Pattern files are plain text with format:
NAME PATTERN
For example:
NUMBER \d+
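The NAME PATTERN layout above can be read with a few lines of code. The Python sketch below is only an illustration of the file format, not how Logstash loads patterns: the function name parse_pattern_file is hypothetical, and skipping blank lines and lines starting with # is an assumption made for the example.

```python
def parse_pattern_file(text: str) -> dict:
    """Parse 'NAME PATTERN' lines into a {name: pattern} dictionary."""
    patterns = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Assumption: blank lines and '#' comment lines carry no pattern.
        if not line or line.startswith("#"):
            continue
        # The name is the first whitespace-delimited token; the rest of
        # the line, which may itself contain spaces, is the pattern.
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # a name without a pattern is ignored in this sketch
        name, pattern = parts
        patterns[name] = pattern
    return patterns


sample = "# custom patterns\nNUMBER \\d+\nTWO_WORDS \\w+ \\w+\n"
patterns = parse_pattern_file(sample)
```

Splitting only on the first run of whitespace keeps patterns that contain spaces intact, as with the hypothetical TWO_WORDS entry.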
The what option answers the question: if the pattern matched, does the line belong to the next or the previous event?