Grokking nested information in Logstash

My log events look like this:

WARN 12 Dec 00:11:12 slow:[Task[Task-Name 20 ms],Task[Task-Name 30 ms]], time = 1234

That is, each event logs a list of slow tasks, which may be nested.

Is there a way to create as many fields as I need with a grok filter? The number of tasks differs from one logged event to the next.

Or do I have to write my own filter, and if so, how would I access these fields?

There is 1 best solution below

I've found that Logstash runs into trouble when parsing less-structured logs like this.

You could extract the tasks substring using grok with a custom capture pattern, then convert it to an array of substrings with mutate's split option:

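filter {
    grok {
        # Capture the comma-separated task list between "slow:[" and "]" as "tasks"
        match => ["message", "slow:\[(?<tasks>(?:Task\[%{USER} %{NUMBER} ms\],?)+)\]"]
    }
    mutate {
        # Turn "tasks" from one string into an array with one element per task
        split => ["tasks", ","]
    }
}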

The trouble after that is that there is no straightforward way to run grok over each element of the resulting array of tasks. Depending on what you need the data for, that might be OK.
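
If a little embedded code is acceptable, one possible way around this is Logstash's ruby filter, placed after the mutate. This is only a sketch: it assumes the event.get/event.set API (Logstash 5.0 and later), that tasks already holds the array produced by the split, and that task names contain no spaces. The output field names task_names and task_times_ms are made up for illustration.

```
filter {
    ruby {
        # Assumption: "tasks" is an array such as
        # ["Task[Task-Name 20 ms]", "Task[Task-Name 30 ms]"]
        code => '
            tasks = event.get("tasks")
            if tasks.is_a?(Array)
                names = []
                times = []
                tasks.each do |task|
                    # Pull the name and the duration out of one "Task[<name> <n> ms]" entry
                    m = task.match(/Task\[(\S+) (\d+(?:\.\d+)?) ms\]/)
                    next unless m
                    names << m[1]
                    times << m[2].to_f
                end
                # Hypothetical field names; rename to suit your index
                event.set("task_names", names)
                event.set("task_times_ms", times)
            end
        '
    }
}
```

For the sample event in the question, this should leave task_names as ["Task-Name", "Task-Name"] and task_times_ms as [20.0, 30.0], regardless of how many tasks the event contains.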

If you actually want to parse the fields out of those tasks, the only thing I can think of is to provide grok with multiple match candidates, like so:

filter {
    grok {
        match => [
            "message", "slow:\[Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\]\]",
            "message", "slow:\[Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\]\]",
            "message", "slow:\[Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\]\]",
            "message", "slow:\[Task\[%{USER:name} %{NUMBER:time} ms\],Task\[%{USER:name} %{NUMBER:time} ms\]\]",
            "message", "slow:\[Task\[%{USER:name} %{NUMBER:time} ms\]\]"
        ]
    }
}

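Because grok's break_on_match option defaults to true, it stops processing at the first pattern that matches. Since the capture names repeat within each pattern, this would give you two parallel arrays, name and time. Note that this approach only covers up to five tasks: an event with more tasks matches none of the patterns and is tagged _grokparsefailure, so you would need one pattern per task count you expect.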