With Logstash, how to combine lines starting with a timestamp

Here is my sample log file, which I need to parse using Logstash:

2016-12-27 07:54:38.621 8407 ERROR oslo_service.service Traceback (most recent call last):
2016-12-27 07:54:38.621 8407 ERROR oslo_service.service   File "/usr/lib/python2.7/dist-packages/oslo_service/service.py", line 680, in run_service
2016-12-27 07:54:38.621 8407 ERROR oslo_service.service     service.start()
2016-12-27 07:54:38.621 8407 ERROR oslo_service.service   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 428, in start
2016-12-27 07:54:38.621 8407 ERROR oslo_service.service     self.binary)
2016-12-27 07:54:38.621 8407 ERROR oslo_service.service   File "/usr/lib/python2.7/dist-packages/oslo_versionedobjects/base.py", line 181, in wrapper

Please suggest how I can parse logs in this format using grok and the multiline filter, and what pattern I should use.

Thank you in advance!

There are 2 best solutions below

What if you try something like this using grok and multiline:

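input {
  file {
    path => [""]   # path to your log files
    start_position => "beginning"
    codec => multiline {
      # Any line that does not start with a timestamp is appended to the previous line
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}
filter {
  grok {
    patterns_dir => "./patterns"   # path to your custom patterns directory
    # The sample lines have a process ID between the timestamp and the level.
    # (?m) lets GREEDYDATA span the newlines of a merged multiline event.
    match => ["message", "(?m)%{TIMESTAMP_ISO8601:timestamp} %{NUMBER:pid} %{LOGLEVEL:level} %{GREEDYDATA:content}"]
  }
}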

The above is just a sample; adapt it as you need.

The multiline codec controls how lines are grouped into events. With negate => true and what => previous, every line that does not match the pattern (here, every line that does not begin with a timestamp) is appended to the previous line. In other words, when Logstash reads a line of input that does not start with a timestamp, such as an indented continuation line, that line is merged with the previously read input.
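One caveat for the sample in the question: every line there, including each traceback line, already starts with a timestamp, so a rule keyed only on the timestamp would leave those lines as separate events. Below is a minimal, untested sketch of one possible alternative. It assumes every line has the layout "timestamp pid level logger message" shown in the sample, and that the logger name contains no spaces. It treats a line as a continuation when the message part after the logger name is indented.

```
input {
  file {
    path => [""]   # left empty as in the answer; set this to your log files
    start_position => "beginning"
    codec => multiline {
      # Assumed layout: "<timestamp> <pid> <level> <logger> <message>".
      # A line whose message starts with extra whitespace (for example
      # "  File ..." inside a traceback) is joined to the previous line.
      pattern => "^%{TIMESTAMP_ISO8601} %{NUMBER} %{LOGLEVEL} \S+ \s"
      what => "previous"
    }
  }
}
```

With this pattern the "Traceback (most recent call last):" line starts a new event and the indented "File ..." and code lines are appended to it. A final, non-indented exception line would still start its own event, so the pattern may need extending for complete tracebacks.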

This SO might be helpful as well. Hope it helps!

There are two ways to implement this:

  • Multiline in the Logstash input

This approach still lets Logstash use multiple threads to parse the log.

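input {
  beats {
    port => 5044
    # Note: newer versions of the beats input are reported to reject the
    # multiline codec; there, multiline is handled in Filebeat instead.
    codec => multiline {
      pattern => "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}[\.,][0-9]{3,7} "
      negate => true
      what => "previous"
    }
    congestion_threshold => 40
  }
}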
  • Multiline in a Logstash filter

This approach cannot use multiple threads, but a conditional can restrict the filter to specific cases.

filter {
  # Only apply multiline to events coming from a specific beat
  if [@metadata][beat] =~ "xxxx" {
    multiline {
      pattern => "^%{TIMESTAMP_ISO8601}"
      negate => true
      what => "previous"
    }
  }
}