Grok pattern to retrieve Processor id from Nifi logs


Can anyone help me find the right Grok pattern to retrieve only the id value from the sample logs below (message part only) from NiFi?

  1. o.a.n.c.s.StandardControllerServiceNode StandardControllerServiceNode[service=DBCPConnectionPool[id=5609ac16-0174-1000-eeee-ffffd19aae44]
  2. o.a.n.c.s.StandardControllerServiceNode Failed to invoke @OnEnabled method of DBCPConnectionPool[id=5609ac16-0174-1000-hhhh-ffffd19aae44]
  3. o.a.n.c.s.StandardControllerServiceNode Failed to invoke @OnEnabled method of DBCPConnectionPool[id=5609ac16-0174-1000-gggg-ffffd19aae44]

I tried the pattern below, but it captures the whole message and does not extract the id value separately.

  • %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:severity} [%{DATA:thread}] %{DATA:class} %{GREEDYDATA:message}

There are 2 best solutions below


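You can use something like the filter below, which captures the UUID from the log line. Note that the stock UUID pattern accepts hexadecimal digits only, so it matches the first sample id but not ids containing letters such as hhhh or gggg.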

filter {
  grok {
    # Parse the standard NiFi log prefix, then capture the UUID inside "[id=...]".
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp}%{SPACE}%{LOGLEVEL:severity}%{SPACE}\[%{DATA:thread}\]%{SPACE}%{DATA:class}%{SPACE}%{GREEDYDATA}\[id=%{UUID:id}\]%{GREEDYDATA}" }
  }
}

It should produce output like the following:

{
        "thread" => "Curator-Framework-0",
            "id" => "5609ac16-0174-1000-eeee-ffffd19aae44",
    "@timestamp" => 2020-10-06T11:24:24.703Z,
          "path" => "/usr/share/logstash/stack/data/data.log",
     "timestamp" => "2016-08-04 13:26:35,475",
      "severity" => "DEBUG",
      "@version" => "1",
          "host" => "356136d6f0b4",
       "message" => "2016-08-04 13:26:35,475 DEBUG [Curator-Framework-0] o.a.n.c.s.StandardControllerServiceNode StandardControllerServiceNode[service=DBCPConnectionPool[id=5609ac16-0174-1000-eeee-ffffd19aae44]"
}

given this sample log entry:

2016-08-04 13:26:35,475 DEBUG [Curator-Framework-0] o.a.n.c.s.StandardControllerServiceNode StandardControllerServiceNode[service=DBCPConnectionPool[id=5609ac16-0174-1000-eeee-ffffd19aae44]

If you only want to extract the id, you can use the following pattern:

\[id=%{DATA:id}\]

This matches the literal [id= and then captures everything up to the next ].
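
As a minimal sketch, the short pattern can be used in a grok filter on its own. Grok expressions are not anchored unless you add ^ or $, so the rest of the line does not have to be described. This assumes the raw log line is in the message field, as in the question.

```conf
filter {
  grok {
    # Unanchored: finds the first "[id=...]" anywhere in the message
    # and stores the text between "id=" and the next "]" in the id field.
    match => { "message" => "\[id=%{DATA:id}\]" }
  }
}
```

Because DATA is a non-greedy match of any characters, this captures all three sample ids from the question, including the ones that are not valid hexadecimal UUIDs.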

You could make this more robust by defining a custom pattern that matches the ID format more precisely, rather than using DATA.
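
One way to sketch such a custom pattern is with the grok filter's pattern_definitions option. The name NIFI_ID is hypothetical, and the character class is an assumption: it allows any letter or digit in the 8-4-4-4-12 layout because the sample ids in the question contain non-hex letters (hhhh, gggg). If your real ids are always hexadecimal, the stock %{UUID:id} is probably enough.

```conf
filter {
  grok {
    # NIFI_ID is an illustrative name, not a pattern shipped with Logstash.
    # Layout: 8-4-4-4-12 alphanumeric characters separated by hyphens.
    pattern_definitions => {
      "NIFI_ID" => "[0-9A-Za-z]{8}-(?:[0-9A-Za-z]{4}-){3}[0-9A-Za-z]{12}"
    }
    match => { "message" => "\[id=%{NIFI_ID:id}\]" }
  }
}
```

Compared with DATA, this rejects malformed values inside the brackets, so events with an unexpected id format are tagged _grokparsefailure instead of silently producing a bad id field.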

I use the following resources for Grok patterns:

  1. https://github.com/logstash-plugins/logstash-patterns-core/blob/master/patterns/grok-patterns
  2. https://grokconstructor.appspot.com/do/match