How to take the output of one query function and provide it as input to a second one in Humio


I'm trying to build a query in Humio like the one below:

regex(regex=".*MY NAME IS (?<name>.*)", field=MESSAGE) | MESSAGE=${name}

Example of my server logs:


  • MY NAME IS John
  • John logged in on Monday
  • MY NAME IS SID
  • SID logged in on Tuesday
  • SID logged out
  • LOHI logged in on Wednesday
  • LOHI logged out

The first part of the query is a regex function that should retrieve all records in the MESSAGE column that start with MY NAME IS and capture the name. I then want to pass that name to the second statement, which searches the MESSAGE column for it.

So, for the server log example above, I need a query that returns the rows below in Humio:


  • MY NAME IS John
  • John logged in on Monday
  • MY NAME IS SID
  • SID logged in on Tuesday
  • SID logged out

It should not return the rows below, as there is no MY NAME IS log statement for LOHI:

  • LOHI logged in on Wednesday
  • LOHI logged out

There is 1 best solution below


Okay, you're looking for something like this then:

regex("^MY NAME IS (?<userNameOfInterest>.*)", field=MESSAGE)
| join({ /* MESSAGE contains userNameOfInterest */ })

This takes a bit of work, since Humio can't check if a string contains a dynamic substring (at the moment). If the MESSAGE field has a limited number of permutations that you know up front, you can do something like this:

regex("^MY NAME IS (?<userNameOfInterest>.*)", field=MESSAGE)
| join({
    MESSAGE match {
      "* logged in on *" => regex("(?<userName>\S*) logged.*", field=MESSAGE);
      "* logged out" => regex("(?<userName>\S*) logged.*", field=MESSAGE);
      "MY NAME IS *" => regex("^MY NAME IS (?<userName>.*)$", field=MESSAGE);
    }
  }, field=userNameOfInterest, key=userName)

The join lets you combine two sets of data. In this case, the first regex finds all applicable user names, and the subquery extracts the user names from the other log events, so you can get the events that have a match.
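To make the matching logic concrete outside of Humio, here is a rough Python sketch of what the join is meant to achieve on the sample logs from the question. It is only an illustration of the two-pass idea, not how Humio executes the query; the helper `extract_user_name` and the variable names are made up for this example.

```python
import re

# Sample MESSAGE values taken from the question.
messages = [
    "MY NAME IS John",
    "John logged in on Monday",
    "MY NAME IS SID",
    "SID logged in on Tuesday",
    "SID logged out",
    "LOHI logged in on Wednesday",
    "LOHI logged out",
]


def extract_user_name(message):
    """Hypothetical helper mirroring the match branches of the subquery."""
    m = re.match(r"^MY NAME IS (?P<userName>.*)$", message)
    if m:
        return m.group("userName")
    m = re.match(r"(?P<userName>\S*) logged.*", message)
    if m:
        return m.group("userName")
    return None


# First pass: collect the names announced by a "MY NAME IS" event.
names_of_interest = set()
for message in messages:
    m = re.match(r"^MY NAME IS (.*)", message)
    if m:
        names_of_interest.add(m.group(1))

# Second pass: keep events whose extracted user name is one of those names.
matched = [msg for msg in messages if extract_user_name(msg) in names_of_interest]

for msg in matched:
    print(msg)
```

The LOHI events are dropped because LOHI never shows up in the first pass.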

If you go for this approach, you can also consider doing the match inside a parser, so that there is always a userName field ready to work with.


Alternatively, if you control the logs yourself, maybe you can tweak the messages by adding a user name field at the time they are sent?

  • "MY NAME IS John. userName=John"
  • "John logged in on Monday. userName=John"
  • "MY NAME IS SID. userName=SID"
  • "SID logged in on Tuesday. userName=SID"
  • "SID logged out. userName=SID"
  • "LOHI logged in on Wednesday. userName=LOHI"
  • "LOHI logged out. userName=LOHI"

Then the query becomes much simpler, as the join subquery only needs to look for log events that have a userName value it can join on:

regex("^MY NAME IS (?<userNameOfInterest>[^.]*)", field=MESSAGE)
| join({ userName=* }, field=userNameOfInterest, key=userName)
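As a rough illustration of why the extra field helps, the same filtering can be sketched in Python against the tweaked messages. This assumes the `userName=...` pair is always the last token of the message; `user_name_field` is a hypothetical helper for this example, not something Humio provides.

```python
import re

# Tweaked sample messages from the list above.
messages = [
    "MY NAME IS John. userName=John",
    "John logged in on Monday. userName=John",
    "MY NAME IS SID. userName=SID",
    "SID logged in on Tuesday. userName=SID",
    "SID logged out. userName=SID",
    "LOHI logged in on Wednesday. userName=LOHI",
    "LOHI logged out. userName=LOHI",
]


def user_name_field(message):
    """Return the value of a trailing userName=... pair, or None if absent."""
    m = re.search(r"userName=(\S+)$", message)
    return m.group(1) if m else None


# Names announced by a "MY NAME IS" event, read from the userName field.
names_of_interest = {
    user_name_field(msg) for msg in messages if msg.startswith("MY NAME IS ")
}

# Every event carries userName, so no per-format regex is needed here.
matched = [msg for msg in messages if user_name_field(msg) in names_of_interest]
```

With a field on every event there is a single extraction rule instead of one per message format.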

I hope some of this is useful :)