Kibana query ignores hyphen


I need help on search query!

I have 'n' messages like the one below in Kibana, and I want to match only the string "arnold-123-20" in the message field. However, the hyphen (-) is ignored in my search, and the number 20 in the timestamp also gets matched, which is wrong; I need to exclude that.

message:Oct 17 01:26:20 arnold-123-20.us.com arnold: [INFO]- Successful

Search Query in kibana UI:

message:"arnold" AND message:"123-20" AND message:'Successfully'

There are 1 best solutions below

Alcanzar On

The standard Elasticsearch tokenizer breaks text on word boundaries, and the - characters are treated as word boundaries. So internally ES stores message as [oct,17,01,26,20(2),arnold(2),123,us,com,info,successful] (basically a vector of terms plus frequencies, ignoring the order of terms). That is why a search for 20 also matches the 20 in the timestamp.
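To see why the hyphenated string never exists as a single term, here is a rough Python approximation of that behaviour. It is only a sketch: it lowercases the text and splits on every non-alphanumeric character, which reproduces the term list above but is not the real Unicode-based standard tokenizer. The function name is made up for illustration.

```python
import re
from collections import Counter


def approximate_standard_tokens(text):
    """Lowercase the text and split on every non-alphanumeric character.

    Assumption: this is a simplification for illustration, not the
    actual Elasticsearch standard tokenizer.
    """
    return re.findall(r"[a-z0-9]+", text.lower())


message = "Oct 17 01:26:20 arnold-123-20.us.com arnold: [INFO]- Successful"

# Term -> frequency, with the order of terms thrown away.
term_counts = Counter(approximate_standard_tokens(message))

print(term_counts["20"])               # once from the timestamp, once from the host
print("arnold-123-20" in term_counts)  # the hyphenated host is not a term
```

With terms like these, message:"123-20" is itself broken into 123 and 20, so the hyphen plays no part in the match.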

You'll have to create a custom tokenizer that recognizes the tokens in your data and re-index your data using that. Then your search might work.
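As a rough sketch of what such a setup could look like, the index settings below define a pattern tokenizer that splits only on whitespace and dots, so hyphens stay inside a token. The names host_safe_tokenizer and host_safe_analyzer are invented for this example, and the split pattern is just one possible choice. You would still have to assign the analyzer to the message field in your mapping (the syntax depends on your ES version) and re-index. The small Python function only imitates the split locally so you can check the idea without a cluster.

```python
import json
import re

# Assumption: split on runs of whitespace and dots only, keeping hyphens.
HOST_SAFE_PATTERN = r"[\s.]+"

# Body for creating the index; tokenizer and analyzer names are hypothetical.
index_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "host_safe_tokenizer": {
                    "type": "pattern",
                    "pattern": HOST_SAFE_PATTERN,
                }
            },
            "analyzer": {
                "host_safe_analyzer": {
                    "type": "custom",
                    "tokenizer": "host_safe_tokenizer",
                    "filter": ["lowercase"],
                }
            },
        }
    }
}


def simulate_tokens(text):
    """Local imitation of the analyzer: lowercase, then split on the pattern."""
    return [tok for tok in re.split(HOST_SAFE_PATTERN, text.lower()) if tok]


message = "Oct 17 01:26:20 arnold-123-20.us.com arnold: [INFO]- Successful"
tokens = simulate_tokens(message)

print(json.dumps(index_settings, indent=2))
print(tokens)
```

In this simulation arnold-123-20 survives as one token and the timestamp stays together as 01:26:20, so a bare 20 no longer matches. The price is that other tokens keep their punctuation (for example arnold: and [info]-), which is one reason the field-based approach may be easier to live with.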

A better solution is to use logstash to parse certain types of messages and store the data in different fields. For example, you could store arnold-123-20 as hostPart, us.com as hostDomain, and arnold-123-20.us.com as fullHost. You'd also need to add an index template that marks those fields as not_analyzed, so that each value is indexed as a single exact term.
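To illustrate the field split, here is a Python sketch of the kind of parsing a logstash filter would do for this message format. It is not logstash configuration: the regular expression, the function name, and the extra field names (timestamp, program, level, text) are assumptions based on the single sample line in the question, so adjust them to your real logs.

```python
import re

# Assumed layout: "<Mon> <day> <hh:mm:ss> <host>.<domain> <program>: [<LEVEL>]- <text>"
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<fullHost>(?P<hostPart>[^.\s]+)\.(?P<hostDomain>\S+)) "
    r"(?P<program>[^:]+): "
    r"\[(?P<level>\w+)\]- "
    r"(?P<text>.*)$"
)


def parse_line(line):
    """Return a dict of fields for a matching line, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    return match.groupdict() if match else None


message = "Oct 17 01:26:20 arnold-123-20.us.com arnold: [INFO]- Successful"
fields = parse_line(message)

for name, value in fields.items():
    print(f"{name}: {value}")
```

Once hostPart is stored as its own not_analyzed field, a query such as hostPart:"arnold-123-20" compares against the whole value, so neither the hyphens nor the 20 in the timestamp get in the way.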