Load multiple increasing JSON files with the ELK stack


I crawled a lot of JSON files into a data folder, all named by timestamp (./data/2021-04-05-12-00.json, ./data/2021-04-05-12-30.json, ./data/2021-04-05-13-00.json, ...).

Now I'm trying to use the ELK stack to load these continuously growing JSON files.

Each JSON file is pretty-printed like this:

{
    "datetime": "2021-04-05 12:00:00",
    "length": 3,
    "data": [
        {
            "id": 97816,
            "num_list": [1, 2, 3],
            "meta_data": "{'abc', 'cde'}",
            "short_text": "This is data 97816"
        },
        {
            "id": 97817,
            "num_list": [4, 5, 6],
            "meta_data": "{'abc'}",
            "short_text": "This is data 97817"
        },
        {
            "id": 97818,
            "num_list": [],
            "meta_data": "{'abc', 'efg'}",
            "short_text": "This is data 97818"
        }
    ]
}

I tried using the Logstash multiline codec to read the JSON files, but it seems to handle each whole file as a single event. Is there any way to turn each record in the JSON data field into its own event?

Also, what is the best practice for loading multiple continuously growing pretty-printed JSON files into ELK?

1 Answer

Using multiline is correct if you want to handle each file as a single input event.
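For reference, a file input with a multiline codec that accumulates every line of a file into one event could look like this (a sketch; the path glob and the auto_flush_interval value are assumptions, adjust them to your setup):

input {
  file {
    path => "/path/to/data/*.json"
    start_position => "beginning"
    codec => multiline {
      # Only a line starting with "{" in column 0 begins a new event;
      # every other line is appended to the previous one, so each
      # pretty-printed file accumulates into a single event.
      pattern => "^\{"
      negate => true
      what => "previous"
      # Flush the last, still-open event after 2 seconds of inactivity.
      auto_flush_interval => 2
    }
  }
}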

Then you need to leverage the split filter in order to create one event for each element in the data array:

filter {
  split {
    field => "data"
  }
}
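Note that the multiline codec delivers the file content as a plain string in the message field, so you will likely need to parse it with the json filter before splitting (a sketch, assuming the default message field):

filter {
  json {
    source => "message"
  }
  split {
    field => "data"
  }
}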

So Logstash reads each file as a whole and passes its content to the filter layer as a single event; the split filter shown above then spawns one new event for each element of the data array.
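For example, after the split, the first record would come out as an event roughly like this (the top-level fields datetime and length are copied onto every split event):

{
    "datetime": "2021-04-05 12:00:00",
    "length": 3,
    "data": {
        "id": 97816,
        "num_list": [1, 2, 3],
        "meta_data": "{'abc', 'cde'}",
        "short_text": "This is data 97816"
    }
}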