I crawled a lot of JSON files into a data folder, all named by timestamp (./data/2021-04-05-12-00.json, ./data/2021-04-05-12-30.json, ./data/2021-04-05-13-00.json, ...).
Now I'm trying to use the ELK stack to load this growing set of JSON files.
Each JSON file is pretty-printed like:
{
  "datetime": "2021-04-05 12:00:00",
  "length": 3,
  "data": [
    {
      "id": 97816,
      "num_list": [1, 2, 3],
      "meta_data": "{'abc', 'cde'}",
      "short_text": "This is data 97816"
    },
    {
      "id": 97817,
      "num_list": [4, 5, 6],
      "meta_data": "{'abc'}",
      "short_text": "This is data 97817"
    },
    {
      "id": 97818,
      "num_list": [],
      "meta_data": "{'abc', 'efg'}",
      "short_text": "This is data 97818"
    }
  ]
}
I tried using the logstash multiline codec to parse the JSON files, but it seems to treat each whole file as a single event. Is there any way to emit each record in the JSON data field as its own event?
Also, what's the best practice for loading multiple growing, pretty-printed JSON files into ELK?
Using multiline is correct if you want to handle each file as one input event. Then you need to leverage the split filter in order to create one event for each element in the data array. So Logstash reads one file as a whole, passes its content as a single event to the filter layer, and the split filter then spawns one new event for each element in the data array.
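A minimal pipeline sketch, assuming your files live under ./data and you index into Elasticsearch on localhost (path, sincedb location, and index name are placeholders to adapt). The multiline codec uses a pattern that never matches real lines, so every line of a file is appended to the previous ones and flushed as one event; the json filter then parses that event, and split fans out the data array:

```conf
input {
  file {
    path => "/path/to/data/*.json"      # adjust to your data folder
    mode => "read"                      # read each file once, head to tail
    sincedb_path => "/path/to/sincedb"  # remembers which files were read
    codec => multiline {
      # Pattern that matches no line, so the whole file is one event
      pattern => "^NEVER_MATCHES"
      negate => true
      what => "previous"
      auto_flush_interval => 2          # flush the buffered event after 2s
      max_lines => 100000               # allow large pretty-printed files
    }
  }
}

filter {
  # Parse the whole-file event into fields
  json {
    source => "message"
  }
  # One new event per element of the data array
  split {
    field => "data"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "mydata"                   # placeholder index name
  }
}
```

Because the file input keeps a sincedb, new timestamped files dropped into the folder are picked up automatically without re-reading the old ones.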