I am learning Python and have very limited programming knowledge. As a learning project, I have a .txt system log that I am trying to convert to JSON.
I want the Python program to parse the .txt file, making each event an object and splitting that event's entries into key:value pairs, so that later on I can query and summarise the alerts in the log. Eventually I want the program to accept user input to query the JSON (but that is for another day).
My current script looks like this:
import re
import json
import os

def parse_log_file(input_file):
    events = []
    with open(input_file, 'r') as file:
        log_content = file.read()
    # extract individual events
    event_pattern = re.compile(r'Event \d+\s+(.*?)\s+(?=(?:Event \d+|$))', re.DOTALL)
    matches = event_pattern.findall(log_content)
    for match in matches:
        event_dict = {}
        lines = match.split('\n')
        for line in lines:
            if line.strip():
                key, value = map(str.strip, line.split(':', 1))
                event_dict[key] = value
        events.append(event_dict)
    # Write the JSON output with the same name as the input file
    output_file = os.path.splitext(input_file)[0] + ".json"
    with open(output_file, 'w') as json_file:
        json.dump(events, json_file, indent=4)
    print(f"JSON file saved as,{output_file}")

if __name__ == "__main__":
    input_file = "log.txt"
    parse_log_file(input_file)
Desired output:
Event 1
{
    "LogName" : "System",
    "MachineName" : "LAPTOP",
    "ProviderName" : "Intel",
    "LevelDisplayName" : "Information",
    "Message" : "Check the remaining resource budget. Module exceeds resource budget, failed to AllocateFwCps,
STATUS = Insufficient system resources exist to complete the API.."
},
Event 2
{
    "LogName" : "System",
    "MachineName" : "LAPTOP",
    "ProviderName" : "Microsoft-Windows-Kernel-Power",
    "LevelDisplayName" : "Information",
    "Message" : "The system session has transitioned from 186 to 188. Reason InputPoUserPresent
BootId: 67"
}
However, my output currently looks like this:
LogName : "System
MachineName : LAPTOP
ProviderName : Microsoft-Windows-Kernel-Power
LevelDisplayName : Information
Message : The system session has transitioned from 186 to 188. Reason InputPoUserPresent
BootId: 67"
Where am I going wrong? Ideally, I would like each element of the alert (LogName, MachineName, etc.) to be a key and the associated information to be its value.
Use print statements at different stages to troubleshoot this. First, print 'matches' (the result of event_pattern.findall); if it is what you want, move on. Then print 'lines' and make sure it is what you want.
Use that methodology and you'll get it.
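As a rough sketch of that stage-by-stage approach, something like the following could work. Note that SAMPLE_LOG is a made-up stand-in (the real log.txt is not shown in the question, so its layout is an assumption), and debug_parse is a hypothetical helper name; only the regular expression is taken from the script in the question.

```python
import re

# Hypothetical sample: the real log.txt is not shown in the question,
# so this layout is an assumption made purely for illustration.
SAMPLE_LOG = """Event 1
LogName : System
MachineName : LAPTOP
Message : Check the remaining resource budget.
STATUS = Insufficient system resources

Event 2
LogName : System
MachineName : LAPTOP
Message : The system session has transitioned from 186 to 188.
BootId: 67
"""

# Same pattern as in the question.
EVENT_PATTERN = re.compile(r'Event \d+\s+(.*?)\s+(?=(?:Event \d+|$))', re.DOTALL)


def debug_parse(log_content):
    """Parse like the original script, printing each intermediate stage."""
    # Stage 1: did the regex split the log into the expected event blocks?
    matches = EVENT_PATTERN.findall(log_content)
    print(f"stage 1: found {len(matches)} event block(s)")

    events = []
    for number, match in enumerate(matches, start=1):
        # Stage 2: what do the non-empty lines of this event look like?
        lines = [line for line in match.split('\n') if line.strip()]
        print(f"stage 2: event {number} has lines {lines!r}")

        event_dict = {}
        for line in lines:
            if ':' not in line:
                # Stage 3: unpacking "key, value" from a line without ':'
                # would raise ValueError, so report the line and skip it.
                print(f"stage 3: no ':' in {line!r}, skipping")
                continue
            key, value = map(str.strip, line.split(':', 1))
            event_dict[key] = value
        events.append(event_dict)
    return events


if __name__ == "__main__":
    for event in debug_parse(SAMPLE_LOG):
        print(event)
```

With this sample, the prints show two things worth checking against the real log: a continuation line such as "STATUS = ..." has no colon at all, and "BootId: 67" ends up as its own key rather than staying part of Message. Whether the actual file behaves the same way depends on its real layout.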