I need to parse the following string in Python into a dictionary:
2014:02:02-12:24:17 NAMETEST ulogd[4834]: id="xxxx" severity="xxxx" sys="xxxx" sub="xxxx" name="xxxx aaaa" action="xxxx" fwrule="xxxx" outitf="xxxx" srcmac="xxxx" srcip="xxxx" dstip="xxxx" proto="x" length="xxxx" tos="xxxx" prec="xxxx" ttl="xx" srcport="xxxx" dstport="xxxx" tcpflags="xxxx"
I cannot use split(' ') to split on spaces because some fields, for example name="xxxx aaaa", can contain a space.
So far I have extracted only the values, using the following regex:
re.findall('"([^"]*)"', line)
But now I need the result in dictionary form, e.g. line['id'] = 1111.
What regex should I use? Any ideas?
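You can use re.findall() to find the key-value pairs. The pattern (\w+)="(.*?)" matches one or more word characters (the \w+ part: letters, digits, or underscores), followed by an equals sign and an opening double quote, followed by any characters (.*?, non-greedy), followed by a closing double quote. The parentheses define capturing groups, so each match yields the key and the value separately.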
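A minimal sketch of how this could look in practice, assuming a shortened version of the sample line from the question. The variable names pairs and fields are illustrative only. Because the pattern has two capturing groups, re.findall() returns a list of (key, value) tuples, which dict() accepts directly:

```python
import re

# Shortened sample line from the question (fewer fields, same shape).
line = ('2014:02:02-12:24:17 NAMETEST ulogd[4834]: id="xxxx" severity="xxxx" '
        'name="xxxx aaaa" action="xxxx" srcip="xxxx" dstport="xxxx"')

# Two capturing groups, so each match is a (key, value) tuple.
pairs = re.findall(r'(\w+)="(.*?)"', line)

# dict() builds a dictionary from an iterable of 2-tuples.
fields = dict(pairs)

print(fields['id'])
print(fields['name'])  # the embedded space is preserved
```

Note that all values come back as strings, so a numeric field would need an explicit int() conversion. If a key appeared twice in the line, the last occurrence would win.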