I have a situation where a user can enter commands with optional key/value pairs, and a value may contain spaces.
Here are four different forms of user input, where key and value are separated by an = sign and values contain spaces:
"cmd=create-folder name=SelfServe - Test ride"
"cmd=create-folder name=SelfServe - Test ride server=prd"
"cmd=create-folder name=cert - Test ride server=dev site=Service"
"cmd=create-folder name=cert - Test ride server=dev site=Service permission=locked"
Requirement: I am trying to parse this string and split it into a dictionary based on the keys and values present in the string.
If the user enters the first form of the statement, that would produce a dictionary like:
query_dict = {
'cmd' : 'create-folder',
'name' : 'SelfServe - Test ride'
}
If the user enters the second form of the statement, that would add the additional key/value pair:
query_dict = {
'cmd' : 'create-folder',
'name' : 'SelfServe - Test ride',
'server' : 'prd'
}
If the user enters the third form of the statement, that would produce:
query_dict = {
'cmd' : 'create-folder',
'name' : 'cert - Test ride',
'server' : 'dev',
'site': 'Service'
}
The fourth form would produce a dictionary with the key/value pairs split like below:
query_dict = {
'cmd' : 'create-folder',
'name' : 'cert - Test ride',
'server' : 'dev',
'site': 'Service',
'permission' : 'locked' }
The idea is to parse a string where key and value are separated by the = symbol, where the values can contain one or more spaces, and extract the matching key/value pairs.
I tried multiple methods but was unable to figure out a single generic regular expression pattern that can match/extract any string with this kind of pattern.
I would appreciate your help.
I tried several patterns based on the different possible user inputs, but that is not a scalable approach. Example:
I created three patterns to match three varieties of user input, but it would be nice to have one generic pattern that can match any combination of key=value pairs in a string (I am hard-coding the keys in the patterns, which is not ideal):
'(cmd=create-folder).*(name=.*).*' ,
'(cmd=create-pfolder).*(name=.*).*(server=.*).*',
'(cmd=create-pfolder).*(name=.*).*(server=.*).*(site=.*)'
I would suggest using `split`, and then `zip` to feed the `dict` constructor.
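
A minimal sketch of that suggestion. The pattern `\s*(\w+)=` (optional whitespace, a captured key made of word characters, then `=`) and the function name `parse_query` are assumptions chosen for illustration. It also assumes that a value never contains a word immediately followed by `=`, since that would be read as a new key.

```python
import re


def parse_query(s):
    # Split on every "key=" marker. Because the key is inside a capture
    # group, re.split keeps it in the result list.
    parts = re.split(r"\s*(\w+)=", s)
    # parts[0] is whatever precedes the first key (empty for these inputs).
    # Keys sit at odd indices, their values at the following even indices.
    return dict(zip(parts[1::2], parts[2::2]))


inputs = [
    "cmd=create-folder name=SelfServe - Test ride",
    "cmd=create-folder name=SelfServe - Test ride server=prd",
    "cmd=create-folder name=cert - Test ride server=dev site=Service",
    "cmd=create-folder name=cert - Test ride server=dev site=Service permission=locked",
]

for text in inputs:
    print(parse_query(text))

# {'cmd': 'create-folder', 'name': 'SelfServe - Test ride'}
# {'cmd': 'create-folder', 'name': 'SelfServe - Test ride', 'server': 'prd'}
# {'cmd': 'create-folder', 'name': 'cert - Test ride', 'server': 'dev', 'site': 'Service'}
# {'cmd': 'create-folder', 'name': 'cert - Test ride', 'server': 'dev', 'site': 'Service', 'permission': 'locked'}
```

No key is hard-coded in the pattern, so the same call handles any number of `key=value` pairs. Keys containing characters outside `\w` (a hyphen, for instance) would need a wider character class.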
Explanation
Consider one input string as an example.
The `split` regex identifies the parts of the input where a key is followed by `=`. The strings that are not matched by it end up as results.
The first of those strings is empty, because it is what precedes the first match.
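
To make the matched and unmatched parts concrete, here is a small trace. It assumes the second input from the question and the pattern `\s*\w+=`, written here without a capture group. Both are illustrative choices.

```python
import re

s = "cmd=create-folder name=SelfServe - Test ride server=prd"

# The pieces the pattern matches: each key with its "=" and leading whitespace.
matched = re.findall(r"\s*\w+=", s)
print(matched)    # ['cmd=', ' name=', ' server=']

# The pieces between the matches, which is what re.split returns.
unmatched = re.split(r"\s*\w+=", s)
print(unmatched)  # ['', 'create-folder', 'SelfServe - Test ride', 'prd']
```

The leading `''` is the text before `cmd=`, which is empty because the input starts with a key.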
Now, as the regex has a capture group, the string that is captured by that group is also returned in the result list, at odd indices. So `parts` ends up alternating between the captured keys and the values that follow them, after the leading empty string.
The keys we are interested in occur at odd indices. We can get those with `parts[1::2]`, where `1` is the starting index and `2` is the step.
The corresponding values for those keys occur at even indices, ignoring the empty string at index 0. So we get those with `parts[2::2]`. With the call to `zip`, we pair those keys and values together as we want them.
Finally, the `dict` constructor can take an argument with key/value pairs, which is exactly what that `zip` call provides.
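
The slicing and pairing steps can be traced one at a time. This sketch again assumes the second input from the question and the pattern `\s*(\w+)=`; the variable names are only for illustration.

```python
import re

s = "cmd=create-folder name=SelfServe - Test ride server=prd"

# With a capture group, the captured keys are interleaved with the values.
parts = re.split(r"\s*(\w+)=", s)
print(parts)
# ['', 'cmd', 'create-folder', 'name', 'SelfServe - Test ride', 'server', 'prd']

keys = parts[1::2]    # start at index 1, take every second item
values = parts[2::2]  # start at index 2, which skips the empty string at index 0
print(keys)           # ['cmd', 'name', 'server']
print(values)         # ['create-folder', 'SelfServe - Test ride', 'prd']

# zip pairs each key with its value; dict accepts that iterable of pairs.
query_dict = dict(zip(keys, values))
print(query_dict)
# {'cmd': 'create-folder', 'name': 'SelfServe - Test ride', 'server': 'prd'}
```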