I have a simple grammar for evaluating logical expressions.
The "single" conditions are of the type
keyword = value
keyword != value
Then, I want to allow logical combinations grouped with parentheses, e.g.
cond1 & cond2 & cond3
cond1 & ( cond2 | cond3 )
(cond1 & cond2) | cond3
cond1 | cond2 | cond3
but not logical combinations where operator precedence or left/right associativity would matter and could confuse users. So the following would be disallowed:
cond1 & cond2 | cond3
If I understand it right, this latter requirement stops me from using operatorPrecedence(), and I would anyway like to understand it on a lower level.
The following grammar takes me almost where I want:
from pyparsing import Word, alphas, nums, oneOf, Literal, Group, Suppress, \
    Forward, ZeroOrMore, ParseException, StringEnd
comparison_op_list = ['=', '!=']
logical_op_list = ['&', '|']
# Pyparsing expression for single conditions
keyword = Word(alphas + nums + '_.:-')
comparison_op = oneOf(comparison_op_list)
value = Word(alphas + nums + '_.:-;')
single_condition = keyword + comparison_op + value
# Pyparsing expression for combined condition
lpar = Literal( '(' )
rpar = Literal( ')' )
logical_op = oneOf(logical_op_list)
combined_expr = Forward()
atom = single_condition | ( Suppress(lpar) + combined_expr + Suppress(rpar) )
combined_expr << Group(atom) + ZeroOrMore( logical_op + combined_expr )
# Examples
test_strings = [
    'keyword = value',
    'keyword1 = value1 & keyword2 = value2',
    'a=a & (b=b|c=c) & (d=d & (e=e|f=f))',
    'a=a & ((b=b|(c=c))) & (((d=d) & (e=e|f=f)))',
    'test1=A & test2=B | test3 = C'  # Parses fine, but later rejected in Python code
]
for s in test_strings:
    # Blank line, then the input string, then the parse result
    print('\n' + s)
    print(combined_expr.parseString(s))
Output:
keyword = value
[['keyword', '=', 'value']]
keyword1 = value1 & keyword2 = value2
[['keyword1', '=', 'value1'], '&', ['keyword2', '=', 'value2']]
a=a & (b=b|c=c) & (d=d & (e=e|f=f))
[['a', '=', 'a'], '&', [['b', '=', 'b'], '|', ['c', '=', 'c']], '&', [['d', '=', 'd'], '&', [['e', '=', 'e'], '|', ['f', '=', 'f']]]]
a=a & ((b=b|(c=c))) & (((d=d) & (e=e|f=f)))
[['a', '=', 'a'], '&', [[['b', '=', 'b'], '|', [['c', '=', 'c']]]], '&', [[[['d', '=', 'd']], '&', [['e', '=', 'e'], '|', ['f', '=', 'f']]]]]
test1=A & test2=B | test3 = C
[['test1', '=', 'A'], '&', ['test2', '=', 'B'], '|', ['test3', '=', 'C']]
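The "later rejected in Python code" step for the last example could look roughly like the sketch below. It is only an illustration: the function name mixes_operators is made up here, and it assumes the parse result has first been converted to plain nested lists (for instance with the asList() method of the result). Because the grammar groups only atoms, each pair of parentheses becomes one nested list, so checking every nesting level for a mix of '&' and '|' should be enough.

```python
def mixes_operators(parsed):
    """Return True if any nesting level contains both '&' and '|'.

    `parsed` is assumed to be a plain nested list shaped like the
    outputs shown above (hypothetical helper, not part of pyparsing).
    """
    ops = set()
    for item in parsed:
        if isinstance(item, list):
            # Recurse into parenthesized groups and single conditions
            if mixes_operators(item):
                return True
        elif item in ('&', '|'):
            ops.add(item)
    return len(ops) > 1


mixed = [['test1', '=', 'A'], '&', ['test2', '=', 'B'], '|', ['test3', '=', 'C']]
grouped = [['a', '=', 'a'], '&', [['b', '=', 'b'], '|', ['c', '=', 'c']]]
print(mixes_operators(mixed))
print(mixes_operators(grouped))
```

Single conditions such as ['a', '=', 'a'] pass through harmlessly, since the value and keyword character sets contain neither '&' nor '|'.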
From this I can handle valid input in regular Python. The problem is that some invalid input is also accepted without error:
invalid_strings = [
    'a b = c',  # Rightfully rejected
    'word1 = word2 != word3',  # Read as 'word1 = word2'
    'keyword = value_with_*illegal*_characters'  # Read as 'keyword = value_with_'
]
for s in invalid_strings:
    try:
        result = combined_expr.parseString(s)
    except ParseException:
        result = None
    # Blank line, then the input string, then the result (None if rejected)
    print('\n' + s)
    print(result)
Output:
a b = c
None
word1 = word2 != word3
[['word1', '=', 'word2']]
keyword = value_with_*illegal*_characters
[['keyword', '=', 'value_with_']]
For the above examples without logical combinations, I could probably use StringEnd to require that the string ends immediately after the expression. But I am not sure that would work when conditions are combined with logical operators. Is there a way to require that every part of an input string be recognized as part of the grammar, or is this outside pyparsing's scope and rather a task for e.g. PLY?
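For reference, the StringEnd idea could be tried along the following lines. This is a sketch under assumptions, not a confirmed solution: it repeats the grammar from above and appends StringEnd() once, to a separate top-level expression (the names full_expr and parse_strict are made up for this example), so that the recursive use of combined_expr inside parentheses is left unchanged.

```python
from pyparsing import Word, alphas, nums, oneOf, Literal, Group, Suppress, \
    Forward, ZeroOrMore, ParseException, StringEnd

# Same grammar as in the question
keyword = Word(alphas + nums + '_.:-')
comparison_op = oneOf(['=', '!='])
value = Word(alphas + nums + '_.:-;')
single_condition = keyword + comparison_op + value

lpar = Literal('(')
rpar = Literal(')')
logical_op = oneOf(['&', '|'])
combined_expr = Forward()
atom = single_condition | (Suppress(lpar) + combined_expr + Suppress(rpar))
combined_expr << (Group(atom) + ZeroOrMore(logical_op + combined_expr))

# Assumption: StringEnd is added only here, at the outermost level,
# not inside the recursive definition of combined_expr.
full_expr = combined_expr + StringEnd()


def parse_strict(s):
    """Return the parsed nested list, or None if any input is left over."""
    try:
        return full_expr.parseString(s).asList()
    except ParseException:
        return None


for s in ['a=a & (b=b|c=c)',
          'word1 = word2 != word3',
          'keyword = value_with_*illegal*_characters']:
    print('\n' + s)
    print(parse_strict(s))
```

Passing parseAll=True to parseString is, as far as I can tell, meant to have a similar effect without the extra expression. Note that neither variant addresses the mixed '&' and '|' case, which still parses and has to be rejected separately.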