How to remove ambiguity of this grammar using lark

142 Views Asked by At

I have build a grammar for propositional formulas, however I have found that it is ambiguous, i.e., there is more than one derivation tree for my input formulas.

This is the smallest ambiguous grammar I have found

_PL_grammar = (
    r"""
    ?start: formula
    ?formula: "(" formula ")"
            | logical_expr
            | ident 
    ?logical_expr: and_expr
    ?and_expr:  formula ("&&" | "&") formula
    ident: CNAME
    %import common.CNAME
    %import common.WS_INLINE
    %ignore WS_INLINE
    """)

I am executing this piece of code with lark==1.1.2

import lark
def parse_formula(formula):
    logic_parser = lark.Lark(_PL_grammar )
    parsed_formula = logic_parser.parse(formula)
    return parsed_formula
parsed_formula = parse_formula('a && b && c')
print(parsed_formula)

Which returns arbitrary one of these two outputs:

Tree('and_expr', [Tree('and_expr', [Tree('ident', [Token('CNAME', 'a')]), Tree('ident', [Token('CNAME', 'b')])]), Tree('ident', [Token('CNAME', 'c')])])

or

Tree('and_expr', [Tree('ident', [Token('CNAME', 'a')]), Tree('and_expr', [Tree('ident', [Token('CNAME', 'b')]), Tree('ident', [Token('CNAME', 'c')])])])

Which would correspond to formulas ((a && b) && c) and (a && (b && c)) respectively

How can I remove the ambiguity of this grammar? In other words, is there a way to make an input always have only one unique output?

0

There are 0 best solutions below