I am trying to build a Boolean logic parser e.g. A == B AND C == D to output something like And(Equals(A,B), Equals(C,D))
My parser has the following definitions:
def program: Parser[Operator] = {
phrase(operator)
}
def operator: PackratParser[Operator] = {
leaf | node
}
def node: PackratParser[Operator] = {
and | or
}
def leaf: PackratParser[Operator] = {
equal | greater | less
}
def and: PackratParser[Operator] = {
(operator ~ ANDT() ~ operator) ^^ {
case left ~ _ ~ right => And(left, right)}
}
I would expect the parser to map to program -> operator -> node -> and -> operator (left) -> leaf -> equal -> operator (right) -> leaf -> equal. This doesn't work.
However if in the above code I do the changes
def operatorWithParens: PackratParser[Operator] = {
lparen ~> (operator | operatorWithParens) <~ rparen
}
and change and to be
def and: PackratParser[Operator] = {
(operatorWithParens ~ ANDT() ~ operatorWithParens) ^^ {
case left ~ _ ~ right => And(left, right)}
}
Parsing (A == B) AND (C == D) succeeds.
I can not wrap my head around why the former doesn't work while the later does.
How should I change my code to be able to parse A == B AND C == D?
EDIT: Following @Andrey Tyukin advice I've modified the gramma to account for precedence
def program: Parser[Operator] = positioned {
phrase(expr)
}
def expr: PackratParser[Operator] = positioned {
(expr ~ ORT() ~ expr1) ^^ {
case left ~ _ ~ right => Or(left, right)} | expr1
}
def expr1: PackratParser[Operator] = positioned {
(expr1 ~ ANDT() ~ expr2) ^^ {
case left ~ _ ~ right => And(left, right)} | expr2
}
def expr2: PackratParser[Operator] = positioned {
(NOTT() ~ expr2) ^^ {case _ ~ opr => Not(opr)} | expr3
}
def expr3: PackratParser[Operator] = {
lparen ~> (expr) <~ rparen | leaf
}
And although PackratParser supports left-recursive grammar, I run into an infinite loop that never leaves expr
It looks like there is a path from
operatorto a shorteroperator:You seem to be assuming that the shorter
operator (left)will somehow reduce toleaf, whereas the outermostoperatorwould skip theleafand pick thenode, for whatever reason. What it does instead is just chocking on the firstleafit encounters.You could try to move the
nodebefore theleaf, so that the wholeoperatordoesn't choke on the firstAwhen seeing sth. likeA == B AND ....Otherwise, I'd suggest to refactor it into
where atomic formulas are either
Expect to use quite a few
repSeps.