I am writing the parser for first-order conditions. I am using Scala and the scala-parser-combinators library. It works well, but I discovered this issue.
Here, I report the clause for negation (example: not predicate(p1)). It uses recursion, calling fol_condition, i.e. the root level rule for fol conditions:
def fol_negation : Parser[Negation] = "not"~fol_condition ^^ {case _~sub => Negation(sub)}
So, a string like "not predicate(p1)" produces the correct result Negation(Predicate("predicate",...)). However, when I parse a string like "note(p2)", it should produce a Predicate("note",...); instead, it returns Negation(Predicate(a,...)). In other words, the parser ignores whether there is a space between "not" and the following fol_condition.
This is not the desired behaviour. I would like to tell the parser that "not a" is different from "nota" because there is a space in between.
How can I fix this?
It's not about whitespace. It's about the precedence between reserved keywords and identifiers.
Neither in "not(P(x))" nor in "note(y)" are there any whitespaces, yet the problem is the same.
You don't want to force the user to type a whitespace after every "not", because "not (P(x) and Q(y))" would be awkward.
What you might try to do instead:
A. Reorder your grammar such that all rules that start with an identifier come first. Then the branch in which "note(y)" is parsed as a note-identifier will be taken first, and the problem disappears. For this to work, you obviously would have to exclude all the reserved keywords from the set of valid identifiers.
B. If you just want a quick fix, use a lookahead regex like "not(?=\W)".r instead of the fixed character sequence "not", and make the not-rule (and all the other keyword-related rules) come before the identifier-rules.
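Option B could look roughly like the sketch below. Only fol_negation, fol_condition, Negation and Predicate come from the question; the shapes of the case classes, the identifier regex, and the helper rules notKeyword, fol_predicate and fol_parens are assumptions made up for illustration, so adapt them to your actual grammar.

```scala
import scala.util.parsing.combinator.RegexParsers

// Assumed AST shapes; the question only shows the constructor names.
sealed trait Condition
case class Predicate(name: String, args: List[String]) extends Condition
case class Negation(sub: Condition) extends Condition

object FolParser extends RegexParsers {
  // Keyword with lookahead: matches "not" only when the next character
  // is not a word character, so "note" is not split into "not" + "e".
  def notKeyword: Parser[String] = """not(?=\W)""".r

  // Hypothetical identifier rule: a letter or underscore, then word characters.
  def identifier: Parser[String] = """[A-Za-z_]\w*""".r

  def fol_negation: Parser[Negation] =
    notKeyword ~> fol_condition ^^ { sub => Negation(sub) }

  def fol_predicate: Parser[Predicate] =
    identifier ~ ("(" ~> repsep(identifier, ",") <~ ")") ^^ {
      case name ~ args => Predicate(name, args)
    }

  // Parenthesised sub-condition, so that "not(P(x))" is accepted too.
  def fol_parens: Parser[Condition] = "(" ~> fol_condition <~ ")"

  // Keyword rules come before identifier rules, as suggested in B.
  def fol_condition: Parser[Condition] =
    fol_negation | fol_predicate | fol_parens
}
```

With this sketch, FolParser.parseAll(FolParser.fol_condition, "note(p2)") should yield a Predicate named "note", while "not predicate(p1)" and "not(note(y))" should both yield a Negation. Note that the identifier rule here still accepts the word "not" itself; excluding reserved keywords from identifiers, as described in A, is a separate step that is not shown.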