I have to write parse(Tkns, T) that takes in a mathematical expression in the form of a list of tokens and finds T, and return a statement representing the abstract syntax, respecting order of operations and associativity.
For example,
?- parse( [ num(3), plus, num(2), star, num(1) ], T ).
T = add(integer(3), multiply(integer(2), integer(1))) ;
No
I've attempted to implement + and * as follows
parse([num(X)], integer(X)).
parse(Tkns, T) :-
( append(E1, [plus|E2], Tkns),
parse(E1, T1),
parse(E2, T2),
T = add(T1,T2)
; append(E1, [star|E2], Tkns),
parse(E1, T1),
parse(E2, T2),
T = multiply(T1,T2)
).
Which finds the correct answer, but also returns answers that do not follow associativity or order of operations.
ex)
parse( [ num(3), plus, num(2), star, num(1) ], T ).
also returns
mult(add(integer(3), integer(2)), integer(1))
and
parse([num(1), plus, num(2), plus, num(3)], T)
returns the equivalent of 1+2+3 and 1+(2+3) when it should only return the former.
Is there a way I can get this to work?
Edit: more info: I only need to implement +,-,*,/,negate (-1, -2, etc.) and all numbers are integers. A hint was given that the code will be structured similarly to the grammer
<expression> ::= <expression> + <term>
| <expression> - <term>
| <term>
<term> ::= <term> * <factor>
| <term> / <factor>
| <factor>
<factor> ::= num
| ( <expression> )
Only with negate implemented as well.
Edit2: I found a grammar parser written in Prolog (http://www.cs.sunysb.edu/~warren/xsbbook/node10.html). Is there a way I could modify it to print a left hand derivation of a grammar ("print" in the sense that the Prolog interpreter will output "T=[the correct answer]")
The correct approach is to use DCGs, but your example grammar is left-recursive, which won't work. Here's what would:
The relationship between this and your sample grammar should be obvious, as should the transformation from left-recursive to right-recursive. I can't recall the details from my automata class about left-most derivations, but I think it only comes into play if the grammar is ambiguous, and I don't think this one is. Hopefully a genuine computer scientist will come along and clarify that point.
I see no point in producing an AST other than what Prolog would use. The code within parenthesis on the left-hand side of the production is the AST-building code (e.g. the
T+Ein the firstexpression//1rule). Adjust the code accordingly if this is undesirable.From here, presenting your
parse/2API is quite trivial:Because we're using Prolog's own structures, the result will look a lot less impressive than it is:
You can show a more AST-y output if you like using
write_canonical/2:The part
*(4,/(8,+(3,1)))is the result ofwrite_canonical/1. And you can evaluate that directly withis/2: