PLY: problem while defining rules for the C language


I'm trying to write a parser for the C language that can handle expressions, assignments, if-else statements and while loops.

Here are my rules:

expression -> expression op expression
expression -> ID
expression -> NUMBER
statement -> ID ASSIGN expression
statement -> IF expression THEN statement
statement -> WHILE expression THEN statement

Everything in caps is a token (terminal symbol).

When parsing the string "while h<=0 then t=1", the parser seems to treat "h" alone as an expression (using the rule expression -> ID), so the expression in "WHILE expression THEN statement" becomes "h". I want it to treat "h<=0" as the expression instead (using the rule expression -> expression op expression). How do I make sure that this happens?

There is 1 best solution below

Building on the previous post where you were asking about the ply.lex module, the snippet below is a partial implementation of a C-like grammar. I haven't used PLY much, but one of the tricks seems to be defining the grammar rules in the correct order: unless you set `start` explicitly, PLY takes the first rule defined as the start symbol, so the statement rules come before the expression rules here.

tokens   = [ 'ID', 'NUMBER', 'LESSEQUAL', 'ASSIGN' ]
reserved = {
    'while' : 'WHILE',
    'then'  : 'THEN',
    'if'    : 'IF'
}
tokens += reserved.values()

t_ignore    = ' \t'
t_NUMBER    = r'\d+'
t_LESSEQUAL = r'<='
t_ASSIGN    = r'\='

def t_ID(t):
    r'[a-zA-Z_][a-zA-Z0-9_]*'
    if t.value in reserved:
        t.type = reserved[ t.value ]
    return t

def t_error(t):
    # t.value holds the rest of the input; report the offending character and skip it
    print('Illegal character %r' % t.value[0])
    t.lexer.skip(1)

def p_statement_assign(p):
    'statement : ID ASSIGN expression'
    p[0] = ('assign', p[1], p[3] )

def p_statement_if(p):
    'statement : IF expression THEN statement'
    p[0] = ('if', p[2], p[4] )

def p_statement_while(p):
    'statement : WHILE expression THEN statement'
    p[0] = ('while', p[2], p[4] )

def p_expression_simple(p):
    '''expression : ID
                  | NUMBER'''
    p[0] = p[1]

def p_expression_cmp(p):
    'expression : expression LESSEQUAL expression'
    p[0] = ( '<=', p[1], p[3] )

import ply.lex as lex
import ply.yacc as yacc
lexer = lex.lex()
parser = yacc.yacc()
inp = 'while h<=0 then t=1'
res = parser.parse(inp)
print(res)

Output from the snippet is:

('while', ('<=', 'h', '0'), ('assign', 't', '1'))
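
If the parser still seems to stop at "h", it can help to check the token stream first. The following is not PLY itself, only a minimal standard-library sketch that mimics the token rules from the snippet above; `tokenize`, `TOKEN_SPEC` and `MASTER` are hypothetical names introduced for illustration. It assumes `<=` should be tried before `=`, which is how I understand PLY to order string-defined tokens (longest regular expression first).

```python
import re

# Keyword table, copied from the snippet above.
reserved = {'while': 'WHILE', 'then': 'THEN', 'if': 'IF'}

# (token type, regex) pairs. '<=' is listed before '=' so that the
# longer operator is tried first (assumed to mirror PLY's ordering).
TOKEN_SPEC = [
    ('NUMBER',    r'\d+'),
    ('ID',        r'[a-zA-Z_][a-zA-Z0-9_]*'),
    ('LESSEQUAL', r'<='),
    ('ASSIGN',    r'='),
    ('SKIP',      r'[ \t]+'),
]
MASTER = re.compile('|'.join('(?P<%s>%s)' % pair for pair in TOKEN_SPEC))

def tokenize(text):
    """Hypothetical helper: return a list of (type, value) pairs for text."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError('Illegal character %r at position %d' % (text[pos], pos))
        pos = m.end()
        kind, value = m.lastgroup, m.group()
        if kind == 'SKIP':
            continue  # whitespace is ignored, like t_ignore
        if kind == 'ID':
            kind = reserved.get(value, 'ID')  # remap keywords, like t_ID
        tokens.append((kind, value))
    return tokens

for tok in tokenize('while h<=0 then t=1'):
    print(tok)
```

For the input in the question, the sketch yields WHILE, ID, LESSEQUAL, NUMBER, THEN, ID, ASSIGN, NUMBER. Since "h<=0" arrives as ID LESSEQUAL NUMBER, the rule expression : expression LESSEQUAL expression can only apply if the comparison operator is declared as a token and used in a grammar rule.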