In SLY there is an example for writing a calculator (reproduced from calc.py here):
from sly import Lexer

class CalcLexer(Lexer):
    tokens = { NAME, NUMBER }
    ignore = ' \t'
    literals = { '=', '+', '-', '*', '/', '(', ')' }

    # Tokens
    NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

    @_(r'\d+')
    def NUMBER(self, t):
        t.value = int(t.value)
        return t

    @_(r'\n+')
    def newline(self, t):
        self.lineno += t.value.count('\n')

    def error(self, t):
        print("Illegal character '%s'" % t.value[0])
        self.index += 1
It looks like it's bugged because NAME and NUMBER are used before they've been defined. But actually, there is no NameError, and this code executes fine. How does that work? When can you reference a name before it's been defined?
Python knows four kinds of direct name lookup: builtins / program global, module global, function/closure body, and class body. NAME and NUMBER are resolved in a class body, and as such are subject to the rules of this kind of scope.
The class body is evaluated in a namespace provided by the metaclass, which can implement arbitrary semantics for name lookups. Specifically, the sly Lexer is an instance of the LexerMeta metaclass, which uses a LexerMetaDict as the namespace; this namespace creates new tokens for undefined names.
LexerMeta is also responsible for adding the _ function to the namespace so that it can be used without imports.
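To make the mechanism concrete, here is a minimal sketch of the same idea. It is not SLY's actual implementation: AutoNameDict, AutoNameMeta and Demo are made-up names, and treating every all-uppercase undefined name as a token is a simplifying assumption. The metaclass's `__prepare__` returns a dict subclass whose `__missing__` answers lookups for unknown names, and it pre-populates that namespace with a `_` decorator.

```python
class AutoNameDict(dict):
    """Hypothetical class-body namespace (illustration only, not SLY's)."""

    def __missing__(self, key):
        # Called by dict.__getitem__ when a name is not in the namespace.
        # Assumption for this sketch: all-uppercase names become tokens.
        if key.isupper():
            return key
        # Anything else falls through to globals and builtins as usual.
        raise KeyError(key)


class AutoNameMeta(type):
    """Hypothetical metaclass that supplies the custom namespace."""

    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        namespace = AutoNameDict()

        def _(pattern):
            # Decorator that attaches a regex pattern to a method.
            def decorate(func):
                func.pattern = pattern
                return func
            return decorate

        # Injected here, so the class body can use _ without importing it.
        namespace['_'] = _
        return namespace

    def __new__(mcs, name, bases, namespace):
        # Copy to a plain dict and drop the helper before building the class.
        attrs = dict(namespace)
        attrs.pop('_', None)
        return super().__new__(mcs, name, bases, attrs)


class Demo(metaclass=AutoNameMeta):
    tokens = {NAME, NUMBER}      # undefined names, resolved via __missing__
    NAME = r'[a-zA-Z_][a-zA-Z0-9_]*'

    @_(r'\d+')                   # _ comes from the prepared namespace
    def NUMBER(self, t):
        return int(t)


# For contrast: an ordinary class body has no such hook.
try:
    class Plain:
        tokens = {NAME}
    plain_failed = False
except NameError:
    plain_failed = True
```

With the custom namespace, `Demo.tokens` ends up as a set of the two name strings, while the same lookup in the ordinary class `Plain` raises NameError.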