Handling out of reach environments when implementing an interpreter

64 Views Asked by At

I am writing an interpreter based on the book Crafting Interpreters by Robert Nystrom. To test the functionality of my for loops, I am printing the first 21 numbers of a Fibonacci sequence using Lox programming language.

var a = 0;
var temp;

for (var b = 1; a < 10000; b = temp + b) {
  print a;
  temp = a;
  a = b;
}

The book discusses desugaring for loops; if for loops didn't support initializer clauses, put the initializer expression before the for statement. Without an increment clause, put the increment expression at the end of the body.

{
  var i = 0;
  while (i < 10) {
    print i;
    i = i + 1;
  }
}

My parser logic to handle for loops looks for a for-token. If found, the for_statement method gets called.

def for_statement(self):
        self.consume(TokenType.LEFT_PAREN,  "Expect '(' after 'for'.")

        if self.match(TokenType.SEMICOLON):
            initializer = None
        elif self.match(TokenType.VAR):
            initializer = self.var_declaration()
        else: 
            initializer = self.expression_statement()

        condition = None
        if not self.check(TokenType.SEMICOLON):
            condition = self.expression()
        self.consume(TokenType.SEMICOLON,  "Expect ';' after loop condition.")

        increment = None
        if not self.check(TokenType.RIGHT_PAREN):
            increment = self.expression()
        self.consume(TokenType.RIGHT_PAREN,  "Expect ')' after for clauses.")

        body = self.statement()

        if increment is not None:
            body = Block([body, ExprStmt(increment)])
        
        if condition is None:
            condition = Literal(True)
        body = WhileStmt(condition, body)

        if initializer is not None: 
            body = Block([initializer, body])

        return body

The issue arises when building the AST nodes in the for_statement method. The body gets parsed once the initializer, condition, and increment. The body returns as a Block object since it's wrapped in { }. A block object only holds a statements field: the statements print a, temp = a, etc.

The returned body looks something like this. To understand how each AST node is built, I suggest looking at this link: AST Builder

Block(statements=[VarDeclaration(name=Token(type=TokenType.IDENTIFIER, lexeme='b', literal=None, line=3), initializer=Literal(value=1.0)), WhileStmt(test=Binary(left=Variable(name=Token(type=TokenType.IDENTIFIER, lexeme='a', literal=None, line=3)), op=Token(type=TokenType.LESS, lexeme='<', literal=None, line=3), right=Literal(value=10000.0)), body=Block(statements=[Block(statements=[Print(value=Variable(name=Token(type=TokenType.IDENTIFIER, lexeme='a', literal=None, line=4))), ExprStmt(value=Assign(name=Token(type=TokenType.IDENTIFIER, lexeme='temp', literal=None, line=5), value=Variable(name=Token(type=TokenType.IDENTIFIER, lexeme='a', literal=None, line=5)))), ExprStmt(value=Assign(name=Token(type=TokenType.IDENTIFIER, lexeme='a', literal=None, line=6), value=Variable(name=Token(type=TokenType.IDENTIFIER, lexeme='b', literal=None, line=6))))]), ExprStmt(value=Assign(name=Token(type=TokenType.IDENTIFIER, lexeme='b', literal=None, line=3), value=Binary(left=Variable(name=Token(type=TokenType.IDENTIFIER, lexeme='temp', literal=None, line=3)), op=Token(type=TokenType.PLUS, lexeme='+', literal=None, line=3), right=Variable(name=Token(type=TokenType.IDENTIFIER, lexeme='b', literal=None, line=3)))))]))])

The while statement has the condition (test) and body (body) to loop over. As you can see, because of our previous logic, the body gets wrapped in two blocks. The first block gets passed to another block, corresponding to the for loop's body.

That chunk of AST nodes is fed to the interpreter. The interpreter looks at each node and visits it's corresponding method depending on the node. Scopes (Environments) are created using the ChainMap collections module whenever a block is traversed. For example, Block() calls visit_Block() in the interpreter, which calls execute_Block().

def visit_Block(self, node):
        self.execute_Block(node.statements, self.env.new_child())

def execute_Block(self, node, env):
        previous = self.env
        try:
            self.env = env
            for stmt in node: 
                self.visit(stmt)
        finally:
            self.env = previous

This creates the first scope, corresponding to the first var declarations.

ChainMap({'a' : 0.0, 'temp' : None})

When encountering the body of the while statement, the first block creates an environment, and so does the second. As you can imagine, the first block environment will be empty; the latter will hold the body's actual scope. When printing the first value, '0', and updating the 'temp' and 'a' variables, the environment reverts to the first block's environment; it is empty. Out-of-reach scoping prevents the temp variable from being added a number because it is None.

  1. First Body
ChainMap({}, {'b' : 1.0}, {'a' : 0.0, 'temp' : None})
  1. Second Body
ChainMap({'temp' : 0.0, 'a' : 1.0}, {}, {'b' : 1.0}, {'a' : 0.0, 'temp' : None})

I can't find a solution. I think I understand the problem; It has to be because of the double-nested blocks and how I create environments.

First approach: Tried changing the returned object of the body in the for statement method by casting it into a Statements object, but it will still hold the Block object inside. This will only create more objects.

Second approach: Created a helper method that traversed the body if a '{' was found but did not return it as a Block, only a list of statements. Because I am using the visitor design pattern from the book, the interpreter will try to visit visit_List(), which obviously does not work. Everything must be an existing AST node.

0

There are 0 best solutions below