I inherited an Antlr 4 parser (in Java) that uses the listener method but has some walkers that need to be special-cased.
Background information:
When you write a parser grammar in Antlr (v4) it generates some Java classes for you, one for each of your parser non-terminals. These are classes like PowerShellParser.ForEachObjectStatementContext for the parser rule that defines a ForEach statement. You get literally dozens (often approaching 100 classes) for a reasonable size grammar. And they are part of a Java class hierarchy like this:
ParseTree
 |
 +- ErrorNode     // used when there are errors in the program being compiled
 |
 +- TerminalNode  // tokens, e.g. identifiers, '+', '=', whitespace, constants
 |
 +- RuleNode      // non-terminals, subclassed further
     |
     +- ProgramContext              // for the Program rule/non-terminal
     |
     +- ForEachStatementContext     // for the ForEach Statement rule
     |
     +- IfThenElseStatementContext  // if (expr) then-clause else-clause?
     |
     +- BlockStatementContext       // { statements* }
     |
     +- AssignmentStatementContext  // var = expression
     |
     +- AddExpressionContext        // expression + expression
     |
     and literally many, many more.
And antlr also generates a "listener" pattern to make it easy to define the semantics for these different bits of code. This pattern consists of the above tree of classes and two functions with default implementations you can change: the walker function and the listener function. The normal walker function does the following: it calls the "enter(t)" function of the listener, recursively walks the children in the tree, and then calls the "exit(t)" function of the listener.
The original antlr code at https://github.com/antlr/antlr4/blob/master/runtime/Java/src/org/antlr/v4/runtime/tree/ParseTreeWalker.java looks roughly like this:
public class Walker {
    public void walk(ParseTree t) {
        if (t instanceof ErrorNode) {
            listener.visitErrorNode((ErrorNode)t);
        } else if (t instanceof TerminalNode) {
            listener.visitTerminal((TerminalNode)t);
        } else if (t instanceof RuleNode) {
            RuleNode r = (RuleNode)t;
            ParserRuleContext ctx =
                (ParserRuleContext)r.getRuleContext();
            listener.enterEveryRule(ctx);
            listener.enterRule(r); // Here we call the listener
            int n = r.getChildCount();
            for (int i = 0; i < n; ++i) {
                ParseTree child = r.getChild(i);
                walk(child); // Here we recursively call ourselves
            }
            listener.exitRule(r); // Here we call the listener again
            listener.exitEveryRule(ctx);
        }
    }
}
Now, in the listener, you write code like:
public class listener {
    public void enterRule(PowerShellParser.ProgramContext ctx) {
        // Here we are starting a program and write whatever semantics we need.
    }
    public void exitRule(PowerShellParser.ProgramContext ctx) {
        // Here we are done with a program and write whatever semantics we need.
    }
    // ...
}
and so on for all the classes that have semantics you care about. You can leave some of these functions out and the default is to do nothing.
This works fine for "expressions" because if you put code in the exit functions you get a nice way of evaluating expressions from the bottom up, essentially for free.
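To make the bottom-up point concrete, here is a minimal sketch of how exit callbacks can evaluate an expression with a value stack. It does not use ANTLR at all: `Tree`, `Num`, `Add`, `exitNum` and `exitAdd` are hypothetical stand-ins that I made up to mimic generated context classes and listener methods, so treat it as an illustration of the idea rather than the real generated API.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class BottomUpEvalDemo {
    // Hypothetical stand-ins for generated context classes (not ANTLR types).
    static abstract class Tree {
        final Tree[] children;
        Tree(Tree... children) { this.children = children; }
        // Each node type forwards to its own named exit callback.
        abstract void exitRule(BottomUpEvalDemo listener);
    }

    static final class Num extends Tree {
        final int value;
        Num(int value) { super(); this.value = value; }
        @Override void exitRule(BottomUpEvalDemo listener) { listener.exitNum(this); }
    }

    static final class Add extends Tree {
        Add(Tree left, Tree right) { super(left, right); }
        @Override void exitRule(BottomUpEvalDemo listener) { listener.exitAdd(this); }
    }

    // Values of already-finished subexpressions.
    private final Deque<Integer> values = new ArrayDeque<>();

    // Exit callbacks: by the time they run, all children have been walked.
    void exitNum(Num ctx) { values.push(ctx.value); }

    void exitAdd(Add ctx) {
        int right = values.pop();
        int left = values.pop();
        values.push(left + right);
    }

    // Default traversal: walk every child, then call the exit callback.
    void walk(Tree t) {
        for (Tree child : t.children) {
            walk(child);
        }
        t.exitRule(this);
    }

    static int evaluate(Tree root) {
        BottomUpEvalDemo listener = new BottomUpEvalDemo();
        listener.walk(root);
        return listener.values.pop();
    }

    public static void main(String[] args) {
        Tree expr = new Add(new Num(1), new Add(new Num(2), new Num(3)));
        System.out.println(evaluate(expr)); // prints 6
    }
}
```

Because every child is visited exactly once before its parent's exit callback runs, the operands are always on the stack when the operator needs them.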
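However, such a traversal doesn't work for if-statements, for-loops, etc., because there you don't want to walk every child blindly: an if-statement should walk only one of its branches, and a loop may need to walk its body several times.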
So, the person who wrote the code I inherited modified the walker code as I show below. It works, but it has this layer of if-statements that as I add more types of statements I have to keep expanding. I want the recursive call to the walker to work just like the listener code does, calling the version that matches the "type" of non-terminal class that is in the tree, without me having to list them in an if-statement. Checking the type of an object is a "smell" in my book. You should just write code with the right signature and it should get called. But that isn't happening on the recursive calls like it does when the walker calls the listener and passes the object. And, I want to know how to fix it so it does.
So, thus my original question:
When the walker recursively calls itself on the children, it doesn't pick the special-case versions, and the author had to insert code to check for the type and call the right one. What is being done wrong that the ifs in this code are needed?
package walker;

import org.antlr.v4.runtime.ParserRuleContext;
import org.antlr.v4.runtime.tree.*;

public class Walker {
    public void walk(ParseTree t) {
        // why do I need this if statement, why isn't the right function called?
        if (t instanceof ErrorNode) {
            walk((ErrorNode)t);
        } else if (t instanceof TerminalNode) {
            walk((TerminalNode)t);
        } else if (t instanceof RuleNode) {
            walk((RuleNode)t);
        } else {
            notImplemented("walk ParseTree for " + t);
        }
    }

    public void walk(ErrorNode t) {
        listener.visitErrorNode(t);
    }

    public void walk(TerminalNode t) {
        listener.visitTerminal(t);
    }

    public void walk(RuleNode r) {
        // same issue, only worse
        if (r instanceof PowerShellParser.ForEachObjectStatementContext) {
            walk((PowerShellParser.ForEachObjectStatementContext)r);
            return;
        } else if (r instanceof PowerShellParser.ForEachStatementContext) {
            walk((PowerShellParser.ForEachStatementContext)r);
            return;
        } else if (r instanceof PowerShellParser.IfStatementContext) {
            walk((PowerShellParser.IfStatementContext)r);
            return;
        } else if (r instanceof PowerShellParser.DoWhileStatementContext) {
            walk((PowerShellParser.DoWhileStatementContext)r);
            return;
        } else if (r instanceof PowerShellParser.HardCase1Context) {
            walk((PowerShellParser.HardCase1Context)r);
            return;
        } else if (r instanceof PowerShellParser.HardCase2Context) {
            walk((PowerShellParser.HardCase2Context)r);
            return;
        } else if (r instanceof PowerShellParser.HiddenStringMethodExpressionContext) {
            walk((PowerShellParser.HiddenStringMethodExpressionContext)r);
            return;
        } else if (r instanceof PowerShellParser.InvokeCommandCommandContext) {
            walk((PowerShellParser.InvokeCommandCommandContext)r);
            return;
        } else if (r instanceof PowerShellParser.ExpressionListContext) {
            walk((PowerShellParser.ExpressionListContext)r);
            return;
        } else if (r instanceof PowerShellParser.JoinWithExpressionListContext) {
            walk((PowerShellParser.JoinWithExpressionListContext)r);
            return;
        }
        // this is all I want this function to do, provide a default traversal
        // when I don't have a specialized version
        ParserRuleContext ctx = (ParserRuleContext)r.getRuleContext();
        listener.enterEveryRule(ctx);
        enterRule(r);
        int n = r.getChildCount();
        for (int i = 0; i < n; ++i) {
            ParseTree child = r.getChild(i); // I presume the issue is here
            walk(child);
        }
        exitRule(r);
        listener.exitEveryRule(ctx);
    }

    public void walk(PowerShellParser.ForEachObjectStatementContext r) {
        // I want recursive calls of this object to call this directly
        // and not go through the two ifs above.
        // E.g. this version walks its children multiple times
        // The if version only walks some of its children
        // The hard cases actually do something like "eval"
    }

    public void walk(PowerShellParser.ForEachStatementContext r) {
    }

    public void walk(PowerShellParser.IfStatementContext r) {
    }

    public void walk(PowerShellParser.DoWhileStatementContext r) {
    }

    public void walk(PowerShellParser.JoinWithExpressionListContext r) {
    }

    public void walk(PowerShellParser.QuotedProgramContext r) {
    }

    public void walk(PowerShellParser.HardCase1Context r) {
    }

    public void walk(PowerShellParser.HardCase2Context r) {
    }
}
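As far as I can tell, the ifs are needed because Java chooses among overloads at compile time, using the declared (static) type of the argument and not the runtime type of the object. In the default traversal, `child` is declared as `ParseTree`, so `walk(child)` always binds to `walk(ParseTree)`, no matter what the child really is. A minimal sketch of that behaviour, using made-up stand-in types (`Node`, `Leaf` and `Loop` are hypothetical, not ANTLR classes):

```java
class OverloadDemo {
    // Three overloads, analogous to walk(ParseTree) and its special cases.
    static String walk(Node n) { return "walk(Node)"; }
    static String walk(Leaf n) { return "walk(Leaf)"; }
    static String walk(Loop n) { return "walk(Loop)"; }

    public static void main(String[] args) {
        // Static type is Node, runtime type is Loop.
        Node child = new Loop();

        // The overload is picked from the static type, so the general one runs.
        System.out.println(walk(child));        // prints walk(Node)

        // A cast changes the static type, which is what the if-chain does by hand.
        System.out.println(walk((Loop) child)); // prints walk(Loop)
    }
}

// Hypothetical stand-ins for the parse tree classes.
interface Node {}
class Leaf implements Node {}
class Loop implements Node {}
```

Only the receiver of a method call (the object before the dot) is dispatched on its runtime type; the arguments are not, which is why every `instanceof` plus cast in the walker is doing real work.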
After studying the code in listener more carefully, I see that even it doesn't do exactly what I want. It renames each enterRule(Xxxx ctx) to be enterXxxx(Xxxx ctx), and code in the Xxxx class redirects it. Thus, you cannot easily define a function that dispatches the way I want in Java, at least not if you cannot modify the class you want to dispatch on (which in this case, I cannot).
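The redirection described above is what is usually called double dispatch: the node's own virtual method knows its concrete type and forwards to a listener method named for that type. Here is a rough sketch of the shape, with invented stand-ins (`Ctx`, `IfCtx`, `LoopCtx`, `Listener` and `RecordingListener` are hypothetical names, not the generated ANTLR classes):

```java
import java.util.ArrayList;
import java.util.List;

class DoubleDispatchDemo {
    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        // The static type of every element is Ctx, as with children of a parse tree.
        Ctx[] nodes = { new IfCtx(), new LoopCtx() };
        for (Ctx node : nodes) {
            node.enterRule(listener); // virtual call resolves on the runtime type
        }
        System.out.println(listener.calls); // prints [enterIf, enterLoop]
    }
}

// One named callback per node type, instead of overloads of a single name.
interface Listener {
    void enterIf(IfCtx ctx);
    void enterLoop(LoopCtx ctx);
}

// The redirecting method has to live in the node class itself.
abstract class Ctx {
    abstract void enterRule(Listener listener);
}

class IfCtx extends Ctx {
    @Override void enterRule(Listener listener) { listener.enterIf(this); }
}

class LoopCtx extends Ctx {
    @Override void enterRule(Listener listener) { listener.enterLoop(this); }
}

// Records which callbacks were invoked, in order.
class RecordingListener implements Listener {
    final List<String> calls = new ArrayList<>();
    @Override public void enterIf(IfCtx ctx) { calls.add("enterIf"); }
    @Override public void enterLoop(LoopCtx ctx) { calls.add("enterLoop"); }
}
```

The catch is the one already noted: the forwarding method sits inside each node class, so this only works when those classes can be changed or regenerated.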