Context
I auto-generated the following files in accordance with this answer:
- Python3Lexer.java
- Python3ParserBase.java
- Python3ParserListener.java
- PythonDocstringModifierListener.java
- Python3Parser.java
Then I modified the MWE in that question to include:
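import org.antlr.v4.runtime.Token;
import org.antlr.v4.runtime.tree.TerminalNode;

public class SomePythonListener extends Python3ParserBaseListener {
    private final Python3Parser parser;
    private final String someValue;

    public SomePythonListener(Python3Parser parser, String someValue) {
        this.parser = parser;
        this.someValue = someValue;
    }

    // Called once for every token that is part of the parse tree.
    @Override
    public void visitTerminal(TerminalNode node) {
        Token token = node.getSymbol();
        System.out.println("token.getType()=" + token.getType());
        System.out.println("getText:" + token.getText() + "END\n\n");
    }
}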
And I feed it the source code:
"""A file docstring.
With a multiline starting docstring.
That spans the first 3 lines."""
# Some Comment.
# Another comment
"""Some string."""
def foo():
    """Some docstring."""
    print('hello world')
def bar():
    """Another docstring."""
    print('hello world')
def baz():
    """Third docstring."""
    print('hello universe')
This then outputs:
token.getType()=3
getText:"""A file docstring.
With a multiline starting docstring.
That spans the first 3 lines."""END
token.getType()=44
getText:
END
token.getType()=3
getText:"""Some string."""END
token.getType()=44
getText:
END
token.getType()=15
getText:defEND
token.getType()=45
getText:fooEND
token.getType()=57
getText:(END
token.getType()=58
getText:)END
token.getType()=60
getText::END
token.getType()=44
getText: END
token.getType()=1
getText: END
token.getType()=3
For completeness, the 44 represents the newline character. One can see that the first docstring is included, followed by a newline, followed by the second docstring """Some string.""". However, both comments, # Some Comment. and # Another comment, are ignored/not visited/not shown.
Issue
The TerminalNode objects passed to visitTerminal do not include the comments.
Question
How can I include the comments in the visitor?
Attempt
Based on these answers, it seems I should get the comments from the hidden channel, but I have not yet figured out how to do that. For completeness, the auto-generated Python3Lexer.java file contains:
public static String[] channelNames = {"DEFAULT_TOKEN_CHANNEL", "HIDDEN"};
public static String[] modeNames = {"DEFAULT_MODE"};
That is correct: these tokens are skipped in the lexer. You can also put these tokens on another channel (so they are not skipped) by replacing -> skip with -> channel(HIDDEN). But that will still not cause them to appear in the visitTerminal(...) method. After all, only tokens defined in parser rules appear there.
For the record, when you change -> skip to -> channel(HIDDEN) in the Python3Lexer.g4 file and then regenerate the lexer/parser classes, you can see that comments are no longer discarded but placed on another channel.
But they will still not be a part of the parse tree you're walking with a listener or visitor: only tokens defined in parser rules will show up there.
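To make the distinction concrete, here is a minimal plain-Java model of channels. It is not the ANTLR runtime API: the names Tok, lex, visibleToParser and hiddenToLeft are hypothetical, and the line-based "lexer" is a deliberate simplification. The idea it sketches is that a listener only ever receives default-channel tokens, but each token knows its index in the full stream, so the hidden tokens next to it can still be looked up. In the real ANTLR Java runtime, the analogous lookups are, to my knowledge, getHiddenTokensToLeft and getHiddenTokensToRight on BufferedTokenStream (the superclass of CommonTokenStream).

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of lexer channels. This is NOT the ANTLR runtime API;
// all names here (Tok, lex, visibleToParser, hiddenToLeft) are hypothetical.
final class HiddenChannelSketch {
    static final int DEFAULT_CHANNEL = 0;
    static final int HIDDEN_CHANNEL = 1;

    // A token keeps its position in the full stream, its text and its channel.
    static final class Tok {
        final int index;
        final String text;
        final int channel;

        Tok(int index, String text, int channel) {
            this.index = index;
            this.text = text;
            this.channel = channel;
        }
    }

    // Toy line-based "lexer": a line starting with '#' goes to the hidden
    // channel instead of being skipped; blank lines are dropped.
    static List<Tok> lex(String source) {
        List<Tok> tokens = new ArrayList<>();
        for (String line : source.split("\n")) {
            String text = line.trim();
            if (text.isEmpty()) {
                continue;
            }
            int channel = text.startsWith("#") ? HIDDEN_CHANNEL : DEFAULT_CHANNEL;
            tokens.add(new Tok(tokens.size(), text, channel));
        }
        return tokens;
    }

    // What the parser consumes, and so what a listener can ever be handed.
    static List<Tok> visibleToParser(List<Tok> all) {
        List<Tok> visible = new ArrayList<>();
        for (Tok t : all) {
            if (t.channel == DEFAULT_CHANNEL) {
                visible.add(t);
            }
        }
        return visible;
    }

    // Texts of the hidden tokens directly before tokenIndex, in source order.
    static List<String> hiddenToLeft(List<Tok> all, int tokenIndex) {
        List<String> hidden = new ArrayList<>();
        for (int i = tokenIndex - 1; i >= 0 && all.get(i).channel == HIDDEN_CHANNEL; i--) {
            hidden.add(0, all.get(i).text);
        }
        return hidden;
    }

    public static void main(String[] args) {
        String source = "\"\"\"A file docstring.\"\"\"\n"
                + "# Some Comment.\n"
                + "# Another comment\n"
                + "\"\"\"Some string.\"\"\"\n";
        List<Tok> all = lex(source);
        for (Tok t : visibleToParser(all)) {
            // The "listener" only sees default-channel tokens, but it can use
            // the token's index to look back into the full stream.
            System.out.println(t.text + " <- preceded by " + hiddenToLeft(all, t.index));
        }
    }
}
```

In this model the two comments never show up in visibleToParser, yet they are returned when looking to the left of the second string token. The same pattern should carry over to a real listener that holds a reference to the token stream and calls the hidden-token lookup with token.getTokenIndex().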