I'm trying to build a tool like ANTLR from scratch in Swift (just for fun), but I don't understand how a grammar knows that there should be no whitespace inside a token (identifier example: "_myIdentifier123"):
Identifier
: Identifier_head Identifier_characters?
And that there should be whitespace here (example: "is String"):
type_casting_operator
: 'is' type
| 'as' type
| 'as' '?' type
| 'as' '!' type
;
I've searched for WS in ANTLR's source code but found nothing; there is no "WS" string in the Java code: https://github.com/antlr/antlr4
Can anyone explain the algorithm behind this? How does it decide whether tokens are separated by whitespace or not?
The first rule is a lexer rule (note the capital first letter), while the second rule is a parser rule.
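Whitespace is typically not passed to the parser: the grammar's lexer contains a rule that matches whitespace and skips it, so the second rule never sees it. Such a rule is an ordinary rule written in the grammar file, not something built into the tool, which is why searching ANTLR's Java source for WS finds nothing. As a result, whitespace can appear anywhere between tokens. In "is String" the space matters only to the lexer: without it, "isString" would most likely be read as a single identifier.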
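For illustration, a whitespace rule of this kind usually looks something like the following in an ANTLR 4 grammar. The name WS is only a convention, and the character set shown is an assumption; the grammar in question may define it differently.

```antlr
// Lexer rule (capitalized name): match a run of whitespace characters
// and discard it, so the parser never receives a whitespace token.
WS : [ \t\r\n]+ -> skip ;
```

Some grammars use `-> channel(HIDDEN)` instead of `-> skip`, which keeps the whitespace tokens available to tools but still hides them from parser rules.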
Lexer rules, in contrast, see every character of the input, so any whitespace that belongs inside a token must be matched explicitly.
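As a rough sketch of what "matched explicitly" means, here is a hypothetical lexer rule (not taken from the grammar in the question) for a simple string literal. Spaces between the quotes are accepted only because the character set in the rule allows them.

```antlr
// Hypothetical lexer rule: everything between the quotes, including spaces,
// is matched character by character and becomes part of the token text.
// ~["\r\n] means "any character except a double quote or a line break".
StringLiteral : '"' ~["\r\n]* '"' ;
```

An identifier rule such as the one in the question lists no whitespace characters at all, so a space simply ends the identifier token.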