I am using Coco/R to generate a Java-like scanner/parser.
I'm having trouble writing an EBNF expression to match a code block.
I'm assuming a code block is surrounded by two well-known tokens, <& and &>. For example:
public method(int a, int b) <&
various code
&>
If I define a nonterminal symbol
codeblock = "<&" {ANY} "&>"
and the code between the two symbols contains a '<' character, the generated compiler does not handle it and reports a syntax error.
Any hint?
Edit:
COMPILER JavaLike
CHARACTERS
nonZeroDigit = "123456789".
digit = '0' + nonZeroDigit .
letter = 'A' .. 'Z' + 'a' .. 'z' + '_' + '$'.
TOKENS
ident = letter { letter | digit }.
PRODUCTIONS
JavaLike = {ClassDeclaration}.
ClassDeclaration = "class" ident ["extends" ident] "{" {VarDeclaration} {MethodDeclaration} "}".
MethodDeclaration = "public" Type ident "(" ParamList ")" CodeBlock.
CodeBlock = "<&" {ANY} "&>".
I have omitted some productions for the sake of simplicity.
This is my actual implementation of the grammar. The main bug is that it fails if the code in the block contains one of the symbols '>' or '&'.
Nick, late to the party here ...
A number of ways to do this:
Define tokens for <& and &> so the lexer knows about them.
You may be able to use a COMMENTS directive, with the delimiters quoted as Coco expects:
COMMENTS FROM "<&" TO "&>"
Or hack NextToken() in your scanner.frame file. Do something like this (pseudo-code):
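Roughly, the idea is to stop tokenizing once the scanner sees <& and swallow raw characters up to the matching &>. Here is a minimal, standalone Java sketch of that step. RawBlock and read are made-up names, not part of Coco/R, and the wiring into the generated scanner is left out because it depends on your frame file:

```java
/** Hypothetical helper, not part of Coco/R: captures a raw "<& ... &>" block. */
final class RawBlock {
    final String body; // text between the delimiters, left untokenized
    final int next;    // index of the first character after the closing "&>"

    private RawBlock(String body, int next) {
        this.body = body;
        this.next = next;
    }

    /** Reads the block that starts at src[pos]; pos must point at "<&". */
    static RawBlock read(String src, int pos) {
        if (!src.startsWith("<&", pos)) {
            throw new IllegalArgumentException("expected \"<&\" at index " + pos);
        }
        int start = pos + 2;                 // skip the opening delimiter
        int end = src.indexOf("&>", start);  // first closing delimiter wins
        if (end < 0) {
            throw new IllegalArgumentException("unterminated block: missing \"&>\"");
        }
        // Everything in between is returned verbatim, so '<', '>' and '&'
        // inside the body never reach the token rules.
        return new RawBlock(src.substring(start, end), end + 2);
    }
}
```

Inside a hacked NextToken() you would call something like this when the scanner hits <&, hand the body back as a single token, and resume normal scanning from next. Note that a literal &> inside the block (in a string, say) would end it early; handling that needs an escape convention of your own.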
Or you can override the Read() method in the Buffer and eat the block at the lowest level.
HTH