How to get Coco/R parser to not be greedy

My ATG file defines a code block as

Codeblock = "<#" {anychar} "#>"

When the Coco generated parser comes across a block like this:

<#
   a=5;
   print "Hello world!";
#>

The token picks up

a=5;
print "Hello world!";

This is exactly what I want.

However, when it comes across code like this:

<#
   a=5;
   print "Hello World";
#>
<#
   b=5;
   print "Foo Bar";
#>

The token greedily picks up

 a=5;
 print "Hello World";
 #>
 <#
   b=5;
   print "Foo Bar";

How can I let Coco/R know not to do this?

There are 2 best solutions below

Answer 1

Try this:

codeblock = "<#" {anychar} "#>" .
anychar = (expression|procedure) ";" .

Because every anychar now has to end with ";", Coco/R can no longer mistake the "#> <#" sequence between two blocks for part of anychar.

Answer 2

Your terminals need to be more explicit.

"ANY" introduces ambiguity, which is why the "#><#" sequence is being consumed. Your codeblock treats everything between the FIRST "<#" and the LAST "#>" as part of the set "ANY", since that is how your grammar defines a codeblock.

Perhaps try:

code = codeblock {codeblock} EOF .
codeblock = "<#" {anychar} "#>" .
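
To make the "first <# to last #>" behaviour concrete, here is a minimal sketch in Python using regular expressions, not Coco/R itself. It only illustrates the matching idea: an unrestricted repetition runs to the last "#>", while a body that is spelled out explicitly (a '#' is allowed only when no '>' follows it) stops at the first one. The names SOURCE, GREEDY, EXPLICIT and blocks are made up for this illustration, and translating the explicit pattern into ATG character sets is left as an assumption rather than shown.

```python
import re

# The two-block input from the question.
SOURCE = '''<#
   a=5;
   print "Hello World";
#>
<#
   b=5;
   print "Foo Bar";
#>'''

# Unrestricted body: ".*" runs on to the LAST "#>" in the input,
# which mirrors {anychar} when anychar accepts every character.
GREEDY = re.compile(r"<#(.*)#>", re.DOTALL)

# Explicit body: any non-'#' character, or a run of '#' followed by
# something other than '#' or '>'. Only character classes and
# repetition are used, no lazy quantifier.
EXPLICIT = re.compile(r"<#((?:[^#]|#+[^#>])*#*)#>")


def blocks(pattern, text):
    """Return the stripped body of every code block that pattern matches."""
    return [m.group(1).strip() for m in pattern.finditer(text)]


greedy_blocks = blocks(GREEDY, SOURCE)
explicit_blocks = blocks(EXPLICIT, SOURCE)

print(len(greedy_blocks), len(explicit_blocks))  # prints: 1 2
```

With the unrestricted pattern both blocks collapse into a single match that still contains "#>" and "<#" in the middle; with the explicit pattern each block is returned on its own.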