How do I get Tatsu to not consume the right bracket in the identifier name?


I have identifier defined as:

identifier = /[A-zA-Z][A-zA-Z0-9_]*/ ;

and arrayType as:

arrayType = ARRAY LBRACK ~ typeList RBRACK OF componentType;

So why does Tatsu decide that 'ASCIIcode]' is an identifier, rather than an identifier followed by a right bracket, in the logs below?

≡'[' 
ASCIIcode] Of ASCIIcode;
≡LBRACK↙arrayType↙unpackedStructuredType↙structuredType↙type↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙typeList↙arrayType↙unpackedStructuredType↙structuredType↙type↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙indexType↙typeList↙arrayType↙unpackedStructuredType↙structuredType↙type↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙simpleType↙indexType↙typeList↙arrayType↙unpackedStructuredType↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙scalarType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙LPAREN↙scalarType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
≢'(' 
ASCIIcode] Of ASCIIcode;
≢LPAREN↙scalarType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
≢scalarType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙subrangeType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙constant↙subrangeType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙unsignedNumber↙constant↙subrangeType↙simpleType↙indexType↙typeList↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙unsignedInteger↙unsignedNumber↙constant↙subrangeType↙simpleType↙ ~99:15
ASCIIcode] Of ASCIIcode;
≢'' /\d+/
ASCIIcode] Of ASCIIcode;
↙unsignedReal↙unsignedNumber↙constant↙subrangeType↙simpleType↙indexType↙ ~99:15
ASCIIcode] Of ASCIIcode;
≢'' /\d+/
ASCIIcode] Of ASCIIcode;
≢unsignedNumber↙constant↙subrangeType↙simpleType↙indexType↙typeList↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙sign↙constant↙subrangeType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙PLUS↙sign↙constant↙subrangeType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
≢'+' 
ASCIIcode] Of ASCIIcode;
≢PLUS↙sign↙constant↙subrangeType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙MINUS↙sign↙constant↙subrangeType↙simpleType↙indexType↙typeList↙ ~99:15
ASCIIcode] Of ASCIIcode;
≢'-' 
ASCIIcode] Of ASCIIcode;
≢MINUS↙sign↙constant↙subrangeType↙simpleType↙indexType↙typeList↙ ~99:15
ASCIIcode] Of ASCIIcode;
≢sign↙constant↙subrangeType↙simpleType↙indexType↙typeList↙arrayType↙ ~99:15
ASCIIcode] Of ASCIIcode;
↙identifier↙constant↙subrangeType↙simpleType↙indexType↙typeList↙ ~99:15
ASCIIcode] Of ASCIIcode;
≡'ASCIIcode]' /[A-zA-Z][A-zA-Z0-9_]*/
 Of ASCIIcode;

Best answer:

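The regular expression does not match what you intend (it's a good idea to test regular expressions on a site like https://pythex.org).

The regex uses an upper case "A" where a lower case "a" was intended for the lower case letter range. As written, [A-z] runs from "A" to "z" in ASCII order, so it also covers the characters that sit between "Z" and "a", including "]".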
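The range A-z covers more than letters. As a quick check (a sketch that is not part of the original answer, using only plain Python), this lists the ASCII characters that fall between "Z" and "a" and are therefore accepted by [A-z]:

```python
# ASCII characters strictly between 'Z' and 'a'; [A-z] matches all of them.
extra = [chr(code) for code in range(ord("Z") + 1, ord("a"))]
print(extra)  # includes '[' and ']'
```

The right bracket is in that list, which is why it gets swallowed into the identifier.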

You can try using:

identifier = /[a-zA-Z][a-zA-Z0-9_]*/ ;

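or, more compactly (\w already covers letters, digits and the underscore, so only the first character needs an explicit letter class to keep identifiers from starting with a digit):

identifier = /[a-zA-Z]\w*/ ;

Note that in Python \w also matches non-ASCII letters and digits by default.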
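To confirm the diagnosis outside the parser, the original and corrected patterns can be compared with Python's re module. This is a minimal sketch, assuming Tatsu hands its /.../ patterns to Python regular expressions; the variable names are only illustrative, and the input is the text from the trace in the question.

```python
import re

# Input as it appears in the trace, right after '[' has been consumed.
text = "ASCIIcode] Of ASCIIcode;"

buggy = re.compile(r"[A-zA-Z][A-zA-Z0-9_]*")  # pattern from the question
fixed = re.compile(r"[a-zA-Z][a-zA-Z0-9_]*")  # corrected lower case range

buggy_match = buggy.match(text).group()
fixed_match = fixed.match(text).group()

print(repr(buggy_match))  # 'ASCIIcode]'
print(repr(fixed_match))  # 'ASCIIcode'
```

With the corrected range the match stops before "]", leaving the bracket for RBRACK.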