Why does my flexc++ lexer split a whole word into pieces instead of reporting a lexical error?

I want to write a lexer for a subset of C. Here is the code. I tested it with inputs such as 3l12 and 012977, which are an invalid identifier and an invalid octal integer, so the lexer should return WRONG for each of them. However, it splits them into 3, l12, 012, and 977, and recognizes each piece as a separate token.

ID [a-z_A-Z][a-z_A-Z0-9]*
EXP ([Ee][\+\-]?[0-9]+)
FLOAT (([0-9]*\.[0-9]+|[0-9]+\.){EXP}?[fF]?)|[0-9]+{EXP}
INT ([1-9][0-9]*|0[0-7]*|0[Xx][0-9a-fA-F]+)
HEXFLOAT 0[Xx][0-9a-fA-F]*(.)[0-9a-fA-F]*[Pp][\+\-]?[0-9]+
MultilineComment "/*"([^\*]|(\*)*[^\*/])*(\*)*"*/"
SingleLineComment "//".*
%%

"int"        {return INT;}
"float"      {return FLOAT;}
"void"       {return VOID;}
"const"      {return CONST;}
"return"     {return RETURN;}
"if"         {return IF;}
"else"       {return ELSE;}
"while"      {return WHILE;}
"break"      {return BREAK;}
"continue"   {return CONTINUE;}
">"  {return GT;}
"<"  {return LT;}
">=" {return GE;}
"<=" {return LE;}
"==" {return EQ;}
"!=" {return NEQ;}
"("         {return LP;}
")"         {return RP;}
"["         {return LB;}
"]"         {return RB;}
"{"         {return LC;}
"}"         {return RC;}
","         {return COMMA;}
";"         {return SEMICOLON;}
"!"         {return NOT;}
"="         {return ASSIGN;}
"-"         {return MINUS;}
"+"         {return ADD;}
"*"         {return MUL;}
"/"         {return DIV;}
"%"         {return MOD;}
"&&"    {return AND;}
"||"    {return OR;}
{ID}         {return ID;}
{INT}               {return INT_LIT;}
{FLOAT}|{HEXFLOAT}  {return FLOAT_LIT;}
[ \t\n]+ {}
{SingleLineComment} {}
{MultilineComment} {}

.           {return WRONG;}

I tried to fix it by changing the order of the rules and by removing the rule that consumes whitespace, but neither worked. How can I fix this problem? Do I need to add rules specifically for these kinds of errors?
