When generating the tokens from a txt document using this code:
package codigo;
import static codigo.Tokens.*;
%%
%class Lexer
%type Tokens
L=[a-zA-Z_]+
D=[0-9]+
espacio=[ ,\t,\r,\n]+
%{
public String lexeme;
%}
Comment = {TraditionalComment}
TraditionalComment = "/*" [^*] ~"*/" | "/*" "*"+ "/"
%%
int {return INT;}
if {return IF;}
else {return ELSE;}
while {return WHILE;}
let {return LET;}
put {return PUT;}
return {return RETURN;}
boolean {return BOOLEAN;}
do {return DO;}
get {return GET;}
function {return FUNCTION;}
void {return VOID;}
string {return STRING;}
{espacio} {/* Ignore */}
{Comment} { /* ignore */ }
"=" {return OPIGUAL;}
"+" {return OPSUMA;}
"-" {return OPRESTA;}
"*" {return OPMUL;}
"/" {return OPDIV;}
"(" {return PARENTESISABIERTO;}
")" {return PARENTESISCERRADO;}
"{" {return LLAVEABIERTA;}
"}" {return LLAVECERRADA;}
"," {return COMA;}
";" {return PUNTOYCOMA;}
"%=" {return RESTOIGUAL;}
{L}({L}|{D})* {lexeme=yytext(); return ID;}
("(-"{D}+")"|{D}+) {lexeme=yytext(); return ENTERO;}
. { /* Ignore other characters in the default state */ }
it detects the pieces of a multiline comment as separate tokens instead of treating the whole thing as one comment. I would also like to detect character strings, generated when matching the characters between double quotes (""), but I run into the same problem: the pieces are detected separately.
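For reference, the string case can be described with a single macro, in the same way as TraditionalComment, so that the scanner would match the whole literal as one lexeme. The sketch below is not part of the code above: StringLiteral and the CADENA token are hypothetical names (CADENA would have to be added to Tokens), and it assumes a string cannot span lines and that a backslash escapes the next character.

```
/* Macro section (between the two %% markers); hypothetical name */
StringLiteral = \" ( [^\"\\\r\n] | \\ [^\r\n] )* \"

/* Rules section, placed before the catch-all "." rule.
   CADENA is an assumed token, not present in the original Tokens enum. */
{StringLiteral} { lexeme = yytext(); return CADENA; }
```

Because JFlex picks the longest match, a rule like this would be preferred over the one-character "." rule whenever the input starts with a complete quoted string.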