Backslash-escaped characters in a JavaCC token


I'm writing a JavaCC parser for a character stream like this:

Abc \(Def\) Gh (Ij; Kl); Mno (Pqr)

and it should be tokenized like this:

  1. Abc \(Def\) Gh
  2. LPAREN
  3. Ij
  4. SEMICOLON
  5. Kl
  6. RPAREN
  7. SEMICOLON
  8. Mno
  9. LPAREN
  10. Pqr
  11. RPAREN

The current token definition is

TOKEN:
{
    < WORDCHAR : (~[";", "(", ")"])+ >
  | < LPAREN : "(" >
  | < RPAREN : ")" >
  | < SEMICOLON : ";" >
}

How should I change the WORDCHAR token so that it includes backslash-escaped parentheses, but not parentheses without a leading backslash?

Best answer:

TOKEN:
{
    < WORDCHAR : (~[";", "(", ")"] | "\\(" | "\\)")+ >
  | < LPAREN : "(" >
  | < RPAREN : ")" >
  | < SEMICOLON : ";" >
}
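
The only change from the question's grammar is the two extra alternatives, "\\(" and "\\)". Each is a JavaCC string literal for a backslash followed by a parenthesis. Because the JavaCC token manager takes the longest match, an escaped parenthesis is absorbed into WORDCHAR, while a bare one still ends it. As a rough way to check that logic outside JavaCC, the sketch below re-creates the same rules with java.util.regex. It is not JavaCC-generated code: the class name EscapedParenTokenizer and the tokenize helper are hypothetical, and trimming whitespace around words is an assumption made only to match the listing in the question (the WORDCHAR token itself keeps surrounding spaces).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that imitates the four token rules with a plain regex.
final class EscapedParenTokenizer {
    // WORDCHAR: one or more of either an escaped parenthesis ("\(" or "\)")
    // or any single character other than ';', '(' and ')'.
    // The escaped pair is listed first so the regex engine consumes it whole.
    private static final Pattern TOKENS = Pattern.compile(
          "(?<WORDCHAR>(?:\\\\[()]|[^;()])+)"
        + "|(?<LPAREN>\\()"
        + "|(?<RPAREN>\\))"
        + "|(?<SEMICOLON>;)");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKENS.matcher(input);
        while (m.find()) {
            String word = m.group("WORDCHAR");
            if (word != null) {
                // Assumption: trim and drop blank words to mirror the question's listing.
                String trimmed = word.trim();
                if (!trimmed.isEmpty()) {
                    tokens.add(trimmed);
                }
            } else if (m.group("LPAREN") != null) {
                tokens.add("LPAREN");
            } else if (m.group("RPAREN") != null) {
                tokens.add("RPAREN");
            } else {
                tokens.add("SEMICOLON");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The sample stream from the question; backslashes are doubled for Java.
        String input = "Abc \\(Def\\) Gh (Ij; Kl); Mno (Pqr)";
        for (String token : tokenize(input)) {
            System.out.println(token);
        }
    }
}
```

If a literal backslash must also be escapable (for example "\\" directly before a real parenthesis), the character class would need to exclude the backslash and a general escape alternative would be required; that case is outside what the question asks for.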