I'm having trouble finding a good example of how to handle strings in ocamllex. The desktop calculator example is somewhat useful, but I haven't found a way to extend it so that it also handles strings. Here is the example I'm referencing:
{
open Parser (* The type token is defined in parser.mli *)
exception Eof
}
rule token = parse
[' ' '\t'] { token lexbuf } (* skip blanks *)
| ['\n' ] { EOL }
| ['0'-'9']+ as lxm { INT(int_of_string lxm) }
| '+' { PLUS }
| '-' { MINUS }
| '*' { TIMES }
| '/' { DIV }
| '(' { LPAREN }
| ')' { RPAREN }
| eof { raise Eof }
Any help would be greatly appreciated.
I assume you're talking about double-quoted strings as in OCaml. The difficulty in lexing strings is that they need an escape mechanism so that a string can contain the quote character itself (and, usually, the escape character too).
Here is a simplified version of the code for strings from the OCaml lexer itself:
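As a rough sketch of the approach (this is not the actual OCaml lexer source), here is the calculator lexer from the question extended with a string rule. It assumes parser.mly also declares `%token <string> STRING`. The `Unterminated_string` exception, the small set of escapes, and the choice to keep an unknown escape's backslash literally are all assumptions made for illustration.

```ocaml
{
open Parser  (* assumption: parser.mly also declares  %token <string> STRING *)
exception Eof
exception Unterminated_string  (* hypothetical, for this sketch only *)
}

rule token = parse
    [' ' '\t']          { token lexbuf }  (* skip blanks *)
  | ['\n']              { EOL }
  | ['0'-'9']+ as lxm   { INT (int_of_string lxm) }
  | '+'                 { PLUS }
  | '-'                 { MINUS }
  | '*'                 { TIMES }
  | '/'                 { DIV }
  | '('                 { LPAREN }
  | ')'                 { RPAREN }
  (* opening quote: hand control to the second scanner *)
  | '"'                 { STRING (string (Buffer.create 16) lexbuf) }
  | eof                 { raise Eof }

(* Second scanner: runs after the opening quote has been consumed,
   accumulates characters in buf, and returns the string contents. *)
and string buf = parse
    '"'                 { Buffer.contents buf }  (* closing quote *)
  | '\\' 'n'            { Buffer.add_char buf '\n'; string buf lexbuf }
  | '\\' 't'            { Buffer.add_char buf '\t'; string buf lexbuf }
  | '\\' (['\\' '"'] as c)
                        { Buffer.add_char buf c; string buf lexbuf }
  | [^ '"' '\\']+ as s  { Buffer.add_string buf s; string buf lexbuf }
  | eof                 { raise Unterminated_string }
  (* a backslash starting an unknown escape: kept as-is in this sketch *)
  | _ as c              { Buffer.add_char buf c; string buf lexbuf }
```

With these rules, the input `"a\"b"` should come out as a single STRING token whose contents are `a"b`. Because ocamllex picks the longest match, the two-character escape rules win over the single-character fallback at the end.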
Edit: The key feature of this code is that it uses a second scanner (named string) to do the lexical analysis within a quoted string. This generally keeps things cleaner than trying to write a single scanner for all tokens, since some tokens are quite complicated. A similar technique is often used for scanning comments.
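For comments, the same idea might look something like the sketch below, which assumes OCaml-style nestable `(* ... *)` comments. The `comment` rule, its `depth` argument, and the `Unterminated_comment` exception are hypothetical names for illustration, and only a few of the calculator tokens are shown.

```ocaml
{
open Parser
exception Eof
exception Unterminated_comment  (* hypothetical, for this sketch only *)
}

rule token = parse
    [' ' '\t']   { token lexbuf }                     (* skip blanks *)
  | "(*"         { comment 1 lexbuf; token lexbuf }   (* skip the comment, then resume *)
  | '('          { LPAREN }
  | '*'          { TIMES }
  | eof          { raise Eof }

(* Second scanner for comments: depth counts the currently open comments. *)
and comment depth = parse
    "(*"         { comment (depth + 1) lexbuf }       (* nested comment opens *)
  | "*)"         { if depth > 1 then comment (depth - 1) lexbuf }
  | eof          { raise Unterminated_comment }
  | _            { comment depth lexbuf }             (* ignore everything else *)
```

Note that `(*` is matched as a comment opener rather than as LPAREN followed by TIMES, because ocamllex prefers the longest match.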