I'm trying to write an interpreter for LOLCODE that reads escaped strings from a file in the form:
VISIBLE "HAI \" WORLD!"
For which I wish to show an output of:
HAI " WORLD!
I have tried dynamically generating a format string for printf to do this, but it seems that escape sequences are only processed at compile time, when a string literal is declared.
In essence, what I am looking for is exactly the opposite of this question: Convert characters in a c string to their escape sequences
Is there any way to go about this?
It's a pretty standard scanning exercise. Depending on how close you intend to be to the LOLCODE specification (which I can't seem to reach right now, so this is from memory), you've got a few ways to go.
Write a lexer by hand
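It's not as hard as it sounds. You just want to analyze your input one character at a time, while maintaining a bit of context information. In your case, the important context consists of two flags:
One records that you are currently inside a string. It is set when reading the opening " and cleared when reading the closing ".
The other records that the previous character started an escape sequence. It is set when reading \ and cleared when reading the character after that, no matter what it is.
The general algorithm then follows from the flags. For each input character: if inEscape is set, output the character as-is and clear inEscape. Otherwise, if you are inside a string and the character is \, set inEscape and output nothing. Otherwise, if the character is ", flip the in-string flag. Otherwise, output the character if you are inside a string.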
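The two-flag approach described above could be sketched in C roughly as follows. This is only a minimal illustration, not the code of any existing interpreter: the function name unescape_string and the variables in_string and in_escape are made up for the example, it assumes backslash is the escape character as in the question, and it only extracts the first quoted string on a line.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copies the contents of the first double-quoted
 * string found in `src` into `dst`, resolving backslash escapes.
 * Returns the number of characters written (excluding the terminator),
 * or -1 if no terminated string is found or `dst` is too small.
 * On failure the contents of `dst` are unspecified.
 */
static int unescape_string(const char *src, char *dst, size_t dst_size)
{
    int in_string = 0;   /* set on the opening quote */
    int in_escape = 0;   /* set on a backslash, cleared after the next char */
    size_t n = 0;

    if (dst_size == 0)
        return -1;

    for (const char *p = src; *p != '\0'; ++p) {
        char c = *p;

        if (!in_string) {
            if (c == '"')
                in_string = 1;   /* opening quote: start collecting */
            continue;            /* anything outside a string is skipped */
        }

        if (in_escape) {
            in_escape = 0;       /* escaped character is taken literally */
        } else if (c == '\\') {
            in_escape = 1;       /* remember the escape, emit nothing */
            continue;
        } else if (c == '"') {
            dst[n] = '\0';       /* closing quote: terminate and stop */
            return (int)n;
        }

        if (n + 1 >= dst_size)
            return -1;           /* no room for c plus the terminator */
        dst[n++] = c;
    }
    return -1;                   /* input ended without a closing quote */
}
```

Calling it on the line VISIBLE "HAI \" WORLD!" with a large enough buffer leaves HAI " WORLD! in the buffer, which can then be passed to printf("%s\n", ...) or puts. Using "%s" rather than the unescaped text itself as the format string avoids any trouble with % characters in the input.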
You might want to refine the inEscape case should you want to implement \r, \n and the like.
Use a lexer generator
The traditional tools here are lex and flex.
Get inspiration
You're not the first one to write a LOLCODE interpreter. There's nothing wrong with peeking at how the others did it. For example, here's the string parsing code from lci.