Are there better ways to require that Ragel consume all of the input? Here is what I'm using now:
=begin
%%{
machine my_lexer;
# ...
# extract tokens and store into `tokens`
# ...
}%%
=end
class MyLexer
  %% write data;

  def self.run(string)
    data = string.unpack("c*")
    eof = data.length
    tokens = []
    %% write init;
    %% write exec;
    data.length == p ? tokens : nil
  end
end
Most of the above is boilerplate, except for the `data.length == p` test. It works, except that it doesn't verify that the lexer ended in a final state. As a result, I have test cases that return tokens even though the entire input was not successfully parsed.
Is there a better way?
(Testing for the final state directly might work better. I'm looking into how to do that. Ideas?)
I'm only starting out with Ragel, but you may want to look at EOF actions and error actions. EOF actions run when the input ends; error actions run when the next character matches no transition out of the current state.
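On testing the final state directly: as far as I understand it, `%% write data;` also emits a `my_lexer_first_final` value, and Ragel numbers final states last, so after `%% write exec;` the check `cs >= my_lexer_first_final` should tell you whether the machine stopped in a final state. The sketch below is a hand-written stand-in, not Ragel output, that shows why the length test alone is not enough. `TinyNumberMachine`, its state numbers, and the `step` helper are made up for illustration; it accepts `digit+ ('.' digit+)?`.

```ruby
# Hypothetical hand-rolled machine (NOT generated by Ragel) for: digit+ ('.' digit+)?
# States: 0 = error, 1 = start, 2 = just saw '.', 3 = integer part, 4 = fraction part.
# Final states are numbered last, so "cs >= FIRST_FINAL" means "in a final state".
module TinyNumberMachine
  ERROR = 0
  START = 1
  FIRST_FINAL = 3

  # Return the next state for the current state and one input byte.
  def self.step(cs, byte)
    digit = byte.between?(48, 57) # '0'..'9'
    dot = (byte == 46)            # '.'
    case cs
    when 1 then digit ? 3 : ERROR
    when 2 then digit ? 4 : ERROR
    when 3 then digit ? 3 : (dot ? 2 : ERROR)
    when 4 then digit ? 4 : ERROR
    else ERROR
    end
  end

  # Return the input if it is a complete number, otherwise nil.
  def self.run(string)
    data = string.unpack("c*")
    p = 0
    cs = START
    while p < data.length
      cs = step(cs, data[p])
      break if cs == ERROR # stop without consuming the offending byte
      p += 1
    end
    consumed_all = (p == data.length)
    in_final = (cs >= FIRST_FINAL)
    (consumed_all && in_final) ? string : nil
  end
end
```

Here `TinyNumberMachine.run("12.")` consumes every byte but stops in state 2, which is not final, so it returns nil; a check on `p` alone would have accepted it. The empty string is rejected for the same reason, because the machine never leaves the start state.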