I have this parser:
class Parser
  %%{
    machine test_lexer;
    action s { s = p; puts "s#{p}" }
    action e { e = p; puts "e#{p}" }
    action captured {
      puts "captured #{s} #{e}"
    }
    key_value = "a" %s ("b" | "x" "c")+ %e %captured;
    tags = ("x"+)? key_value;
    main := tags*;
  }%%
  def initialize(data)
    data = data
    eof = data.length
    %% write data;
    %% write init;
    %% write exec;
  end
end
Parser.new(ARGV.first)
When I run it with "abxc", why does it call the captured action (and the e action) twice, and how can I prevent this?
ragel -R simple.rl && ruby simple.rb "abxc"
s1
e2
captured 1 2
e4
captured 1 4
On GitHub: https://github.com/grosser/ragel_example
Here is the diagram for your machine, BTW: http://bit.do/stackoverflow-19621544 (created with Erdos).
With "abxc" the ("b" | "x" "c")+ machine first matches the "b" and then the "xc". When transitioning from "b" (to "x") it calls the leaving actions (e and captured) for the first time, and when transitioning from "xc" (to EOF) it calls the leaving actions (e and captured) for the second time.
I guess the e action is supposed to set the end pointer in order to capture the string between start s and end e. If so, then Ragel calling the e action multiple times isn't really a problem, you just advance the end pointer like you already do.
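To illustrate the "just advance the end pointer" idea, here is a minimal plain-Ruby sketch that replays the action calls from the output above (s1, e2, e4). It is not Ragel code: SpanTracker and its methods are hypothetical names made up for this example, and where exactly you would call flush from your grammar (for instance from an EOF action) is an assumption you would have to adapt. The point is only that the last e call wins and the capture is reported once.

```ruby
# Hypothetical helper, not part of Ragel: it records start/end marks and
# reports a span only when the caller says the match can no longer grow.
class SpanTracker
  attr_reader :captures

  def initialize(data)
    @data = data
    @start = nil
    @stop = nil
    @captures = []
  end

  # Mirrors the `s` action: remember where the value starts.
  # A new start means any pending span is final, so emit it first.
  def mark_start(pos)
    flush
    @start = pos
  end

  # Mirrors the `e` action: may be called several times, the last call wins.
  def mark_end(pos)
    @stop = pos
  end

  # Call at a point where the match is definitely over (assumed: end of input).
  def flush
    return if @start.nil? || @stop.nil?
    @captures << @data[@start...@stop]
    @start = nil
    @stop = nil
  end
end

tracker = SpanTracker.new("abxc")
tracker.mark_start(1) # s1
tracker.mark_end(2)   # e2, fired when leaving "b"
tracker.mark_end(4)   # e4, fired when leaving "xc" at EOF
tracker.flush         # end of input: report the span once
p tracker.captures
```

With this approach the e action can fire as often as Ragel likes; only the deferred flush decides when a capture is actually emitted.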