I'm writing a Ragel machine for a rather simple binary protocol. What I present here is an even more simplified version, without any error recovery whatsoever, just to demonstrate the problem I'm trying to solve.
So, the message to be parsed here looks like this:
<1 byte: length> <$length bytes: user data> <1 byte: checksum>
The machine looks as follows:
%%{
    machine my_machine;
    write data;
    alphtype unsigned char;
}%%
%%{
    action message_reset {
        /* TODO */
        data_received = 0;
    }
    action got_len {
        len = fc;
    }
    action got_data_byte {
        /* TODO */
    }
    action message_received {
        /* TODO */
    }
    action is_waiting_for_data {
        /* bare expression, no trailing semicolon: Ragel places it inside an if (...) */
        data_received++ < len
    }
    action is_checksum_correct {
        1/*TODO*/
    }

    len = (any);
    fmt_separate_len = (0x80 any);
    data = (any);
    checksum = (any);

    message =
    (
        # first byte: length of the data
        (len @got_len)
        # user data
        (data when is_waiting_for_data @got_data_byte )*
        # place higher priority on the previous machine (i.e. data)
        <:
        # last byte: checksum
        (checksum when is_checksum_correct @message_received)
    ) >to(message_reset)
    ;

    main := (msg_start: message)*;

    # Initialize and execute.
    write init;
    write exec;
}%%
As you see, first we receive 1 byte that represents the length; then we receive data bytes until we have the needed number of bytes (the check is done by is_waiting_for_data). When we receive the next (extra) byte, we check whether it is a correct checksum (by is_checksum_correct). If it is, the machine waits for the next message; otherwise, this particular machine stalls (I haven't included any error recovery here on purpose, in order to simplify the diagram).
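For comparison, the intended flow can be written by hand in plain C. This is only a sketch of the behaviour described above, not Ragel output. The names msg_parser, feed_byte and the two call counters are hypothetical, and is_checksum_correct is a stub that accepts everything, mirroring the TODO in the machine. The counters exist only to show how often each predicate is consulted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hand-written equivalent of the machine; not generated by Ragel. */
enum msg_state { WAIT_LEN, WAIT_DATA_OR_CHECKSUM };

struct msg_parser {
    enum msg_state state;
    uint8_t  len;             /* length byte of the current message */
    unsigned data_received;   /* data bytes consumed so far */
    unsigned waiting_calls;   /* how often is_waiting_for_data ran */
    unsigned checksum_calls;  /* how often is_checksum_correct ran */
    unsigned messages;        /* complete messages seen */
};

static int is_waiting_for_data(struct msg_parser *p)
{
    p->waiting_calls++;
    return p->data_received < p->len;
}

/* Stub, like the TODO in the machine: accepts every checksum byte. */
static int is_checksum_correct(struct msg_parser *p, uint8_t byte)
{
    (void)byte;
    p->checksum_calls++;
    return 1;
}

/* Feed one byte. Returns 0 on success, -1 if the checksum is rejected. */
static int feed_byte(struct msg_parser *p, uint8_t byte)
{
    assert(p != NULL);
    switch (p->state) {
    case WAIT_LEN:
        p->len = byte;
        p->data_received = 0;
        p->state = WAIT_DATA_OR_CHECKSUM;
        return 0;
    case WAIT_DATA_OR_CHECKSUM:
        if (is_waiting_for_data(p)) {
            p->data_received++;       /* still inside the user data */
            return 0;
        }
        /* Only reached once per message, on the byte after the data. */
        if (!is_checksum_correct(p, byte))
            return -1;
        p->messages++;
        p->state = WAIT_LEN;
        return 0;
    }
    return -1;
}
```

In this sketch the checksum predicate runs once per message, while the length predicate runs once per data byte plus once for the checksum byte.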
The diagram of it looks like this:
$ ragel -Vp ./msg.rl | dot -Tpng -o msg.png
As you see, in state 1, while we are receiving user data, the conditions are as follows:
0..255(is_waiting_for_data, !is_checksum_correct),
0..255(is_waiting_for_data, is_checksum_correct)
So on every data byte it redundantly calls is_checksum_correct, although the result doesn't matter at all.
The condition should be as simple as: 0..255(is_waiting_for_data)
How can I achieve that?
How is is_checksum_correct supposed to work? The when condition happens before the checksum is read, according to what you posted. My suggestion would be to check the checksum inside message_received and handle any error there. That way, you can get rid of the second when and the problem would no longer exist.

It looks like semantic conditions are a relatively new feature in Ragel, and while they look really useful, maybe they're not quite mature enough yet if you want optimal code.
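A minimal sketch of that suggestion, with the action bodies written as plain C functions so they can be compiled on their own. Several things here are assumptions, not part of the original post: the checksum algorithm (sum of the data bytes modulo 256), the struct msg_ctx, and the two counters. The byte parameter stands in for fc.

```c
#include <stdint.h>

/* Hypothetical state shared by the actions; all names are illustrative. */
struct msg_ctx {
    uint8_t  len;
    unsigned data_received;
    uint8_t  running_sum;    /* ASSUMPTION: checksum = sum of data bytes mod 256 */
    unsigned ok_messages;
    unsigned bad_checksums;
};

/* Body for action message_reset. */
static void message_reset(struct msg_ctx *c)
{
    c->data_received = 0;
    c->running_sum = 0;
}

/* Body for action got_len. */
static void got_len(struct msg_ctx *c, uint8_t byte)
{
    c->len = byte;
}

/* The only remaining condition: are we still inside the user data? */
static int is_waiting_for_data(struct msg_ctx *c)
{
    return c->data_received++ < c->len;
}

/* Body for action got_data_byte: accumulate the assumed checksum. */
static void got_data_byte(struct msg_ctx *c, uint8_t byte)
{
    c->running_sum = (uint8_t)(c->running_sum + byte);
}

/* Body for action message_received: runs once, on the checksum byte itself. */
static void message_received(struct msg_ctx *c, uint8_t byte)
{
    if (byte == c->running_sum) {
        c->ok_messages++;      /* deliver the message here */
    } else {
        c->bad_checksums++;    /* error handling goes here */
    }
}
```

In the .rl file, each body would go into the action of the same name, and the last term of message would become (checksum @message_received), with no when attached to it. I have not verified what diagram Ragel produces for that variant.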