I'm having performance problems with a Viterbi log-odds computation in Matlab.
The problem is that the algorithm seems to require nested loops, which slow the code down a lot. This is the expensive part:
for i = 1:input_len
    for j = 1:num_states
        v_m = emission_value + max_over_3_elements; % V_M
        v_i = max_over_2_elements;                  % V_I
        v_d = max_over_2_elements;                  % V_D
    end
end
I believe I'm not the first to implement Viterbi for profile HMMs, so maybe you have some advice. I also looked at Matlab's own hmmviterbi, but found no revelations there (it also uses nested loops). I also tried replacing max with some primitive operations, but there was no noticeable difference (it was actually a little slower).
Unfortunately, loops are simply slow in Matlab (this improves with more recent versions), and I don't think this can be easily vectorized or parallelized, because the operations inside the loops are not independent of other iterations.
This seems like a task for MEX - it should not be too much work to write this in C and the expected speedup is probably quite large.
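As a rough illustration, here is a minimal sketch of what the C kernel might look like. It assumes the standard profile-HMM recurrences in log-odds form, with insert emissions scoring zero. The names `Trans`, `idx` and `viterbi_fill` are made up for this example, and the transition scores are taken to be position-independent purely to keep it short (a real profile HMM has one set per state). Arrays are column-major, as Matlab stores them, so a `mexFunction` gateway could pass the data pointers straight through; that gateway is not shown here.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical, position-independent transition log-odds scores. */
typedef struct {
    double mm, im, dm;  /* into a match state from M, I, D  */
    double mi, ii;      /* into an insert state from M, I   */
    double md, dd;      /* into a delete state from M, D    */
} Trans;

static double max2(double a, double b) { return a > b ? a : b; }
static double max3(double a, double b, double c) { return max2(max2(a, b), c); }

/* Column-major index into an (n+1)-by-(m+1) table. */
static size_t idx(int i, int j, int n)
{
    return (size_t)i + (size_t)(n + 1) * (size_t)j;
}

/* Fills vm, vi, vd (each (n+1)*(m+1) doubles) and returns the best
 * score in the final cell.
 * n  = input length, m = number of match states.
 * em = n*m match emission log-odds, column-major: em[(i-1) + n*(j-1)]. */
double viterbi_fill(int n, int m, const double *em, Trans t,
                    double *vm, double *vi, double *vd)
{
    size_t cells = (size_t)(n + 1) * (size_t)(m + 1);
    for (size_t k = 0; k < cells; ++k)
        vm[k] = vi[k] = vd[k] = -INFINITY;
    vm[idx(0, 0, n)] = 0.0;  /* begin state */

    for (int i = 0; i <= n; ++i) {
        for (int j = 0; j <= m; ++j) {
            size_t c = idx(i, j, n);
            if (i > 0 && j > 0) {  /* V_M: comes from cell (i-1, j-1) */
                size_t p = idx(i - 1, j - 1, n);
                vm[c] = em[(size_t)(i - 1) + (size_t)n * (size_t)(j - 1)]
                      + max3(vm[p] + t.mm, vi[p] + t.im, vd[p] + t.dm);
            }
            if (i > 0) {           /* V_I: comes from cell (i-1, j) */
                size_t p = idx(i - 1, j, n);
                vi[c] = max2(vm[p] + t.mi, vi[p] + t.ii);
            }
            if (j > 0) {           /* V_D: comes from cell (i, j-1) */
                size_t p = idx(i, j - 1, n);
                vd[c] = max2(vm[p] + t.md, vd[p] + t.dd);
            }
        }
    }
    size_t e = idx(n, m, n);
    return max3(vm[e], vi[e], vd[e]);
}
```

Note that V_D at (i, j) reads the cell (i, j-1) from the same row, which is why the inner loop cannot simply be replaced by a vectorized operation over j.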