I have implemented a naive Viterbi algorithm for a HMM based signal that I observe. The execution time of the decoder seems to be too slow for my requirement. I'm now trying to understand how to speed up the execution. When I'm to determine the computational complexity of the algorithm, I see that it's mentioned as having complexity of t * s^2, where t is number of observations and s is the number of states. I have roughly, 3500 states, and 100 observations. Each state has 729 emission probabilities.
I also see that it's mentioned in this paper, that Viterbi decoding is exponential in this paper (2^k, where k is the constraint length). I'm not understanding this explanation that well. But, I believe if Viterbi is exponential with regarding to states, then surely the algorithm would be very slow, even though I parallelize it.
My questions are:
- What is the complexity of the Viterbi algorithm/decoding? Are they the same in both instances?
- How do I make modifications to the Viterbi algorithm to speed it up?
EDITS: I'm implementing it in C++, hoping to modify it and parallelize it in the future.
To answer the first question:
If you have t observations, s states, and each state has e emission probabilities, then the trellis will have
t*s
nodes, and to evaluate each node will cost e operations, so the overall complexity of a naive implementation will beO(t*s*e)
.Viterbi decoding can be used to decode sequences of bits. If the observation depends on the previous k binary bits, then the number of different sequences of k bits is 2^k. This represents the number s of states you would need to do a stream decoding (each state represents one configuration of previous bits). However, this is unlikely to be relevant to you.
The paper you link to describes an approach which reduces the number of nodes which need to be expanded. This will not improve the worst case complexity, but may well give significant improvements in typical use depending on the nature of your specific problem.