Python - hmmlearn - Negative transmat


I'm trying to fit a model with hmmlearn, given a transition matrix and an emission matrix a priori. After fitting, some values in the transition matrix come out negative.

The transition matrix is taken from the transition matrix of a previously fitted model.

An example of what I mean:

>>> model
GaussianHMM(algorithm='viterbi', covariance_type='diag', covars_prior=0.01,
      covars_weight=1, init_params='stmc', means_prior=0, means_weight=0,
      n_components=3, n_iter=100, params='stmc', random_state=123,
      startprob_prior=1.0, tol=0.5, transmat_prior=1.0, verbose=True)
>>> model.transmat_
array([[  9.95946216e-01,   2.06359396e-21,   4.05378401e-03],
       [  2.05184679e-21,   9.98355526e-01,   1.64447392e-03],
       [  3.86689326e-03,   1.96383373e-03,   9.94169273e-01]])
>>> new_model = hmm.GaussianHMM(n_components=model.n_components,
...                             random_state=123,
...                             init_params="mcs",
...                             transmat_prior=model.transmat_)

>>> new_model.fit(train_features)
GaussianHMM(algorithm='viterbi', covariance_type='diag', covars_prior=0.01,
      covars_weight=1, init_params='mcs', means_prior=0, means_weight=0,
      n_components=3, n_iter=10, params='stmc', random_state=123,
      startprob_prior=1.0, tol=0.01,
      transmat_prior=array([[  9.95946e-01,   2.06359e-21,   4.05378e-03],
       [  2.05185e-21,   9.98356e-01,   1.64447e-03],
       [  3.86689e-03,   1.96383e-03,   9.94169e-01]]),
      verbose=False)
>>> new_model.transmat_
array([[  9.98145253e-01,   1.86155258e-03,  -7.08313729e-06],
       [  2.16330448e-03,   9.93941859e-01,   3.89483667e-03],
       [ -5.44842863e-06,   3.52862069e-03,   9.96478546e-01]])
>>> 

The training data are the same in both fits. If I instead pass a prior for the emission parameters, for example, and not for the transition matrix, it works correctly. I'm using Anaconda 2.5 (64-bit); the hmmlearn version is 0.2.0.

Any hints? Thanks.

Best answer:

tl;dr: ensure transmat_prior is >= 1.

The EM algorithm for hidden Markov models is derived using state indicator variables z, which hold the state of the Markov chain at each time step t. Conditioned on the previous state z[t - 1], z[t] follows a Categorical distribution whose parameters are given by the row of the transition probability matrix corresponding to z[t - 1].
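
For concreteness, here is a minimal sketch of how such indicator variables are generated (the 2-state transition matrix is made up for illustration, not taken from the question):

import numpy as np

rng = np.random.default_rng(0)
transmat = np.array([[0.9, 0.1],    # assumed 2-state chain
                     [0.2, 0.8]])

T = 10
z = np.empty(T, dtype=int)
z[0] = 0                            # fix the initial state for simplicity
for t in range(1, T):
    # z[t] | z[t - 1] ~ Categorical(transmat[z[t - 1]])
    z[t] = rng.choice(len(transmat), p=transmat[z[t - 1]])
print(z)                            # one sample path of the chain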

hmmlearn implements MAP learning of hidden Markov models, which means that each model parameter has a prior distribution. Specifically, each row of the transition matrix is assumed to follow a symmetric Dirichlet distribution with parameter transmat_prior. The choice of prior is not arbitrary: the Dirichlet distribution is conjugate to the Categorical, which gives rise to a simple update rule in the M-step of the EM algorithm:

transmat[i, j] = (transmat_prior[i, j] - 1.0 + stats["trans"][i, j]) / normalizer

where stats["trans"][i, j] is the expected number of transitions from state i to state j.

From the update rule it is clear that transition probabilities can become negative if (a) transmat_prior is < 1 for some i and j, and (b) the expectation stats["trans"][i, j] is not large enough to compensate.
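
Here is a small numerical sketch of both conditions at once: made-up expected counts (not from any real fit) plugged into the update rule above, with a symmetric prior of 0.5.

import numpy as np

transmat_prior = 0.5                  # symmetric Dirichlet parameter < 1
trans = np.array([[50.0, 0.2],        # hypothetical stats["trans"]: very few
                  [0.3, 40.0]])       # expected 0 -> 1 and 1 -> 0 transitions

numer = transmat_prior - 1.0 + trans  # numerator of the update rule
transmat = numer / numer.sum(axis=1, keepdims=True)
print(transmat)
# approximately:
# [[ 1.0061 -0.0061]                  # off-diagonal entries went negative
#  [-0.0051  1.0051]]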

This is a known issue in MAP estimation of the Categorical distribution, and the general advice is to require transmat_prior >= 1 for all states.
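
With that in mind, one way to reuse the learned matrix as a prior in the question's code is to shift it so that every Dirichlet parameter is at least 1 (model and train_features are the variables from the question):

from hmmlearn import hmm

# Adding 1.0 to every entry keeps the learned structure while
# guaranteeing that the M-step numerator stays non-negative.
safe_prior = 1.0 + model.transmat_

new_model = hmm.GaussianHMM(n_components=model.n_components,
                            random_state=123,
                            init_params="mcs",
                            transmat_prior=safe_prior)
new_model.fit(train_features)
assert (new_model.transmat_ >= 0).all()  # no negative probabilities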