Loss function returns NaN in join mode of the keras-contrib CRF layer


I use a BiLSTM-CRF architecture to assign labels to the sequence of sentences in a paper. The dataset has 150 papers, each containing 380 sentences. Each sentence is represented by a vector of 11 floating-point values in the range (0, 1), and there are 11 class labels.

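from keras.models import Model
from keras.layers import Input, Masking, Bidirectional, LSTM, Dropout, TimeDistributed, Dense
from keras_contrib.layers import CRF

# Each sentence is an 11-dimensional feature vector; the sequence length is variable.
inputs = Input(shape=(None, 11))
mask = Masking(mask_value=0)(inputs)  # timesteps whose features are all 0 are masked
lstm = Bidirectional(LSTM(50, return_sequences=True))(mask)
lstm = Dropout(0.3)(lstm)
lstm = TimeDistributed(Dense(50, activation="relu"))(lstm)
crf = CRF(11, sparse_target=False, learn_mode='join')  # CRF layer: 11 labels, one-hot targets
out = crf(lstm)
model = Model(inputs, out)
model.summary()
model.compile('adam', loss=crf.loss_function, metrics=[crf.accuracy])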

I use the keras-contrib package to implement the CRF layer. The CRF layer has two learning modes: join mode and marginal mode. As I understand it, join mode is a true CRF that uses the Viterbi algorithm to predict the best path, whereas marginal mode is not a true CRF and uses categorical cross-entropy as its loss function. When I use marginal mode, the output is as follows:
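To make the distinction between Viterbi decoding and per-step prediction concrete, here is a minimal pure-Python sketch. It is not the keras-contrib implementation; the function names (`viterbi_decode`, `greedy_decode`) and the toy scores are assumptions made up for illustration. It shows how transition scores can change the predicted path compared with taking the best label at each step independently.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_path, best_score) for a single sequence.

    emissions[t][k]   : score of label k at step t (hypothetical toy scores)
    transitions[i][j] : score of moving from label i to label j
    """
    n_labels = len(emissions[0])
    # score[k] = best score of any path that ends in label k at the current step
    score = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(n_labels):
            # Best previous label for arriving at label j
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_i)
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label
    last = max(range(n_labels), key=lambda k: score[k])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[last]


def greedy_decode(emissions):
    """Pick the highest-scoring label at each step, ignoring transitions."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in emissions]


# Toy example: two labels, three steps, transitions that penalize switching labels.
emissions = [[2.0, 1.0], [1.0, 1.5], [2.0, 1.0]]
transitions = [[0.0, -2.0], [-2.0, 0.0]]

print(greedy_decode(emissions))                # [0, 1, 0]
print(viterbi_decode(emissions, transitions))  # ([0, 0, 0], 5.0)
```

In this toy case the per-step choice switches labels in the middle, while the Viterbi path stays on label 0 because the switching penalty outweighs the small emission gain.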

Epoch 4/250:  - 6s - loss: 1.2289 - acc: 0.5657 - val_loss: 1.3459 - val_acc: 0.5262

But in join mode, the value of the loss function is NaN:

Epoch 2/250 :  - 5s - loss: nan - acc: 0.1880 - val_loss: nan - val_acc: 0.2120

I do not understand why this happens and would be grateful to anybody who could give a hint.
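One check that may help narrow this down is sketched below. It is only a data sanity check, not a diagnosis of the NaN: it looks for non-finite input values and for sequences in which every timestep would be masked by `Masking(mask_value=0)`, that is, where all features are 0. The helper name `check_sequences` and the toy data are hypothetical, and the input is assumed to be a nested list (for a NumPy array, `X.tolist()` gives that form).

```python
import math


def check_sequences(sequences, mask_value=0.0):
    """Return (indices with non-finite values, indices where every timestep is masked).

    sequences: nested list shaped [n_samples][n_timesteps][n_features].
    """
    non_finite, fully_masked = [], []
    for idx, seq in enumerate(sequences):
        # NaN or +/-inf anywhere in the sequence
        if any(not math.isfinite(v) for step in seq for v in step):
            non_finite.append(idx)
        # Every timestep consists only of mask_value, so the whole sequence is masked
        if all(all(v == mask_value for v in step) for step in seq):
            fully_masked.append(idx)
    return non_finite, fully_masked


# Hypothetical toy data: 3 sequences, 2 timesteps, 2 features
toy = [
    [[0.1, 0.9], [0.4, 0.2]],
    [[0.0, 0.0], [0.0, 0.0]],           # fully masked
    [[float("nan"), 0.5], [0.3, 0.3]],  # contains NaN
]
print(check_sequences(toy))  # ([2], [1])
```

If both lists come back empty for the real data, the inputs themselves are probably not the source of the problem, and the labels or the layer configuration would be the next thing to inspect.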
