I need to compute perplexity, and I am trying to do it with:
    def get_perplexity(test_set, model):
        perplexity = 1
        n = 0
        for word in test_set:
            n += 1
            perplexity = perplexity * 1 / get_prob(model, word)
        perplexity = pow(perplexity, 1/float(n))
        return perplexity
After some steps my perplexity becomes equal to infinity.
I need to get a finite number so that, as the last step, I can do pow(perplexity, 1/float(n)).
Is there any way to multiply numbers like these and still get a result?
3.887311155784627e+243
8.311806360146177e+250
1.7707049372801292e+263
1.690802669602979e+271
3.843294667766984e+278
5.954424789834101e+290
8.859529887856071e+295
7.649470766862909e+306
The repeated multiplication is going to overflow: a Python float is a 64-bit double with a maximum of roughly 1.8e308, and your running product is already at 7.6e306, so one or two more factors push it to inf. I propose you translate this into log-space and use summation rather than multiplication.
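Something along these lines should work as a minimal sketch. It keeps your get_perplexity(test_set, model) signature; the get_prob below is only a hypothetical stand-in (a dictionary lookup) so that the snippet runs on its own, and you would use your real one instead. It also assumes every probability is strictly positive and that test_set is not empty, since math.log(0) raises ValueError and an empty set would divide by zero.

```python
import math


def get_prob(model, word):
    # Hypothetical stand-in for the question's get_prob: here the "model"
    # is assumed to be a plain dict mapping word -> probability.
    return model[word]


def get_perplexity(test_set, model):
    log_sum = 0.0
    n = 0
    for word in test_set:
        n += 1
        # log(1/p) == -log(p), so adding -log(p) replaces multiplying by 1/p
        log_sum -= math.log(get_prob(model, word))
    # exp(log_sum / n) equals (product of 1/p) ** (1/n), but the huge
    # intermediate product is never formed
    return math.exp(log_sum / n)
```

The running value is now a sum that grows linearly with the number of words, so it stays far away from the overflow limit even for very long test sets.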
This way all your logarithms can be represented in the standard number of bits, and you don't get numerical blowups and loss of precision. Also, you can get an arbitrary degree of precision by using the decimal module.
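For the decimal route, a rough sketch could look like the following. The function name get_perplexity_decimal, the prec parameter and its default of 50 digits are my own choices for illustration, and get_prob is again a hypothetical dictionary lookup standing in for yours. Decimal has a much larger exponent range than float, so the direct product from your original code no longer overflows for inputs of this size.

```python
from decimal import Decimal, localcontext


def get_prob(model, word):
    # Hypothetical stand-in for the question's get_prob: the "model" is
    # assumed to be a plain dict mapping word -> probability.
    return model[word]


def get_perplexity_decimal(test_set, model, prec=50):
    # prec is the number of significant digits; 50 is an arbitrary choice
    with localcontext() as ctx:
        ctx.prec = prec
        perplexity = Decimal(1)
        n = 0
        for word in test_set:
            n += 1
            # Same product as in the question, but held in a Decimal
            perplexity /= Decimal(get_prob(model, word))
        # n-th root of the product, as in pow(perplexity, 1/float(n))
        return perplexity ** (Decimal(1) / Decimal(n))
```

This returns a Decimal; wrap the result in float() if you need an ordinary number at the end. It is slower than the log-space version, so I would only reach for it if you really need the extra digits.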