Python: handling large numbers

138 Views Asked by Petr Petrov At 16 December 2018 at 17:53

I need to compute perplexity, and I am trying to do it with:

def get_perplexity(test_set, model):
    perplexity = 1
    n = 0
    for word in test_set:
        n += 1
        perplexity = perplexity * 1 / get_prob(model, word)
    perplexity = pow(perplexity, 1/float(n))
    return perplexity

After some steps, perplexity becomes infinity. I need a finite number so that, as the last step, I can compute pow(perplexity, 1/float(n)).

Is there any way to multiply numbers like these and still get a result?

3.887311155784627e+243
8.311806360146177e+250
1.7707049372801292e+263
1.690802669602979e+271
3.843294667766984e+278
5.954424789834101e+290
8.859529887856071e+295
7.649470766862909e+306


There are 2 best solutions below

alexgolec On 16 December 2018 at 18:15

The repeated multiplication causes numerical trouble: the running product grows past the largest value a float can represent (about 1.8e308) and overflows to infinity. I suggest moving the computation into log-space and using summation rather than multiplication:

import math

def get_perplexity(test_set, model):
    log_perplexity = 0
    n = 0
    for word in test_set:
        n += 1
        log_perplexity -= math.log(get_prob(model, word))
    log_perplexity /= float(n)
    return math.exp(log_perplexity)

This way every intermediate value stays well within the range of a standard float, so you avoid overflow and loss of precision. You can also work at an arbitrary precision by using the decimal module:

import decimal

def get_perplexity(test_set, model):
    with decimal.localcontext() as ctx:
        ctx.prec = 100  # set as appropriate
        log_perplexity = decimal.Decimal(0)
        n = 0
        for word in test_set:
            n += 1
            log_perplexity -= decimal.Decimal(get_prob(model, word)).ln()
        log_perplexity /= n  # divide by the int n; Decimal cannot be divided by a float
        return log_perplexity.exp()
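
To make the difference concrete, here is a minimal sketch that compares the naive product with the log-space sum. The probs list is made-up data standing in for get_prob(model, word), and the function names are only for illustration; math.fsum is used just to keep rounding error in the sum small.

```python
import math

def perplexity_naive(probs):
    # Multiply the reciprocals directly; the running product overflows
    # to inf once it exceeds the largest float (about 1.8e308).
    product = 1.0
    for p in probs:
        product *= 1.0 / p
    return product ** (1.0 / len(probs))

def perplexity_log(probs):
    # Sum negative log-probabilities instead; the sum stays small.
    neg_log_sum = math.fsum(-math.log(p) for p in probs)
    return math.exp(neg_log_sum / len(probs))

# Hypothetical data: 400 words, each with probability 0.001.
probs = [0.001] * 400
print(perplexity_naive(probs))  # inf
print(perplexity_log(probs))    # approximately 1000
```

Both functions compute the same quantity on paper; only the order of operations differs, and that is what keeps the second one finite.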

Ryabchenko Alexander On 16 December 2018 at 18:23

Since e+306 just means a factor of 10^306, you can write a class that stores the number in two parts, a float value and a separate power of ten:

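class BigPowerFloat:
    POWER_STEP = 100  # shift the exponent in steps of 10**100

    def __init__(self, input_value, power=0):
        self.value = float(input_value)  # kept at or below 10**POWER_STEP
        self.power = power               # decimal exponent stored separately
        self._move_to_power()

    def _move_to_power(self):
        # Move excess magnitude from value into power.
        while abs(self.value) > 10.0 ** self.POWER_STEP:
            self.value /= 10.0 ** self.POWER_STEP
            self.power += self.POWER_STEP
        # you can add a similar loop for very small values

    def __mul__(self, other):
        # Return a new object; normalising other first keeps the product finite.
        if not isinstance(other, BigPowerFloat):
            other = BigPowerFloat(other)
        return BigPowerFloat(self.value * other.value, self.power + other.power)

    # TODO: other operators (/, +, -, ...)

    def __str__(self):
        # The number represented is value * 10**power.
        return "{} * 10**{}".format(self.value, self.power)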
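
The same idea of splitting a number into a mantissa and a separately stored exponent can be sketched with math.frexp from the standard library, which works in powers of two rather than powers of ten. This is only an illustration: the helper names scaled_product and nth_root are made up, the input values are hypothetical, and all factors are assumed to be positive and finite.

```python
import math

def scaled_product(values):
    # Keep the product as mantissa * 2**exponent, renormalising after
    # every step so the mantissa stays in the range [0.5, 1).
    mantissa, exponent = 1.0, 0
    for v in values:
        m, e = math.frexp(v)  # v == m * 2**e
        mantissa, shift = math.frexp(mantissa * m)
        exponent += e + shift
    return mantissa, exponent

def nth_root(mantissa, exponent, n):
    # log(mantissa * 2**exponent) = log(mantissa) + exponent * log(2)
    log_value = math.log(mantissa) + exponent * math.log(2)
    return math.exp(log_value / n)

# Hypothetical inverse probabilities: 400 factors of 1000 (product is 1e1200).
m, e = scaled_product([1000.0] * 400)
print(nth_root(m, e, 400))  # approximately 1000
```

The exponent is a plain Python int, so it can grow without limit, while the mantissa never leaves the range a float handles comfortably.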
