Chunkwise memmap updates causing exploding memory usage only for last chunk


I am generating random predictions to establish baseline F1 scores and confirm them against the calculated scores. Because the arrays are too large to hold in RAM, I am building them chunkwise with numpy memmaps:

    import numpy as np

    c, size = 120, 10**5
    rng = np.memmap('combs.npy', dtype='u1', mode='w+', shape=(c * (c - 1) // 2,))  # not used below
    # All unordered pairs (i, j) with i < j; shape (n, 2), n = c*(c-1)//2 = 7140
    combinations = np.array([(i, j) for i in range(c) for j in range(i + 1, c)])
    n = combinations.shape[0]
    true = np.memmap('true_before.npy', dtype='u1', mode='w+', shape=(n * size, 2))
    pred = np.memmap('pred_before.npy', dtype='u1', mode='w+', shape=(n * size, 2))
    chunk = 10**3
    # Fill 'true' chunkwise: each pair is repeated 'chunk' times per chunk
    for i in range(size // chunk):
        true[(n * chunk * i):(n * chunk * (i + 1))] = np.repeat(combinations, chunk, axis=0)
    # Fill 'pred' chunkwise with n*chunk randomly chosen pairs
    for i in range(size // chunk):
        pred[(n * chunk * i):(n * chunk * (i + 1))] = combinations[np.random.choice(n, n * chunk)]

The code runs at roughly 2 GB of RAM usage until it reaches the last chunk, at which point memory usage explodes and the process dies with an out-of-memory error. This doesn't make sense to me, since RAM usage is constant for all previous chunks. Is this happening due to some caveat of Python loops?
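For reference, here is a sketch of the second loop instrumented to print the resident set size after each chunk; the psutil call and the explicit `flush()` are additions of mine for diagnosis, not part of the original run:

    # Hypothetical instrumented variant of the 'pred' loop, using the
    # third-party psutil package to report current RSS per chunk.
    import numpy as np
    import psutil

    proc = psutil.Process()
    for i in range(size // chunk):
        pred[(n * chunk * i):(n * chunk * (i + 1))] = combinations[np.random.choice(n, n * chunk)]
        pred.flush()  # write dirty pages back to disk after each chunk
        print(f"chunk {i}: rss = {proc.memory_info().rss / 2**20:.0f} MiB")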
