How to put a numpy array entirely in RAM using numpy memmap?


I would like to use a memmap-allocated numpy array that can be processed in parallel with joblib, i.e. shared memory between different processes. But I also want the big array to live entirely in RAM, to avoid the disk reads/writes that memmap does. I have enough RAM to hold the whole array, but using np.zeros() instead of memmap complicates parallelization, since np.zeros() allocates memory local to a single process. How do I achieve my goal?

Example:

import os
import numpy as np

x_memmap = os.path.join(folder, 'x_memmap')
x_shared = np.memmap(x_memmap, dtype=np.float32, shape=(100000, 8, 8, 32), mode='w+')

Later:

n = (N + number_of_cores - 1) // number_of_cores  # ceiling division, so the last slice covers the tail
slices = [slice(id * n, min(N, (id + 1) * n)) for id in range(number_of_cores)]
Parallel(n_jobs=number_of_cores)(delayed(my_job)(x_shared[sl, :]) for sl in slices)

If I instead allocate x_shared with np.zeros as shown below, I can't use this parallelization, because each worker process would end up with its own private copy of the data.

x_shared = np.zeros(dtype=np.float32, shape=(100000, 8, 8, 32))
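One possible approach, sketched below: keep the np.memmap call, but point the backing file at a tmpfs mount such as /dev/shm (standard on Linux), so the "file" lives entirely in RAM and never touches disk. This is a minimal, self-contained illustration, not a definitive answer: it uses the stdlib ProcessPoolExecutor in place of joblib (the same pattern applies to Parallel/delayed), a much smaller array, and made-up names (`worker`, `N`, `CORES`); each worker re-opens the memmap by filename, and the OS page cache makes the underlying pages shared.

```python
import os
import tempfile
import numpy as np
from concurrent.futures import ProcessPoolExecutor

# Small illustrative sizes; the question uses shape=(100000, 8, 8, 32).
N, CORES = 1000, 4

def worker(args):
    # Each process re-opens the same backing file; since all processes map
    # the same file, the OS shares the pages between them.
    path, sl = args
    x = np.memmap(path, dtype=np.float32, mode="r+", shape=(N, 8))
    return float(x[sl].sum())

def main():
    # Prefer a tmpfs mount (RAM-backed filesystem) if available, so the
    # memmap file never hits disk; otherwise fall back to a temp dir.
    folder = "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()
    path = os.path.join(folder, "x_memmap_demo")

    x = np.memmap(path, dtype=np.float32, mode="w+", shape=(N, 8))
    x[:] = 1.0
    x.flush()

    n = (N + CORES - 1) // CORES  # ceiling division so every row is covered
    slices = [slice(i * n, min(N, (i + 1) * n)) for i in range(CORES)]
    with ProcessPoolExecutor(max_workers=CORES) as ex:
        totals = list(ex.map(worker, [(path, sl) for sl in slices]))

    os.remove(path)
    return sum(totals)

if __name__ == "__main__":
    print(main())
```

Since every element is 1.0, the partial sums should add up to N * 8 regardless of how the slices are split across workers.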