Processing very large images as numpy arrays


I am working with images larger than 8 GB in .svs format. Using openslide, I have read them as 1D numpy arrays. To feed them into an algorithm, I need to reshape them into image form so that I can work with pixel-location information. Because the images are so large, converting the numpy array with PIL like this:

    import numpy as np
    from PIL import Image

    image = np.load('test.npy')
    im = Image.fromarray(image)

throws the error "size does not fit in int". I tried to work around this by changing the dtype from uint8 to uint64, but Python keeps crashing even though my workstation has 64 GB of RAM and 3 TB of memory.

Then I tried writing the numpy array out through a memmap:

    import os
    from os import path
    import numpy as np

    im = np.load(curr_path)
    shapeIm = im.shape  # shape of the image
    name_no_ext = os.path.splitext(f[i])[0]
    filename = path.join(dir, name_no_ext + '.tif')  # file name used to save the image
    # Create a memmap with dtype and shape that match the data:
    fp = np.memmap(filename, dtype='uint8', mode='w+', shape=shapeIm)  # disk-backed array for reading/writing large data in chunks
    # Write the data to the memmap array:
    fp[:] = im[:]
    fp.filename == path.abspath(filename)  # bare comparison; its result is discarded
    # Deletion flushes memory changes to disk before removing the object:
    del fp
    # Load the memmap again to verify that the data was stored:
    newfp = np.memmap(filename, dtype='uint8', mode='r+', shape=shapeIm)

The code above gives me a file in .tif format, but I cannot process it, and I could not work out why. When I tried to read that image and print its shape, I got:

AttributeError: 'NoneType' object has no attribute 'shape'
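
A possible explanation, offered as an assumption because the reading code is not shown: np.memmap writes the array's raw bytes with no file header, so naming the file .tif does not make it a TIFF. An image reader that returns None on failure (as cv2.imread does) would then produce exactly this AttributeError. The sketch below uses a tiny stand-in array and a hypothetical file name to show that the file holds only pixel bytes and can be read back with np.memmap once the dtype and shape are supplied.

```python
import os
import tempfile

import numpy as np

# Tiny stand-in for the real (height, width, channels) image.
shape = (4, 6, 3)
data = np.arange(np.prod(shape), dtype=np.uint8).reshape(shape)

# Hypothetical file name; the extension has no effect on the contents.
raw_path = os.path.join(tempfile.mkdtemp(), "image.raw")

# Write the pixels through a memmap, then flush and release it.
fp = np.memmap(raw_path, dtype=np.uint8, mode="w+", shape=shape)
fp[:] = data
fp.flush()
del fp

# The file size equals the number of uint8 values: there is no header.
raw_size = os.path.getsize(raw_path)

# A classic TIFF starts with b"II*\x00" or b"MM\x00*"; this file starts
# with the first pixel values instead.
with open(raw_path, "rb") as fh:
    first_bytes = fh.read(4)

# Reading it back works only because dtype and shape are given again.
restored = np.memmap(raw_path, dtype=np.uint8, mode="r", shape=shape)
```

If this is what happened, the file is still usable as a raw array through np.memmap, just not through an image reader.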

So this approach failed for me as well. Then I tried reshaping the numpy array to the shape of the image, which is (44331, 64625, 3), and I got the following error:

ValueError: sequence too large; cannot be greater than 32
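
The failing call is not shown, so the cause can only be guessed at: this message generally appears when something much longer than a shape (for example the data array itself) ends up where NumPy expects a shape. As a minimal sketch, assuming the flat array is stored row by row with the three channel values of each pixel next to each other, the reshape takes the shape as a single tuple. The small dimensions here are stand-ins for 44331, 64625 and 3.

```python
import numpy as np

# Stand-ins for the real dimensions (44331, 64625, 3).
height, width, channels = 4, 6, 3

# Flat 1D pixel buffer, assumed row-major with interleaved channels.
flat = np.arange(height * width * channels, dtype=np.uint8)

# Pass the target shape as one tuple of three integers.
image = flat.reshape((height, width, channels))

# The reshape of a contiguous array is a view, so no pixel data is copied.
shares_memory = np.shares_memory(flat, image)

# A pixel annotated at column x=5, row y=2 is indexed as image[y, x].
pixel = image[2, 5]
```

Because the result is a view, the reshape itself should not need extra memory even at full size.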

Can anyone help me process such an image? I have annotations for these images as x, y, z pixel locations, and to use these annotations as ground truth I need to convert my numpy array into the form of an image.

Any help would be great.

Edit: I have reshaping the numpy array working now, but I still do not know how to use numpy files as my dataset input instead of images.
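
One way this could be approached, sketched under the assumption that the algorithm can consume fixed-size tiles rather than the whole slide: open the .npy file with np.load(..., mmap_mode="r") so that only the slices actually indexed are read from disk, and iterate over tiles. The helper iter_tiles, the file name and the tile size are all hypothetical, and the small random array stands in for a real slide.

```python
import os
import tempfile

import numpy as np


def iter_tiles(array, tile_size):
    """Yield (y, x, tile) for non-overlapping tiles; edge tiles may be smaller."""
    height, width = array.shape[:2]
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            yield y, x, array[y:y + tile_size, x:x + tile_size]


# Small random image standing in for a slide already shaped (H, W, 3).
image = np.random.default_rng(0).integers(0, 256, size=(10, 7, 3), dtype=np.uint8)

# Hypothetical file name for the saved array.
npy_path = os.path.join(tempfile.mkdtemp(), "slide.npy")
np.save(npy_path, image)

# mmap_mode="r" maps the file instead of loading it all into RAM.
slide = np.load(npy_path, mmap_mode="r")

# np.array(tile) copies just that tile into memory; (y, x) is its top-left
# corner, which lets pixel annotations be matched to the tile.
tiles = [(y, x, np.array(tile)) for y, x, tile in iter_tiles(slide, tile_size=4)]
```

Each tile keeps its (y, x) offset, so an annotation at pixel (px, py) falls in the tile whose offset satisfies y <= py < y + tile height and x <= px < x + tile width.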
