Converting image folder to numpy array is consuming the entire RAM

354 Views Asked by iustin At 08 August 2019 at 14:56

I am trying to convert the images folder of the celebA dataset (https://www.kaggle.com/jessicali9530/celeba-dataset) into a numpy array, to be saved later as a .pkl file (so the data can be used as simply as mnist or cifar).

I am looking for a better way to do the conversion, since this method consumes all of the available RAM.

from PIL import Image
import pickle
from glob import glob
import numpy as np

TARGET_IMAGES = "img_align_celeba/*.jpg"

def generate_dataset(glob_files):
   dataset = []
   for file_name in sorted(glob(glob_files)):
       img = Image.open(file_name)
       pixels = list(img.getdata())
       dataset.append(pixels)
   return np.array(dataset)

celebAdata = generate_dataset(TARGET_IMAGES)

I am rather curious how the mnist authors did this themselves, but any approach that works is welcome.

There is 1 best solution below

bugo99iot On 09 August 2019 at 12:27 BEST ANSWER

You can transform any kind of data on the fly in Keras and load only one batch into memory at a time during training. See the Keras documentation and search for 'Example of using .flow_from_directory(directory)'.
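
If you want the same "one batch at a time" behaviour without Keras, a plain generator over the file list is enough. The following is only a minimal sketch, not the Keras implementation: `iter_image_batches` and its `batch_size` parameter are hypothetical names made up for this example, and it assumes every image has the same size (as the aligned celebA images should) and can be converted to RGB.

```python
from glob import glob

import numpy as np
from PIL import Image


def iter_image_batches(glob_pattern, batch_size=64):
    """Yield uint8 arrays of shape (n, height, width, 3), one batch at a time.

    Hypothetical helper for illustration; not part of Keras or PIL.
    """
    file_names = sorted(glob(glob_pattern))
    for start in range(0, len(file_names), batch_size):
        batch = []
        for file_name in file_names[start:start + batch_size]:
            # The context manager closes the file once the pixels are copied.
            with Image.open(file_name) as img:
                # np.asarray gives a compact uint8 array, whereas
                # list(img.getdata()) builds a Python list of tuples.
                batch.append(np.asarray(img.convert("RGB"), dtype=np.uint8))
        # Assumption: all images in the batch share one size, so they stack.
        yield np.stack(batch)
```

With the question's path it would be used as `for batch in iter_image_batches("img_align_celeba/*.jpg"): ...`, so only `batch_size` images are held in memory at once. For scale, assuming roughly 200,000 images of 178x218 RGB pixels (my assumption about this copy of the dataset), even a compact uint8 array of the whole folder would need over 20 GB, and a nested Python list of pixel tuples needs several times that, which is probably why the original loop exhausts the RAM.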
