I'm trying to use ImageDataGenerator() to build a dataset from chest X-ray images of pneumonia (a Kaggle dataset).
Here is what I have:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# data paths
train_data_path = 'chest_xray/train'
test_data_path = 'chest_xray/test'
val_data_path = 'chest_xray/val'
# test
test_generator = ImageDataGenerator().flow_from_directory(
    test_data_path,
    target_size=(64, 64), batch_size=624)
# train
train_generator = ImageDataGenerator().flow_from_directory(
    train_data_path,
    target_size=(64, 64), batch_size=5215)
# val
val_generator = ImageDataGenerator().flow_from_directory(
    val_data_path,
    target_size=(64, 64), batch_size=16)
Now for the problem:
#datasets
test_images, test_labels = next(test_generator)
train_images, train_labels = next(train_generator)
I'm running into this error:
UnidentifiedImageError Traceback (most recent call last)
<ipython-input-10-c2d92e76516a> in <cell line: 1>()
----> 1 train_images, train_labels = next(train_generator)
4 frames
/usr/local/lib/python3.10/dist-packages/PIL/Image.py in open(fp, mode, formats)
3281 warnings.warn(message)
3282 msg = "cannot identify image file %r" % (filename if filename else fp)
-> 3283 raise UnidentifiedImageError(msg)
3284
3285
UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7f0f40441a30>
Neither dataset is created by this code. I've run the two calls separately and both fail with the same error. I tried looping through the images to find the one causing the error:
from pathlib import Path
from PIL import Image, UnidentifiedImageError

# Try to open every .jpeg under the training directory and print any that fail
path = Path("/content/drive/MyDrive/FlatironProjects/Phase-4/chest_xray/train").rglob("*.jpeg")
for img_p in path:
    try:
        img = Image.open(img_p)
    except UnidentifiedImageError:
        print(img_p)
However, every image opens without raising the error. Please help!
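For comparison, a stricter version of that check might look like the sketch below. It rests on two assumptions that are not confirmed for this setup: that the Keras directory iterator also picks up files with extensions other than .jpeg (the `IMAGE_EXTENSIONS` set is my guess at that list), and that the offending file sits under the same directory the generator reads. The helper name `find_unreadable_images` is hypothetical. Unlike the loop above, it reads each file into a BytesIO first, which mirrors the object shown in the traceback, and it also calls `verify()`.

```python
import io
from pathlib import Path

from PIL import Image, UnidentifiedImageError

# Assumption: the extensions the Keras directory iterator accepts.
# Adjust this set if your Keras version uses a different list.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".ppm", ".tif", ".tiff"}


def find_unreadable_images(root):
    """Return paths under `root` that Pillow cannot identify or verify."""
    bad_files = []
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and files whose extension is not in the set above.
        if not path.is_file() or path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        try:
            # Read the raw bytes first, mirroring the BytesIO in the traceback.
            with Image.open(io.BytesIO(path.read_bytes())) as img:
                # Ask Pillow to check the file without decoding pixel data.
                img.verify()
        except (UnidentifiedImageError, OSError, SyntaxError):
            bad_files.append(path)
    return bad_files
```

Calling `find_unreadable_images(train_data_path)` with the same relative path passed to `flow_from_directory` (rather than the Drive path) would also rule out the two paths pointing at different copies of the data.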