Passing in training labels to tf.keras.preprocessing.image_dataset_from_directory doesn't work


I'm trying to load data into a Colab notebook using tf.keras.preprocessing.image_dataset_from_directory. The directory is flat and contains a set of JPG images, and the class labels are stored in a separate CSV file.

According to the documentation for the labels argument:

Either "inferred" (labels are generated from the directory structure), or a list/tuple of integer labels of the same size as the number of image files found in the directory. Labels should be sorted according to the alphanumeric order of the image file paths (obtained via os.walk(directory) in Python).
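The ordering requirement quoted above can be illustrated with a minimal sketch. It assumes, hypothetically, that the CSV has a header row followed by rows of the form filename,label, and that the image directory is flat. The function name labels_in_path_order and the column layout are illustrative assumptions, not part of the Keras API or of my actual file.

```python
import csv
import os


def labels_in_path_order(directory, csv_path):
    """Return (paths, labels) with labels aligned to the sorted image paths.

    Assumption: each CSV row after the header is `filename,label`.
    """
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader)  # skip the header row
        label_by_name = {row[0]: int(row[1]) for row in reader}

    # Collect image paths via os.walk and sort them alphanumerically.
    paths = sorted(
        os.path.join(root, name)
        for root, _, files in os.walk(directory)
        for name in files
        if name.lower().endswith((".jpg", ".jpeg"))
    )

    # Look up each label by file name; a missing entry raises KeyError.
    labels = [label_by_name[os.path.basename(p)] for p in paths]
    return paths, labels
```

The returned labels list could then be passed as the labels argument. Building the list from a filename lookup, rather than taking the CSV column in row order, avoids relying on the CSV rows already being sorted the same way as the files.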

I read the CSV with pandas, converted the label column into a list, and passed the resulting train_labels as the labels argument:

import pandas as pd
from tensorflow.keras.preprocessing import image_dataset_from_directory

labels = pd.read_csv(_URL)
# Take the second CSV column as the list of integer labels
train_labels = labels.values[:, 1].tolist()
print("Total labels:", len(train_labels))
print(train_labels)
# Output:
# Total labels: 1164
# [1, 0, 1, 1, 1, 2, 0, ... ]

train_dataset = image_dataset_from_directory(directory=train_dir,
                                             labels=train_labels,
                                             label_mode='int',
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE)

However, on running the cell, the output read:

Found 1164 files belonging to 1 classes.

Is there something wrong with the format in which I'm passing the list of labels, or are there other settings I need to change before the label classes take effect?
