I have already separated the train, validation, and test sets a priori (this is how the data came), and I have a folder for each of them, like this:
Test
    Class1
    Class0
Val
    Class1
    Class0
Train
    Class1
    Class0
Then I defined the paths as follows:
import os

# Define paths (PATH is the dataset root directory)
train_dir = os.path.join(PATH, 'train')
val_dir = os.path.join(PATH, 'val')
test_dir = os.path.join(PATH, 'test')
# Specify them by class
train_safe_dir = os.path.join(train_dir, 'class1')
train_malicious_dir = os.path.join(train_dir, 'class0')
val_safe_dir = os.path.join(val_dir, 'class1')
val_malicious_dir = os.path.join(val_dir, 'class0')
test_safe_dir = os.path.join(test_dir, 'class1')
test_malicious_dir = os.path.join(test_dir, 'class0')
Then I used the ImageDataGenerator as follows:
train_datagen = ImageDataGenerator(rescale=1./255)
val_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(batch_size=batch_size,
                                                    directory=train_dir,
                                                    shuffle=False,
                                                    target_size=(IMG_H, IMG_W),
                                                    class_mode='binary')
val_generator = val_datagen.flow_from_directory(batch_size=batch_size,
                                                directory=val_dir,
                                                target_size=(IMG_H, IMG_W),
                                                class_mode='binary')
test_generator = test_datagen.flow_from_directory(batch_size=batch_size,
                                                  directory=test_dir,
                                                  shuffle=False,
                                                  target_size=(IMG_H, IMG_W),
                                                  class_mode='binary')
Is this correct for evaluating on the test data? Am I introducing data leakage somehow? If it's not correct, what would be the right approach to handle the test data? Thank you so much!
I'm not sure whether I should have a test folder without the class separation, but when I tried that I got a really low accuracy that didn't make sense. Any suggestion is appreciated!
This is how I evaluate on the test set:
results = CNN.evaluate(test_generator, batch_size=64)
Actually, there are a few problems with your code.
1) First of all, the per-class path definitions (train_safe_dir, train_malicious_dir, val_safe_dir, val_malicious_dir, test_safe_dir, test_malicious_dir) are of no use: flow_from_directory is only given the parent directories and never reads those variables. You might want to delete them from your code; they are redundant.
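As an illustration (a sketch only, reusing the variable and directory names from your question), this is all the path setup the generators actually need:

import os

# flow_from_directory discovers the class subfolders (class0, class1) on its
# own, so only the parent directories have to be defined.
train_dir = os.path.join(PATH, 'train')
val_dir = os.path.join(PATH, 'val')
test_dir = os.path.join(PATH, 'test')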
2) Secondly, your test folder must be organized by class, just like the training and validation folders. Think of the tabular-data case: do you discard the output labels of the test data in order to evaluate on it? No; here, the class subfolders play exactly the role of those labels.
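Here is a minimal sketch of the expected layout and the evaluation call (names reused from your question; the mapping of class0/class1 to "malicious"/"safe" is an assumption based on your variable names):

# Expected layout: flow_from_directory reads the class labels from the
# subfolder names, so test/ must mirror train/ and val/:
#
#   test/
#       class0/   (e.g. the "malicious" images)
#       class1/   (e.g. the "safe" images)
#
test_generator = test_datagen.flow_from_directory(directory=test_dir,  # parent folder, not a class subfolder
                                                  batch_size=batch_size,
                                                  shuffle=False,  # keep order fixed so predictions line up with labels
                                                  target_size=(IMG_H, IMG_W),
                                                  class_mode='binary')

results = CNN.evaluate(test_generator)  # labels are taken from the subfolders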
3) Thirdly, your low accuracy most likely comes from the fact that you haven't done any image augmentation on your training dataset, so it is not surprising that your model has overfitted the training data; a sketch of an augmented training generator is shown at the end of this answer.
Fortunately, you are not leaking any information from the training data into the validation or test data.
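Something along these lines could work as a starting point. The specific transforms and ranges below are assumptions, not values from your post, so tune them to your images:

# Augment only the training data; validation and test data are just rescaled.
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=15,       # small random rotations
                                   width_shift_range=0.1,   # random horizontal shifts
                                   height_shift_range=0.1,  # random vertical shifts
                                   zoom_range=0.1,          # random zoom in/out
                                   horizontal_flip=True)    # random horizontal flips

val_datagen = ImageDataGenerator(rescale=1./255)   # no augmentation
test_datagen = ImageDataGenerator(rescale=1./255)  # no augmentation

train_generator = train_datagen.flow_from_directory(directory=train_dir,
                                                     batch_size=batch_size,
                                                     target_size=(IMG_H, IMG_W),
                                                     class_mode='binary')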