Problem applying augmentation to train data

I have an image dataset of 102 images belonging to 3 classes. I have stored the image_path, label, and image (loaded using io, so it is an np.array) in a dataframe. After splitting it into train, test, and val sets, I'm trying to apply data augmentation and add the augmented samples to the train data to increase its size, using the following:

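# Imports assumed by the code below; train_datagen is defined earlier (not shown)
import numpy as np
from tensorflow.keras.preprocessing.image import load_img, img_to_array

# Apply augmentation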
target_size = (224, 224)

# Create lists to store augmented images and labels
augmented_path = []
augmented_images = []
augmented_labels = []

for image_path, label in zip(train_data['image_path'], train_data['label']):
    img = load_img(image_path, target_size=target_size)
    img_array = img_to_array(img) / 255.0  # Normalize the image
    img_array = img_array.reshape((1,)+img_array.shape)  # Add batch dimension
    label = np.array([label]).reshape(1, -1)  # Reshape label as needed

    # Generate augmented images and labels
    for _ in range(30):  # Define the number of augmentations per image
        augmented_image = train_datagen.flow(img_array, batch_size=1)
        augmented_label = label
        
        augmented_path.append(image_path)
        augmented_images.append(augmented_image[0])
        augmented_labels.append(augmented_label)

The augmented_labels list looks like this:

 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]]),
 array([[1]])

...and more: 2130 rows, with class labels in the range 0 to 2.

The problem is that when I apply augmentation, the arrays in the label and image columns end up with a bigger shape than in the original train data. I want them to keep the same shape as the original train data so that I can concatenate the two.

[Image: column shapes of the original train data, before augmentation]

[Image: column shapes after applying augmentation to the original train data]
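
A possible explanation, sketched under assumptions: if train_datagen is a Keras ImageDataGenerator, then flow(...) returns an iterator, and indexing it with [0] yields a whole batch of shape (1, 224, 224, 3) rather than a single image. Likewise, np.array([label]).reshape(1, -1) turns each label into a (1, 1) array, which matches the array([[1]]) output above. The NumPy-only sketch below uses fake_batch as a stand-in for one such batch (no Keras call is made), and drop_batch_axis is a hypothetical helper name.

```python
import numpy as np


def drop_batch_axis(batch):
    """Turn a (1, H, W, C) batch holding one image into a single (H, W, C) image."""
    if batch.ndim != 4 or batch.shape[0] != 1:
        raise ValueError(f"expected shape (1, H, W, C), got {batch.shape}")
    return batch[0]


# Stand-in for one batch produced by the generator (assumption: batch_size=1).
fake_batch = np.zeros((1, 224, 224, 3), dtype=np.float32)
label = 1  # keep the label as a plain scalar instead of reshaping it to (1, 1)

augmented_images = []
augmented_labels = []
for _ in range(3):  # 3 augmentations here, only to keep the example small
    augmented_images.append(drop_batch_axis(fake_batch))
    augmented_labels.append(label)

# Stack into arrays that can be concatenated with other (H, W, C) images.
X_aug = np.stack(augmented_images)
y_aug = np.array(augmented_labels)
print(X_aug.shape, y_aug.shape)  # (3, 224, 224, 3) (3,)
```

With this layout each stored image is (224, 224, 3) and each label is a scalar. The shapes will only match the original train data if the original images are also resized to target_size.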

Also, what else can I do to prepare this data for a CNN classification model?
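
One common preparation step, shown here only as a minimal sketch, is to flatten the labels to integers and optionally one-hot encode them. It assumes the 3 classes are labelled 0, 1, 2 as described above; one_hot and NUM_CLASSES are hypothetical names introduced for illustration.

```python
import numpy as np

NUM_CLASSES = 3  # from the question: 3 classes, labels in the range 0 to 2


def one_hot(labels, num_classes=NUM_CLASSES):
    """Flatten labels of any shape to 1-D integers and one-hot encode them."""
    flat = np.asarray(labels, dtype=np.int64).reshape(-1)
    if flat.min() < 0 or flat.max() >= num_classes:
        raise ValueError("label outside the range 0..num_classes-1")
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[flat]


# Labels in the (1, 1) form shown in the augmented_labels output.
y = [np.array([[1]]), np.array([[0]]), np.array([[2]])]
y_onehot = one_hot(y)
print(y_onehot.shape)  # (3, 3)
```

In Keras, one-hot labels pair with the categorical_crossentropy loss, while plain integer labels can be used directly with sparse_categorical_crossentropy.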
