Data Augmentation using Python

I'm currently working on a CNN-related project, and I'm new to this area. I have a set of 500 images of fabric defects. How can I increase the number of images to around 2000? Are there any libraries I can use for this?

There are 2 best solutions below

Answer 1

There are various data augmentation techniques, such as zooming, mirroring, rotating, and cropping. The idea is to create new images from your initial set so that the model has to cope with the variation these changes introduce.

Several libraries let you do this: OpenCV, scikit-image, or Keras on top of TensorFlow, which provides a built-in high-level function for data generation.

I would recommend starting with simple and efficient techniques like mirroring and random cropping, then moving on to color or contrast augmentation.
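To make mirroring and random cropping concrete, here is a minimal NumPy-only sketch. The helper names `mirror` and `random_crop` and the tiny synthetic image are made up for illustration; in practice you would load your own fabric images and probably use one of the libraries above.

```python
import numpy as np


def mirror(image):
    """Flip an H x W (x C) image left to right."""
    return image[:, ::-1]


def random_crop(image, crop_height, crop_width, rng):
    """Cut a crop_height x crop_width window at a random position."""
    height, width = image.shape[:2]
    if crop_height > height or crop_width > width:
        raise ValueError("crop size must not exceed the image size")
    # Upper bound is exclusive, so +1 allows the window to touch the border.
    top = rng.integers(0, height - crop_height + 1)
    left = rng.integers(0, width - crop_width + 1)
    return image[top:top + crop_height, left:left + crop_width]


# Synthetic 6 x 8 RGB image standing in for a real photo (assumption).
rng = np.random.default_rng(seed=0)
image = np.arange(6 * 8 * 3, dtype=np.uint8).reshape(6, 8, 3)

mirrored = mirror(image)
cropped = random_crop(image, crop_height=4, crop_width=4, rng=rng)
```

Note that a crop changes the image size, so you would usually resize the crop back to the input size your CNN expects.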

Answer 2

A go-to library for image augmentation is imgaug.

The documentation is self-explanatory, but here is an example:


import numpy as np
from imgaug import augmenters as iaa
from PIL import Image

# load the image and convert it to a NumPy array
image = np.array(Image.open("<path to image>"))

# the augmenter accepts a list of images, so wrap the single image in a list;
# you could pass several images at once, but this demonstration uses only one
image = [image]

# each augmenter draws random parameters per image; some are only applied with a certain probability
augmenter = iaa.Sequential([
    iaa.Fliplr(0.5), # horizontal flips
    iaa.Crop(percent=(0, 0.1)), # random crops

    iaa.Sometimes(
        0.5,
        iaa.GaussianBlur(sigma=(0, 0.5))
    ),

    iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5),

], random_order=True) # apply augmenters in random order

augmented_image = augmenter(images=image)

augmented_image is now a list that contains one augmented version of the original image. Since you want to create 2000 images from 500, you can augment each image 4 times, like this:


total_images = []
# image_paths is assumed to be a list of paths to your 500 images
for image_path in image_paths:
    # load the image and convert it to a NumPy array
    image = np.array(Image.open(image_path))

    # create a list containing the same image four times
    images = [image for _ in range(4)]

    # pass it to the augmenter to get 4 different augmentations
    augmented_images = augmenter(images=images)

    # collect the results in a list (or save them to disk instead)
    total_images += augmented_images
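A note on the count: four augmentations per image give you 2000 augmented images on top of the 500 originals. If you want 2000 in total including the originals, three augmentations per image are enough. Here is a rough sketch of that bookkeeping. `expand_dataset` and `flip_sometimes` are hypothetical names, and the random flip is only a stand-in for a real augmenter such as the imgaug pipeline above.

```python
import numpy as np


def expand_dataset(images, augment_fn, copies_per_image):
    """Return each original followed by `copies_per_image` augmented copies."""
    expanded = []
    for image in images:
        expanded.append(image)
        for _ in range(copies_per_image):
            expanded.append(augment_fn(image))
    return expanded


rng = np.random.default_rng(seed=0)


def flip_sometimes(image):
    # Hypothetical stand-in augmentation: mirror the image half of the time.
    return image[:, ::-1] if rng.random() < 0.5 else image.copy()


# 500 small blank images standing in for the real dataset (assumption).
originals = [np.zeros((8, 8, 3), dtype=np.uint8) for _ in range(500)]

dataset = expand_dataset(originals, flip_sometimes, copies_per_image=3)
print(len(dataset))  # 500 originals + 1500 augmented copies = 2000
```

Whichever count you pick, split off your validation and test images before augmenting, so that augmented copies of one image do not end up on both sides of the split.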