Saturday, February 17, 2018

Using Data Augmentations in Keras

When I published the article on Using Bottleneck Features for Multi-Class Classification in Keras and TensorFlow, a few of you asked about using data augmentation in the model. So I decided to write a few articles experimenting with various data augmentations on a bottleneck model. As a start, here's a quick tutorial explaining what data augmentation is and how to do it in Keras.

The idea of augmenting the data is simple: we perform random transformations and normalization on the input data so that the model we’re training never sees the same input twice. With little data, this can greatly reduce the chance of the model overfitting.
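To make the idea concrete, here is a minimal NumPy sketch of my own (not Keras code): a hypothetical random_augment helper that flips an image left-right half of the time and shifts it sideways by a random amount. The wrap-around shift via np.roll is a simplification chosen to keep the example short.

```python
import numpy as np


def random_augment(image, rng, max_shift=0.2):
    """Return a randomly flipped and shifted copy of an (H, W, C) image."""
    out = image.copy()
    # Flip left-right with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    # Shift horizontally by up to max_shift of the width.
    # np.roll wraps pixels around the edge instead of filling them in.
    width = out.shape[1]
    limit = int(max_shift * width)
    shift = int(rng.integers(-limit, limit + 1))
    return np.roll(out, shift, axis=1)


rng = np.random.default_rng(0)
# A small dummy "image": 4 rows, 10 columns, 3 channels.
image = np.arange(4 * 10 * 3, dtype=np.float32).reshape(4, 10, 3)
# Each call draws new random parameters, so the variants generally differ.
variants = [random_augment(image, rng) for _ in range(5)]
```

Every variant keeps the original shape and pixel values; only their positions change, which is what lets the model see "new" inputs without new data.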

But manually adding transformations to the input data would be tedious, which is why Keras has built-in functionality to do just that.

The Keras preprocessing package has the ImageDataGenerator class, which can be configured to apply random transformations and normalization to input images as needed. Coupled with its flow() and flow_from_directory() methods, it can automatically load the data, apply the augmentations, and feed the results into the model.

Let’s write a small script to see the data augmentation capabilities of ImageDataGenerator.


I'll use the following image:

The image we'll be using for the data augmentation.
Image Source: Wikimedia - By Mopower82 (Own work) [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons

We use the following code to load the image, run augmentations on it 20 times, and save the resulting augmented images.


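import os

from keras.preprocessing.image import ImageDataGenerator, img_to_array, load_img

# make sure the output directory exists before any images are saved to it
os.makedirs('data/augmented', exist_ok=True)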

datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

img = load_img('data/Car.jpg')  # this is a PIL image

# convert image to numpy array with shape (height, width, 3)
# (channels last, the default with the TensorFlow backend)
img_arr = img_to_array(img)

# add a batch dimension: shape becomes (1, height, width, 3)
img_arr = img_arr.reshape((1,) + img_arr.shape)

# the .flow() command below generates batches of randomly transformed images
# and saves the results to the `data/augmented` directory
i = 0
for batch in datagen.flow(
    img_arr,
    batch_size=1,
    save_to_dir='data/augmented',
    save_prefix='Car_A',
    save_format='jpeg'):
    i += 1
    if i >= 20:  # stop after 20 augmented images
        break  # otherwise the generator would loop indefinitely
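The reshape step in the script only adds a leading batch dimension, because flow() expects a batch of images rather than a single one. As a small sketch, assuming the channels-last layout and using a zero-filled array as a stand-in for the output of img_to_array(), np.expand_dims does the same thing:

```python
import numpy as np

# Stand-in for img_to_array(img): a 480x640 RGB image, channels last.
img_arr = np.zeros((480, 640, 3), dtype="float32")

# Two equivalent ways to add the batch dimension in front.
batch_a = img_arr.reshape((1,) + img_arr.shape)
batch_b = np.expand_dims(img_arr, axis=0)

print(batch_a.shape, batch_b.shape)
```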

We used the following parameters to augment our image:

  • rotation_range – the range (in degrees) within which to apply random rotations to the images.
  • width_shift_range – the range, as a fraction of the total width, within which to apply random horizontal shifts.
  • height_shift_range – the range, as a fraction of the total height, within which to apply random vertical shifts.
  • shear_range – the range within which to apply random shearing transformations.
  • zoom_range – the range within which to apply random zooming to the images.
  • horizontal_flip – whether to apply random horizontal flips to the images.
  • fill_mode='nearest' – the method by which newly created pixels are filled.
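To get a feel for what these ranges mean, here is a rough sketch of how one set of random parameters could be drawn for a single image. This is my own illustration, not Keras source code: sample_transform is a hypothetical helper, it uses a single zoom factor for simplicity, and the exact sampling and units (shear in particular) may differ between Keras versions.

```python
import random


def sample_transform(width, height, rng,
                     rotation_range=40, width_shift_range=0.2,
                     height_shift_range=0.2, shear_range=0.2,
                     zoom_range=0.2, horizontal_flip=True):
    """Draw one set of random transform parameters for a width x height image."""
    return {
        # Rotation angle anywhere between -40 and +40 degrees.
        "rotation_deg": rng.uniform(-rotation_range, rotation_range),
        # Shifts are fractions of the image size, converted here to pixels.
        "shift_x_px": rng.uniform(-width_shift_range, width_shift_range) * width,
        "shift_y_px": rng.uniform(-height_shift_range, height_shift_range) * height,
        # Shear intensity (unit assumed; check your Keras version).
        "shear": rng.uniform(-shear_range, shear_range),
        # zoom_range=0.2 means a zoom factor between 0.8 and 1.2.
        "zoom": rng.uniform(1 - zoom_range, 1 + zoom_range),
        # Flip roughly half of the time when flipping is enabled.
        "flip": horizontal_flip and rng.random() < 0.5,
    }


params = sample_transform(640, 480, random.Random(0))
print(params)
```

Each augmented image gets its own draw, which is why no two outputs of the script look quite alike.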

ImageDataGenerator has several more parameters for augmentations. You can read about them in the official documentation page.

The flow() method of ImageDataGenerator takes in the input images, applies the augmentations we defined, and produces batches of augmented data indefinitely, in a loop. While in this example we only have one input image, the flow() method is really meant for batches of images.
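Because the generator never stops on its own, the loop consuming it needs an explicit exit. One alternative to a manual counter is itertools.islice, which takes exactly the number of batches you ask for. In the sketch below, endless_batches is a hypothetical stand-in for datagen.flow(...), so the snippet runs without Keras:

```python
from itertools import count, islice


def endless_batches():
    """Hypothetical stand-in for datagen.flow(...): yields batches forever."""
    for n in count():
        yield f"batch-{n}"


# Take exactly 20 batches from the endless generator, then stop.
first_20 = list(islice(endless_batches(), 20))
print(len(first_20))
```

With a real generator the same pattern would wrap the flow() call instead of the stand-in.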




The resulting augmented images are saved into the data/augmented directory, and they should look something like this (the transformations are random, so your results might differ):

The augmented images

Using data augmentations like these, we should be able to reduce the chance of a deep learning model overfitting when training on a small dataset.

Related posts:
Using Bottleneck Features for Multi-Class Classification in Keras and TensorFlow






1 comment:

  1. I want to do exactly the same but with an entire directory consisting of images. How do I do that?
