When I did the article on Using Bottleneck Features for Multi-Class Classification in Keras and TensorFlow, a few of you asked about using data augmentation in the model. So, I decided to do few articles experimenting various data augmentations on a bottleneck model. As a start, here's a quick tutorial explaining what data augmentation is, and how to do it in Keras.
The idea of augmenting the data is simple: we perform random transformations and normalization on the input data so that the model we’re training never sees the same input twice. With little data, this can greatly reduce the chance of the model overfitting.
But, trying to manually add transformations to the input data would be a tedious task.
Which is why Keras has built-in functions to do just that.
The Keras Preprocessing package has the ImageDataGeneraor function, which can be configured to perform the random transformations and the normalization of input images as needed. And, coupled with the flow() and flow_from_directory() functions, can be used to automatically load the data, apply the augmentations, and feed into the model.
Let’s write a small script to see the data augmentation capabilities of ImageDataGeneraor.
Showing posts with label Classification. Show all posts
Showing posts with label Classification. Show all posts
Saturday, February 17, 2018
Tuesday, August 8, 2017
Using Bottleneck Features for Multi-Class Classification in Keras and TensorFlow
Training an Image Classification model - even with Deep Learning - is not an easy task. In order to get sufficient accuracy, without overfitting requires a lot of training data. If you try to train a deep learning model from scratch, and hope build a classification system with similar level of capability of an ImageNet-level model, then you'll need a dataset of about a million training examples (plus, validation examples also). Needless to say, it's not easy to acquire, or build such a dataset practically.
So, is there any hope for us to build a good image classification system ourselves?
Yes, there is!
Luckily, Deep Learning supports an immensely useful feature called 'Transfer Learning'. Basically, you are able to take a pre-trained deep learning model - which is trained on a large-scale dataset such as ImageNet - and re-purpose it to handle an entirely different problem. The idea is that since the model has already learned certain features from a large dataset, it may be able to use those features as a base to learn the particular classification problem we present it with.
This task is further simplified since popular deep learning models such as VGG16 and their pre-trained ImageNet weights are readily available. The Keras framework even has them built-in in the keras.applications package.
![]() |
| An image classification system built with transfer learning |
The basic technique to get transfer learning working is to get a pre-trained model (with the weights loaded) and remove final fully-connected layers from that model. We then use the remaining portion of the model as a feature extractor for our smaller dataset. These extracted features are called "Bottleneck Features" (i.e. the last activation maps before the fully-connected layers in the original model). We then train a small fully-connected network on those extracted bottleneck features in order to get the classes we need as outputs for our problem.
Friday, January 13, 2017
Stepping into audio classification - Getting started with PyAudio
I've had an idea to attempt to train a deep learning model that can classify different audio noises. Having a system that can accurately identify different sounds would have implications in many fields, from medical diagnostics to echo locations systems, among others. I wouldn't expect building such a system would be an easy task. But, let's give it a try.
I first had the idea about 8 years back. But back then, libraries that enable audio processing were scarce, and what was available required a massive effort to get installed. Now it seems that things have improved a lot. I myself was searching online for audio processing libraries for Python, and I came across this excellent post - Realtime Audio Visualization in Python - by Scott Harden of University of Florida.
The tutorial uses the PyAudio Python package (PyAudio homepage), which is the Python bindings for PortAudio (PortAudio homepage) - a cross-platform audio I/O library - allowing PyAudio to give a consistent interface to process audio across platforms. So, I decided to give PyAudio a try. First of all, I needed to install PyAudio.
At the time of this writing, the latest version of PyAudio is v0.2.9, and the PyAudio team has made the installation as simple as possible.
I first had the idea about 8 years back. But back then, libraries that enable audio processing were scarce, and what was available required a massive effort to get installed. Now it seems that things have improved a lot. I myself was searching online for audio processing libraries for Python, and I came across this excellent post - Realtime Audio Visualization in Python - by Scott Harden of University of Florida.
![]() |
| Realtime Audio Visualization with PyAudio |
The tutorial uses the PyAudio Python package (PyAudio homepage), which is the Python bindings for PortAudio (PortAudio homepage) - a cross-platform audio I/O library - allowing PyAudio to give a consistent interface to process audio across platforms. So, I decided to give PyAudio a try. First of all, I needed to install PyAudio.
Installing PyAudio
At the time of this writing, the latest version of PyAudio is v0.2.9, and the PyAudio team has made the installation as simple as possible.
Subscribe to:
Posts (Atom)

