Showing posts with label Convolutional Neural Networks. Show all posts

Wednesday, January 3, 2018

Visualizing the Convolutional Filters of the LeNet Model

First of all, Happy New Year to you all!

We have a great year ahead. And, let's start it with something interesting.

We've talked about how Convolutional Neural Networks (CNNs) are able to learn complex features from input procedurally through convolutional filters in each layer.

But what does a convolutional filter really look like?

In today's post, let's try to visualize the convolutional filters of the LeNet model trained on the MNIST dataset (handwritten digit classification) - often considered the 'hello world' program of deep learning.

We can use a technique to visualize the filters from the article "How convolutional neural networks see the world" by François Chollet (the author of the Keras library). The original article is available at the Keras Blog: https://blog.keras.io/how-convolutional-neural-networks-see-the-world.html.

The original code is designed to work with the VGG16 model. Let’s modify it a bit to work with our LeNet model.

We need to load the LeNet model with its weights. You can follow the code here to train the model yourself and get the weights. Let's name the weights file 'lenet_weights.hdf5'.

We'll start with the imports:

from scipy.misc import imsave
import numpy as np
import time
from keras import backend as K

from keras.models import Sequential
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dense

from keras.optimizers import SGD

We need to build and load the LeNet model with the weights. So, we define a function - build_lenet - for it.

Tuesday, August 8, 2017

Using Bottleneck Features for Multi-Class Classification in Keras and TensorFlow

Training an image classification model, even with Deep Learning, is not an easy task. Getting sufficient accuracy without overfitting requires a lot of training data. If you train a deep learning model from scratch and hope to build a classification system with capability similar to an ImageNet-level model, you'll need a dataset of about a million training examples (plus validation examples). Needless to say, acquiring or building such a dataset is not easy in practice.

So, is there any hope for us to build a good image classification system ourselves?

Yes, there is!

Luckily, Deep Learning supports an immensely useful feature called 'Transfer Learning'. Basically, you are able to take a pre-trained deep learning model - which is trained on a large-scale dataset such as ImageNet - and re-purpose it to handle an entirely different problem. The idea is that since the model has already learned certain features from a large dataset, it may be able to use those features as a base to learn the particular classification problem we present it with.

This task is further simplified since popular deep learning models such as VGG16, along with their pre-trained ImageNet weights, are readily available. The Keras framework even has them built into the keras.applications package.

An image classification system built with transfer learning


The basic technique to get transfer learning working is to take a pre-trained model (with the weights loaded) and remove the final fully-connected layers from it. We then use the remaining portion of the model as a feature extractor for our smaller dataset. These extracted features are called "Bottleneck Features" (i.e. the last activation maps before the fully-connected layers in the original model). Finally, we train a small fully-connected network on those extracted bottleneck features to get the classes we need as outputs for our problem.
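The two-stage workflow described above can be sketched in plain NumPy. This is only a toy illustration under loud assumptions: the "pre-trained base" here is a fixed random projection followed by ReLU (a stand-in, not VGG16), the dataset is synthetic, and names such as extract_bottleneck_features and frozen_weights are hypothetical. The point is the shape of the process: extract features once with frozen weights, then train only a small classifier on top of them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained convolutional base (assumption: in a real
# setup this would be a network such as VGG16 with its fully-connected
# layers removed). These weights are frozen and never updated below.
frozen_weights = rng.normal(size=(64, 32))

def extract_bottleneck_features(images):
    """Pass inputs through the frozen base: linear map followed by ReLU."""
    return np.maximum(images @ frozen_weights, 0.0)

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical small dataset: 90 flattened "images" in 3 classes,
# where each class shifts the inputs by a different offset.
num_classes = 3
labels = np.repeat(np.arange(num_classes), 30)
images = rng.normal(size=(90, 64)) + labels[:, None]
one_hot = np.eye(num_classes)[labels]

# Step 1: extract the bottleneck features once and standardize them.
features = extract_bottleneck_features(images)
features = (features - features.mean(axis=0)) / (features.std(axis=0) + 1e-8)

# Step 2: train a small classifier (a single softmax layer) on the
# extracted features with plain gradient descent.
W = np.zeros((features.shape[1], num_classes))
b = np.zeros(num_classes)
learning_rate = 0.05
losses = []
for epoch in range(200):
    probs = softmax(features @ W + b)
    true_class_probs = probs[np.arange(len(labels)), labels]
    losses.append(-np.mean(np.log(true_class_probs + 1e-12)))
    grad_logits = (probs - one_hot) / len(labels)
    W -= learning_rate * (features.T @ grad_logits)
    b -= learning_rate * grad_logits.sum(axis=0)

accuracy = np.mean(np.argmax(features @ W + b, axis=1) == labels)
print("bottleneck feature shape:", features.shape)
print("first loss: %.4f, last loss: %.4f" % (losses[0], losses[-1]))
print("training accuracy: %.2f" % accuracy)
```

Because the base is frozen, the expensive feature extraction runs only once per image, and the training loop touches only the small classifier. With a real pre-trained model, the features could be saved to disk after step 1 and reused across training runs.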

Friday, July 7, 2017

Milestones of Deep Learning

Deep Learning has been around for about a decade now. We talked about how Deep Learning evolved through Artificial Intelligence and Machine Learning (see "What is Deep Learning?"). Since its inception, Deep Learning has taken the world by storm due to its success. Here are some of its more significant achievements over the years:

AlexNet - 2012


The AlexNet Architecture (Image from the research paper)

  • Proved that Convolutional Neural Networks actually work. AlexNet - and its research paper "ImageNet Classification with Deep Convolutional Neural Networks" by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton - is commonly considered to be what brought Deep Learning into the mainstream.
  • Won the 2012 ILSVRC (ImageNet Large-Scale Visual Recognition Challenge) with a 15.3% top-5 error rate. (For reference, the 2nd best entry at ILSVRC had a 26.2% error rate.)
  • 8 layers: 5 convolutional, 3 fully connected.
  • Used ReLU for the non-linearity function rather than the conventional tanh function used until then.
  • Used Dropout layers and Data Augmentation to reduce overfitting.
Research Paper: ImageNet Classification with Deep Convolutional Neural Networks - Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton

Thursday, May 18, 2017

What is Deep Learning? - Updated

What is Deep Learning? And how does it relate to Machine Learning and Artificial Intelligence?

I wrote an article to answer these questions some time back.

Now, thanks to the feedback I got from you all, I was able to update it with more clarifications, improved examples, and answers to more questions about Deep Learning.


Check out the updated article here,


Your feedback is always welcome.


Thursday, April 13, 2017

How deep should it be to be called Deep Learning?

If you remember, some time back I wrote an article, What is Deep Learning?, in which I explored the confusion many have over the terms Artificial Intelligence, Machine Learning, and Deep Learning. We talked about how those terms relate to each other: how the drive to build an intelligent machine started the field of Artificial Intelligence; how, when building an intelligence from scratch proved too ambitious, the field evolved into Machine Learning; and how, with the expansion of both the capabilities of computer hardware and our understanding of the natural brain, the field of Deep Learning dawned.

We learned that the deeper and more complex models of Deep Learning (compared to traditional models) are able to consume massive amounts of data and to learn complex features by Hierarchical Feature Learning through multiple layers of abstraction. We saw that Deep Learning algorithms don’t have a "plateau in performance" the way traditional machine learning algorithms do: they don’t have a limit on the amount of data they can ingest. Simply put, the more data they are given, the better they perform.

The Plateau in Performance in Traditional vs. Deep Learning


With the capabilities of Deep Learning grasped, there’s one question that usually comes up when one first learns about Deep Learning:

If we say that deeper and more complex models give Deep Learning the capability to surpass even human capabilities, then how deep should a machine learning model be to be considered a Deep Learning model?

I had the same question when I was first getting started with Deep Learning, and I’ve had a few other Deep Learning enthusiasts ask me the same thing.

It turns out, we were asking the wrong question. We need to look at Deep Learning from a different angle to understand it.

Let’s take a step back and see how a Deep Learning model works.

Monday, November 7, 2016

Can the LeNet model handle Face Recognition?

I recently followed a blog post at PyImageSearch by Adrian Rosebrock on using the LeNet Convolutional Neural Network model on the MNIST dataset (i.e. for handwritten digit recognition) using Keras with the Theano backend. I was able to try it out easily thanks to the very detailed and well-thought-out guide.

The LeNet model itself is quite simple, just 5 layers. Yet it performs impressively well on the MNIST dataset. We can get around 98% accuracy with just 20 iterations of training with ease.

The training time for the model is also quite low. I tested on my MSI GE60 2PF Apache Pro laptop with CUDA enabled, and the training time was just 2 minutes 20 seconds on average. On CPU only (with CUDA disabled) it took around 30 minutes.

LeNet giving 98% accuracy on MNIST data
As you can see, we got 98.11% accuracy, and it has correctly classified a digit that has been cut off.

It even classifies a quite deformed '2' correctly.
LeNet correctly classifying a deformed digit

Saturday, November 5, 2016

What is the image_dim_ordering parameter in Keras, and why is it important?

Update 9/May/2017: With Keras v2, the image_dim_ordering parameter has been renamed to image_data_format. Check my updated post on how to configure it.

If you remember my earlier post about switching Keras between TensorFlow and Theano backends, you would have seen that we switched the image_dim_ordering parameter also when switching the backend. For TensorFlow, image_dim_ordering should be "tf", while for Theano, it should be "th".

The keras.json file contains the Keras configuration options


So, what is this parameter, and what does it affect?

It has to do with how each of the backends treats the data dimensions when working with multi-dimensional convolution layers (such as Convolution2D, Convolution3D, UpSampling2D, Cropping2D, … and any other 2D or 3D layer). Specifically, it defines where the 'channels' dimension is in the input data.
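To make the position of the 'channels' dimension concrete, here is a small NumPy sketch. It assumes a hypothetical batch of 2 RGB images of 4x5 pixels and does not use Keras itself; the variable and function names are illustrative only. With "th" ordering the channels come right after the batch dimension, while with "tf" ordering they come last.

```python
import numpy as np

# Hypothetical batch: 2 images, 3 channels (RGB), 4 pixels high, 5 wide.
batch, channels, height, width = 2, 3, 4, 5
num_values = batch * channels * height * width

# "th" ordering (Theano): (batch, channels, height, width)
images_th = np.arange(num_values, dtype=np.float32).reshape(
    batch, channels, height, width)

# "tf" ordering (TensorFlow): (batch, height, width, channels).
# Moving the channels axis to the end converts "th" data to "tf" data.
images_tf = np.transpose(images_th, (0, 2, 3, 1))

def tf_to_th(images):
    """Move the channels axis back to position 1 ("tf" -> "th")."""
    return np.transpose(images, (0, 3, 1, 2))

print(images_th.shape)  # (2, 3, 4, 5)
print(images_tf.shape)  # (2, 4, 5, 3)

# The same pixel is addressed differently under each ordering:
# image 0, channel 0, row 1, column 2.
print(images_th[0, 0, 1, 2] == images_tf[0, 1, 2, 0])  # True
```

Feeding data in one ordering to a model configured for the other typically shows up as a shape mismatch error, or as silently wrong results when the dimensions happen to be compatible, which is why the setting has to match the data.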