Tuesday, August 8, 2017

Using Bottleneck Features for Multi-Class Classification in Keras and TensorFlow

Training an image classification model - even with Deep Learning - is not an easy task. Getting sufficient accuracy without overfitting requires a lot of training data. If you try to train a deep learning model from scratch, and hope to build a classification system with a level of capability similar to an ImageNet-level model, then you'll need a dataset of about a million training examples (plus validation examples). Needless to say, such a dataset is not easy to acquire or build in practice.

So, is there any hope for us to build a good image classification system ourselves?

Yes, there is!

Luckily, Deep Learning supports an immensely useful feature called 'Transfer Learning'. Basically, you are able to take a pre-trained deep learning model - which is trained on a large-scale dataset such as ImageNet - and re-purpose it to handle an entirely different problem. The idea is that since the model has already learned certain features from a large dataset, it may be able to use those features as a base to learn the particular classification problem we present it with.

This task is further simplified since popular deep learning models such as VGG16, along with their pre-trained ImageNet weights, are readily available. The Keras framework even has them built in, in the keras.applications package.

An image classification system built with transfer learning


The basic technique to get transfer learning working is to take a pre-trained model (with the weights loaded) and remove the final fully-connected layers from that model. We then use the remaining portion of the model as a feature extractor for our smaller dataset. These extracted features are called "Bottleneck Features" (i.e. the last activation maps before the fully-connected layers in the original model). We then train a small fully-connected network on those extracted bottleneck features in order to get the classes we need as outputs for our problem.
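As a rough illustration of the two stages described above, here is a minimal sketch using VGG16 from keras.applications. The function names, the size of the dense layer, the dropout rate, and the optimizer are assumptions made for this example, not values taken from the post.

```python
import numpy as np
from tensorflow import keras


def extract_bottleneck_features(images, weights="imagenet"):
    """Run images through VGG16 with its fully-connected top removed."""
    # include_top=False drops the final fully-connected layers, so the
    # output is the last activation map of the convolutional part.
    base = keras.applications.VGG16(
        include_top=False, weights=weights, input_shape=images.shape[1:])
    # preprocess_input expects pixel values in the 0-255 range.
    x = keras.applications.vgg16.preprocess_input(images.astype("float32"))
    return base.predict(x, verbose=0)


def build_top_model(feature_shape, num_classes):
    """Small fully-connected classifier trained on the bottleneck features.

    The layer sizes here are illustrative assumptions.
    """
    model = keras.Sequential([
        keras.Input(shape=feature_shape),
        keras.layers.Flatten(),
        keras.layers.Dense(256, activation="relu"),
        keras.layers.Dropout(0.5),
        keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="rmsprop",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

With your own data, you would call extract_bottleneck_features once on the training images, then fit the model returned by build_top_model on those features and their one-hot labels. Since the convolutional base is only run once, training the small top model is fast.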

Sunday, July 30, 2017

Need More Fonts on OpenCV?

OpenCV has a simple built-in function to add text to your images - the cv2.putText() function. With just one line of code, you can add text anywhere on the image. At a minimum, you need to specify the position, the colour, the scale (font size), and which font to use.

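 import cv2
 import numpy as np

 image = np.zeros((100, 400, 3), dtype=np.uint8)  # blank black canvas
 text_to_show = "Hello, OpenCV!"
 cv2.putText(image,
            text_to_show,
            (20, 40),  # bottom-left corner of the text
            fontFace=cv2.FONT_HERSHEY_SIMPLEX,
            fontScale=1,
            color=(255, 255, 255))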

OpenCV also gives you a choice from a handful of fonts - all variants of the "Hershey" font.

But there may come a point where you want more fonts. Have you ever wished that you could use a specific TrueType or OpenType font with OpenCV?

The good news is, it's possible.

True Type Fonts working on OpenCV
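One way to get there (an assumption for this sketch, and not the only option) is to let Pillow draw the text, since Pillow can load TrueType and OpenType files, and then hand the result back to OpenCV as a NumPy array. The helper name put_text_with_font is made up for this example.

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont


def put_text_with_font(image_bgr, text, position, font,
                       color_bgr=(255, 255, 255)):
    """Draw text on a BGR image with a Pillow font and return a new image."""
    # OpenCV stores channels as BGR, while Pillow expects RGB.
    pil_image = Image.fromarray(np.ascontiguousarray(image_bgr[:, :, ::-1]))
    draw = ImageDraw.Draw(pil_image)
    draw.text(position, text, font=font, fill=tuple(color_bgr[::-1]))
    # Convert back to a BGR array that OpenCV functions can use.
    return np.ascontiguousarray(np.array(pil_image)[:, :, ::-1])
```

To use a specific font, load it with ImageFont.truetype("path/to/your_font.ttf", 32), where the path is a placeholder for a font file on your machine, and pass the result as the font argument. Note that Pillow treats the position as the top-left corner of the text, whereas cv2.putText() uses the bottom-left corner.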

Friday, July 21, 2017

Snapchat like Image Overlays with Dlib, OpenCV, and Python

You're probably familiar with Snapchat, and its filters feature where you can put some cool and funny image overlays on images of your face. As computer vision enthusiasts, we typically look at applications like these, and try to understand how they work, and whether we can build something similar.

It turns out we can make our own application with Snapchat-like image overlays using Python, OpenCV, and Dlib.

Snapchat like Image Overlays with Dlib, OpenCV, and Python

So, how do we build it?
  1. We'll first load the Webcam feed using OpenCV.
  2. We'll load an image (in our example, an image for the 'eye') to be used as the overlay.
  3. Use Dlib's face detection to localize the faces, and then use facial landmarks to find where the eyes are.
  4. Calculate the size and the position of the overlay for each eye.
  5. Finally, place the overlay image over each eye, resized to the correct size.

Let's start.
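As a rough sketch of steps 4 and 5 in the list above, the snippet below works out a square box around one eye from the facial landmarks and alpha-blends an overlay into it. It assumes the 68-point landmark layout commonly used with Dlib (points 36-41 for one eye, 42-47 for the other) and a BGRA overlay image. The helper names and the scale factor are made up for this example, and the tiny NumPy resize only stands in for cv2.resize so that the snippet has no dependencies beyond NumPy.

```python
import numpy as np

# Eye landmark indices in the 68-point layout (assumed here).
EYE_LANDMARKS = (range(36, 42), range(42, 48))


def eye_box(landmarks, indices, scale=1.5):
    """Return (x, y, size) of a square box centred on one eye."""
    pts = np.asarray([landmarks[i] for i in indices], dtype=float)
    center = pts.mean(axis=0)
    width = pts[:, 0].max() - pts[:, 0].min()
    size = max(1, int(round(width * scale)))
    x = int(round(center[0] - size / 2))
    y = int(round(center[1] - size / 2))
    return x, y, size


def resize_nearest(img, size):
    """Nearest-neighbour resize to size x size (cv2.resize also works)."""
    rows = np.arange(size) * img.shape[0] // size
    cols = np.arange(size) * img.shape[1] // size
    return img[rows][:, cols]


def place_overlay(frame, overlay_bgra, x, y):
    """Alpha-blend a BGRA overlay onto a BGR frame, clipping at the borders."""
    h, w = overlay_bgra.shape[:2]
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, frame.shape[1]), min(y + h, frame.shape[0])
    if x0 >= x1 or y0 >= y1:
        return frame  # overlay lies completely outside the frame
    patch = overlay_bgra[y0 - y:y1 - y, x0 - x:x1 - x].astype(float)
    alpha = patch[:, :, 3:4] / 255.0
    region = frame[y0:y1, x0:x1].astype(float)
    blended = alpha * patch[:, :, :3] + (1 - alpha) * region
    frame[y0:y1, x0:x1] = blended.astype(frame.dtype)
    return frame
```

For each detected face, you would call eye_box once per entry in EYE_LANDMARKS, resize the overlay to the returned size, and pass it to place_overlay along with the current webcam frame.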

Tuesday, July 11, 2017

Codes of Interest Facebook Community is now Live!

We are on Facebook!

The Codes of Interest Page is now live on Facebook. I created the page so that our community can come together to share ideas, discuss questions, quickly address issues you face with your Deep Learning / Machine Learning and Computer Vision experiments, and talk about what you would like to see from the Codes of Interest site.

The Codes of Interest Facebook Page

... and start discussing.

Encog-Node: Simple Machine Learning on Node.js

Before I got into serious Machine Learning and Computer Vision coding (which I mostly use Python for), I did a lot of my development on Node.js. A few years back (around 2012), I was trying to add a simple neural network to one of my Node.js applications. I looked around, but couldn't find a satisfactory node module which was lightweight and flexible. Around that time, I came across the Encog Machine Learning framework, which was created by Jeff Heaton, and was one of the most popular Machine Learning libraries for Java at the time. I noticed that there was a JavaScript version of the Encog library, which worked surprisingly well, and set about porting it to Node.js.

I released the first version of Encog-Node in early 2012, and the latest version v0.3.0 is now available from NPM - https://www.npmjs.com/package/encog-node, and is recommended for anyone who wants to add lightweight, simple machine learning capabilities to their Node.js applications.

GitHub user Rui Cardoso contributed a lot to the latest release by restructuring and cleaning up the codebase, and adding more examples.

You can install it by simply running,
 npm install encog-node  
in your node application.

Friday, July 7, 2017

Milestones of Deep Learning

Deep Learning has been around for about a decade now. We talked about how Deep Learning evolved through Artificial Intelligence and Machine Learning (see "What is Deep Learning?"). Since its inception, Deep Learning has taken the world by storm due to its success. Here are some of the more significant achievements of Deep Learning throughout the years:

AlexNet - 2012


The AlexNet Architecture (Image from the research paper)

  • Proved that Convolutional Neural Networks actually work. AlexNet - and its research paper "ImageNet Classification with Deep Convolutional Neural Networks" by Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton - is commonly considered to be what brought Deep Learning into the mainstream.
  • Won the 2012 ILSVRC (ImageNet Large-Scale Visual Recognition Challenge) with a 15.3% top-5 error rate. (For reference, the 2nd best entry at ILSVRC had a 26.2% error rate.)
  • 8 layers: 5 convolutional, 3 fully connected.
  • Used ReLU for the non-linearity function rather than the conventional tanh function used until then.
  • Used Dropout layers and data augmentation to reduce overfitting.
Research Paper: ImageNet Classification with Deep Convolutional Neural Networks - Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton
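To make the "5 convolutional, 3 fully connected" structure a little more concrete, here is a small sketch that traces the feature map sizes through the convolutional part of the network. The filter counts, kernel sizes, and strides follow the paper. The padding values and the 227x227 input size are assumptions taken from common single-stream re-implementations (the paper itself states a 224x224 input and splits the network across two GPUs).

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, filters, kernel, stride, pad); filters is None for pooling layers.
# Padding values are assumptions, see the note above.
LAYERS = [
    ("conv1", 96, 11, 4, 0),
    ("pool1", None, 3, 2, 0),
    ("conv2", 256, 5, 1, 2),
    ("pool2", None, 3, 2, 0),
    ("conv3", 384, 3, 1, 1),
    ("conv4", 384, 3, 1, 1),
    ("conv5", 256, 3, 1, 1),
    ("pool5", None, 3, 2, 0),
]


def trace_shapes(input_size=227, channels=3):
    """Return (name, height, width, channels) after each layer."""
    shapes = []
    size = input_size
    for name, filters, kernel, stride, pad in LAYERS:
        size = conv_out(size, kernel, stride, pad)
        if filters is not None:
            channels = filters  # pooling keeps the channel count
        shapes.append((name, size, size, channels))
    return shapes


if __name__ == "__main__":
    for shape in trace_shapes():
        print(shape)
```

Under these assumptions, the last pooling layer produces a 6x6x256 volume, which is flattened into 9216 values and passed to the three fully connected layers (4096, 4096, and 1000 units).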

Wednesday, June 28, 2017

Machine UI : An IDE for Machine Learning, currently in Alpha

Machine UI, or just "Machine" as it's commonly referred to, is an IDE for Machine Learning, which is currently in its Alpha stage. It has been designed to work with TensorFlow, and aims to simplify setting up Machine Learning experiments so that you spend more time experimenting, and less time configuring.


The interface of Machine UI (Note: This is a screenshot from their announcement video)

As per their announcement video, the machine learning experiments are set up visually. The input data, convolutions, and the outputs are placed as nodes on a graph. You can think of it as a more interactive version of TensorBoard, which comes with TensorFlow.