Friday, December 9, 2016

What is a Tensor?

Whether you are new to the field of Deep Learning, or coming from a bit of a Machine Learning background and just dipping your toes into Deep Learning frameworks, you may have come across the word "Tensor". Especially with the increasing popularity of TensorFlow (the 2nd generation Machine Learning library by the Google Brain Team), the concept of the "Tensor" has also gained popularity.

You may be thinking that a Tensor is probably a mysterious and complicated thing, used exclusively in Machine Learning algorithms. But in reality, a Tensor is something quite simple and common.

So, what is a Tensor?

A representation of a Tensor
Image courtesy: Wikipedia -

A Tensor is simply a data structure: a more generic representation of the data collections we see in programming languages. However, unlike most other data structures, the position of a value in a Tensor typically has a meaning (we'll get to that in a bit).

A Tensor may have zero or more dimensions, which is given as the 'Order' of a Tensor.

A scalar, or a single number, is a Tensor of order zero, which can also be thought of as an array of dimension zero.

A Tensor of order one would be a Vector, or a one-dimensional array.

A Tensor of order two would be a matrix, or a two-dimensional array.

And so on... Tensors of any order can be made.
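As a rough illustration, here is how the orders above look in NumPy, where the ndim attribute of an array reports its order. Using NumPy is my assumption for this sketch (the definitions above don't depend on it), and the sample values are made up.

```python
import numpy as np

scalar = np.array(5.0)                 # order 0: a single number
vector = np.array([1.0, 2.0, 3.0])     # order 1: a one-dimensional array
matrix = np.array([[1, 2], [3, 4]])    # order 2: a two-dimensional array
cube = np.zeros((2, 3, 4))             # order 3: and so on...

# ndim gives the number of dimensions, i.e. the order of each tensor
orders = [t.ndim for t in (scalar, vector, matrix, cube)]
print(orders)
```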

As I mentioned above, in typical use cases, the position of a value in a tensor - commonly referred to as the coordinates - has a meaning, and the value would transform according to the rules we define when the coordinates are changed.

For example, we can define a tensor of order two which contains the pixel values of a grayscale image. In this case, the coordinates of the values of the tensor would represent the X and Y coordinates of the pixels of the image, and the values themselves would be the grayscale values at those coordinates. And, as you would expect, if you transform the coordinates, the values would also change.
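Here is a small sketch of the grayscale image idea, again assuming NumPy. The 2x3 "image" and its pixel values are invented for illustration, and swapping the X and Y axes (a transpose) stands in for a coordinate transform.

```python
import numpy as np

# A made-up 2x3 grayscale "image": rows are Y, columns are X.
image = np.array([[0, 128, 255],
                  [64, 32, 16]], dtype=np.uint8)

y, x = 0, 2
value_before = image[y, x]        # the pixel at X=2, Y=0

# Swapping the X and Y axes is a simple coordinate transform.
transposed = image.T              # shape goes from (2, 3) to (3, 2)
value_after = transposed[x, y]    # the same pixel, found at the swapped coordinates

print(image.shape, transposed.shape)
print(value_before, value_after)
# The same coordinates now hold a different value:
print(image[0, 1], transposed[0, 1])
```

The pixel itself has not changed, but it now lives at different coordinates, so reading the old coordinates gives a different value.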


Saturday, November 19, 2016

Setting up Keras and Anaconda Python on Ubuntu 16.10

I’ve been using Anaconda Python for most of my Machine Learning experiments, mainly because of the flexibility it gives with the isolated Python environments. I recently did a post on how to install Keras on Anaconda on Windows.

I’m planning to switch to Linux for a few of my experiments, so I decided to try out setting up Anaconda Python and Keras from scratch on Ubuntu. I’ll be using the latest Ubuntu 16.10 (Yakkety Yak) 64-Bit for this.

Note: The screenshots I captured are from a virtual machine with Lubuntu 16.10 (the LXDE flavor of Ubuntu). But the steps and commands are exactly the same for the standard Ubuntu desktop as well.

First and foremost, get and install the latest updates in Ubuntu. (Reboot the machine if necessary after updating.)
 sudo apt-get update  
 sudo apt-get upgrade  

Then, we’ll install the following necessary packages,
 sudo apt-get install build-essential cmake git unzip pkg-config  
 sudo apt-get install libopenblas-dev liblapack-dev  

Now, on to installing Anaconda. Head over to the Anaconda Python Downloads page, and get the Linux installer for Anaconda. We’ll be getting the Python 3.5 64-Bit package.
Download the Anaconda Python 3.5 64-Bit package for Linux

This will download a file named (the version numbers might be different based on the latest version available at the time of the download).

Saturday, November 12, 2016

Getting the LeNet model working with Face Recognition

In my last post, I talked about how the LeNet Convolutional Neural Network model is capable of handling much more complex data than the intended MNIST dataset. We saw how it got ~99% accuracy when it learned to identify 10 faces from the raw pixel intensities.

So, let’s see the code I used to get it working.

First of all, I needed a training dataset. For that, I created a set of face images of 10 subjects with around 500 images each.

Few of the images from the training dataset
The training dataset (yep, that's my face)

I use a file naming convention as <subject_label>-<subject_name>-<unique_number>.jpg (e.g. 0-Thimira-1475137898.65.jpg) for the training images to make it easier to read in and get the metadata of the images in one go. (I will do a separate post on how to easily create training datasets of face images like this).
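To show why this naming convention is convenient, here is a minimal sketch of pulling the metadata out of a file name. The function name parse_training_filename is hypothetical (it is not from the original code), and it assumes subject names never contain a hyphen.

```python
import os

def parse_training_filename(path):
    """Split '<subject_label>-<subject_name>-<unique_number>.jpg' into its parts."""
    base = os.path.basename(path)
    stem, _ext = os.path.splitext(base)      # drops only the final '.jpg'
    # Split on the first two hyphens only (assumes no hyphen in the subject name).
    label, name, unique = stem.split("-", 2)
    return int(label), name, unique

print(parse_training_filename("0-Thimira-1475137898.65.jpg"))
```

The label comes back as an integer so it can be used directly as the class index, while the unique number is kept as a string since it only serves to keep file names distinct.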

We'll mainly be using Keras to build the model, and scikit-learn for some utility functions. We’ll need to import the following packages,
 from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20  
 from keras.optimizers import SGD  
 from keras.utils import np_utils  
 import numpy as np  
 import argparse  
 import cv2  
 import os  
 import sys  
 from PIL import Image  

Monday, November 7, 2016

Can the LeNet model handle Face Recognition?

I recently followed a blog post by Adrian Rosebrock at PyImageSearch on using the LeNet Convolutional Neural Network model on the MNIST dataset (i.e. for handwritten digit recognition), using Keras with the Theano backend. I was able to easily try it out thanks to the very detailed and well thought out guide.

The LeNet model itself is quite simple, just 5 layers. Yet it performs impressively well on the MNIST dataset. We can get around 98% accuracy with just 20 iterations of training with ease.

The training time for the model is also quite low. I tested on my MSI GE60 2PF Apache Pro laptop with CUDA enabled, and the training time was just 2 minutes 20 seconds on average. On CPU only (with CUDA disabled) it took around 30 minutes.

LeNet giving 98% accuracy on MNIST data
As you can see, we got 98.11% accuracy, and it has correctly classified a digit that has been cut-off.

It even classifies a quite deformed '2' correctly.
LeNet correctly classifying a deformed digit

Saturday, November 5, 2016

What is the image_dim_ordering parameter in Keras, and why is it important

Update 9/May/2017: With Keras v2, the image_dim_ordering parameter has been renamed to image_data_format. Check my updated post on how to configure it.

If you remember my earlier post about switching Keras between TensorFlow and Theano backends, you would have seen that we switched the image_dim_ordering parameter also when switching the backend. For TensorFlow, image_dim_ordering should be "tf", while for Theano, it should be "th".

So, what is this parameter, and what does it affect?

It has to do with how each of the backends treats the data dimensions when working with multi-dimensional convolution layers (such as Convolution2D, Convolution3D, UpSampling2D, Cropping2D, … and any other 2D or 3D layer). Specifically, it defines where the 'channels' dimension is in the input data.

Both TensorFlow and Theano expect a 4D tensor as input. But where TensorFlow expects the 'channels' dimension to be the last one, at index 3 of the tensor counting from zero (i.e. a tensor with shape (samples, rows, cols, channels)), Theano expects 'channels' at index 1 (i.e. a tensor with shape (samples, channels, rows, cols)). The outputs of the convolutional layers will also follow this pattern.
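To make the two layouts concrete, here is a small NumPy sketch that converts a batch of images from one ordering to the other. The batch size and image size are made up for illustration, and this only rearranges the data; it does not touch any Keras setting.

```python
import numpy as np

# A made-up batch of 2 RGB images, 256x256, in TensorFlow ("tf") ordering.
batch_tf = np.zeros((2, 256, 256, 3), dtype=np.float32)   # (samples, rows, cols, channels)

# Move 'channels' from the last axis to axis 1 for Theano ("th") ordering.
batch_th = np.transpose(batch_tf, (0, 3, 1, 2))           # (samples, channels, rows, cols)

# And back again.
round_trip = np.transpose(batch_th, (0, 2, 3, 1))

print(batch_tf.shape, batch_th.shape, round_trip.shape)
```

If your data is stored in one ordering and Keras is configured for the other, a transpose like this is one way to bring them in line.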

So, the image_dim_ordering parameter, once set in ~/.keras/keras.json, will tell Keras which dimension ordering to use in its convolutional layers. 

However, if you would like to override the dimension ordering programmatically, you can do so by using the dim_ordering parameter when initializing a convolutional layer:
 from keras.models import Sequential  
 from keras.layers import Convolution2D  
 model = Sequential()  
 model.add(Convolution2D(64, 3, 3, border_mode='same', input_shape=(3, 256, 256), dim_ordering='th'))  

The dim_ordering parameter is available in all the multi-dimensional convolution layers.

Related posts:
image_data_format vs. image_dim_ordering in Keras v2


Thursday, November 3, 2016

Difference between Artificial Intelligence, Machine Learning, and Deep Learning

Update: Check out the new and updated article on What is Deep Learning, and how it relates to Artificial Intelligence and Machine Learning.


You may have heard the terms Artificial Intelligence, Machine Learning, and Deep Learning, and you may be trying to figure out what they mean, and whether these terms can be used interchangeably.

I also had the same questions when I started diving into the field. And a recent post in the Nvidia Blog brought back the question.

So here’s a simplified explanation on how each of those terms came to be, and how they relate to each other.

Artificial Intelligence


Artificial Intelligence is the idea that machines (or computers) can be built with intelligence equal to (or greater than) that of a human, giving them the capability to perform tasks that require human intelligence.

The idea of an intelligent machine has been around since 1300 BC, and through the 19th century. But the Dartmouth Conferences in 1956 are commonly considered the starting point of the formal research field of Artificial Intelligence. Since then, the field of AI has gone through many ups and downs and has branched out into many sub-fields. There have been attempts at applying AI in various fields (such as medicine, finance, aviation, and machinery) with varying degrees of success.

Around the late 1990s and early 2000s, researchers identified a problem in their approach to AI that was slowing down its progress: in order to artificially create a machine with intelligence, we would first need to understand how intelligence works. But even today, we do not have a complete definition of what we call "intelligence".

In order to tackle the problem, they decided to go from the ground up: rather than trying to build an intelligence, we could look into building a system that can grow its own intelligence. This idea created the new sub-field of AI called Machine Learning.

Tuesday, November 1, 2016

Switching between TensorFlow and Theano on Keras

Keras speeds up the task of building Neural Networks by providing high-level, simplified functions to create and manipulate neural models. It does not provide the lower-level neural and deep learning functions itself; rather, it’s meant to be run on an engine (which Keras refers to as a “backend”) that provides such low-level functions.

Currently, Keras supports two such backends – TensorFlow and Theano.

The current version of Keras (v1.1.0 at the time of this writing) uses TensorFlow by default.

Most models written on top of Keras can be switched to a different backend without changes – at least it’s what’s said in the documentation. I’m yet to test this.

Which backend Keras will use is defined in the Keras config file, which is located in the .keras directory in your home directory:
e.g.: on Linux it would be ~/.keras/keras.json, and on Windows you can find it at %USERPROFILE%\.keras\keras.json

For the default of using the TensorFlow backend, use the following config,
 {  
   "image_dim_ordering": "tf",  
   "epsilon": 1e-07,  
   "floatx": "float32",  
   "backend": "tensorflow"  
 }  

Notice the "backend" is set to "tensorflow" and "image_dim_ordering" is set to "tf".

To use the Theano backend, use the following,
 {  
   "image_dim_ordering": "th",  
   "epsilon": 1e-07,  
   "floatx": "float32",  
   "backend": "theano"  
 }  

Apart from the obvious "backend": "theano", note that "image_dim_ordering" is set to "th".
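If you switch often, editing the file by hand gets tedious. Here is a minimal sketch of doing the same edit from Python, using only the standard library. The function name switch_backend is hypothetical (it is not part of Keras), and it assumes the config file already exists with the keys shown above.

```python
import json

# The dimension ordering that goes with each backend, as described above.
DIM_ORDERING = {"tensorflow": "tf", "theano": "th"}

def switch_backend(config_path, backend):
    """Rewrite the Keras config file at config_path to use the given backend."""
    if backend not in DIM_ORDERING:
        raise ValueError("unknown backend: %r" % backend)
    with open(config_path) as f:
        config = json.load(f)
    # Change both settings together so they never get out of step.
    config["backend"] = backend
    config["image_dim_ordering"] = DIM_ORDERING[backend]
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config

# Example call (commented out so the sketch does not modify your real config):
# import os
# switch_backend(os.path.expanduser("~/.keras/keras.json"), "theano")
```

Keras reads the config when it is imported, so the change takes effect the next time you start Python and import Keras.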

See my new post to see what the image_dim_ordering parameter in Keras does, and why it is important to set it properly.

Update: If you use Jupyter notebooks, and need to switch between TensorFlow and Theano backends quite often, fellow blogger desertnaut has a solution to dynamically switch the backend. Check out his solution at: Dynamically switch Keras backend in Jupyter notebooks

Related posts:
What is the image_dim_ordering parameter in Keras, and why is it important


Saturday, October 29, 2016

Getting Dlib Face Landmark Detection working with OpenCV

Dlib has excellent Face Detection and Face Landmark Detection algorithms built-in. Its face detection is based on the Histogram of Oriented Gradients (HOG) feature combined with a linear classifier, on a sliding window detection scheme, and it provides pre-trained models for face landmark detection. It also provides handy utility functions like dlib.get_frontal_face_detector() to make our lives easier.

Dlib Face Landmark Detection in action
Note: Image used for testing is in the Public Domain -

To check out Dlib with its native functions, you can try out the Dlib example from the official site. It works well, but we can do better.

Although Dlib offers all the simplicity in implementing face landmark detection, it's still no match for the flexibility of OpenCV. (Simply put, Dlib is a library for Machine Learning, while OpenCV is for Computer Vision and Image Processing)

So, can we use Dlib face landmark detection functionality in an OpenCV context? Yes, here's how.
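A minimal sketch of the bridging step, under stated assumptions: Dlib's landmark result exposes num_parts and part(i) with x and y attributes, while OpenCV works with NumPy arrays, so copying the points into an array lets the two talk to each other. The helper name landmarks_to_array is hypothetical, and this is one way to do the conversion, not necessarily the one used in the full walkthrough.

```python
import numpy as np

def landmarks_to_array(shape):
    """Copy the (x, y) points of a Dlib landmark result into an (N, 2) integer array."""
    points = np.zeros((shape.num_parts, 2), dtype=np.int32)
    for i in range(shape.num_parts):
        point = shape.part(i)            # Dlib point with .x and .y
        points[i] = (point.x, point.y)
    return points
```

With Dlib and OpenCV installed, the array can then be used with OpenCV drawing calls, for example looping over the points and calling cv2.circle(frame, (int(x), int(y)), 2, (0, 255, 0), -1) for each one, where frame is the image being processed.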

Tuesday, October 25, 2016

Installing Dlib on Anaconda Python on Windows

Updated: 03/Jul/2017

Dlib is a Machine Learning library, primarily written in C++, that also has a Python package. It has many useful, optimized algorithms for machine learning, linear algebra, data structures, image processing, and more, available out-of-the-box.
"Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real world problems. It is used in both industry and academia in a wide range of domains including robotics, embedded devices, mobile phones, and large high performance computing environments. Dlib's open source licensing allows you to use it in any application, free of charge." -
One of the most popular features in Dlib is the Facial Landmark Detection. Dlib installation ships with a pre-trained shape predictor model named shape_predictor_68_face_landmarks.dat, which as the name suggests, is trained to detect 68 facial keypoints including eyes, eyebrows, mouth, nose, face outline etc.

Dlib's Facial Landmark Detection in action
You can view the sample code for face landmark detection here at the Dlib website, and download the pre-trained model from:
Of course, Dlib is capable of much more than face landmark detection. I'm hoping to dig into some cool features of Dlib in later posts.

But first, we need to install it.

Getting Keras working with Anaconda Python

I've started using the Anaconda Python distribution for most of my Machine Learning. It has pre-built binaries of Python for many platforms and architectures, has hundreds of pre-built and tested Python packages directly available through the conda package manager, and it allows easy creation of virtual isolated environments - with its own Python version and packages - to experiment with.

You can get an idea of the capabilities of Anaconda by going through their Anaconda Test Drive guide.

Getting Keras (with Theano backend) working on any Python distribution is usually straightforward, but you do run into some errors occasionally based on the platform you're on and your environment settings.

So, here are the steps that worked for me to get Keras working on the Anaconda Python distribution:

First, you need to install Anaconda. It's as easy as getting the binary for your platform from the Anaconda download page and running it. Once it's installed, the conda command will be available from your terminal or command prompt.

Now you can create an anaconda environment to install Keras and related packages,
 conda create --name keras-test numpy scipy scikit-learn pillow h5py mingw libpython  

'keras-test' is the name of the environment we're creating. You can give it a different name.
You can also create an environment with a different Python version. For example, if you want to create the environment with Python 2.7,
 conda create --name keras-test python=2.7 numpy scipy scikit-learn pillow h5py mingw libpython  

Once the environment is created, activate it.
 activate keras-test  

Then, we'll install Theano from Git, since we want the latest development version,
 pip install --upgrade --no-deps git+git://  

And then, we install Keras from PIP,
 pip install keras  

Finally, we set up OpenBLAS and configure Theano to use it. My earlier blog post, Getting Theano working with OpenBLAS on Windows, covers how to set up Theano with OpenBLAS in detail.

We can test whether the setup was successful by running the Python interpreter and importing Keras package,
 >>> import keras  
 Using Theano backend.  

Keras loading successfully

If you don't get any errors when the Keras package is loading, then all is set.
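If something does fail, it helps to know which package is missing before digging further. Here is a small sketch using only the standard library. The helper name missing_packages is hypothetical, and the list uses import names (e.g. sklearn, PIL), which differ from some of the conda package names used above.

```python
import importlib.util

def missing_packages(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names (not conda package names) for what we installed above.
wanted = ["numpy", "scipy", "sklearn", "PIL", "h5py", "theano", "keras"]
print("Missing:", missing_packages(wanted))
```

Run it inside the activated keras-test environment. An empty list only means the packages can be located; importing Keras as shown above is still the real test.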

Related posts:
Switching between TensorFlow and Theano on Keras
What is the image_dim_ordering parameter in Keras, and why is it important


Friday, October 21, 2016

Working Theano configs

Here are the Theano configurations that I have tested and found to work.
These were tested on Windows 10 64-Bit, and Windows 7 64-Bit.
(I will update when I test on other OS's and setups)

With GPU support, on CUDA and cuDNN

In order to allow Theano to use the GPU, you need to be on a machine with a supported Nvidia GPU, and have the CUDA toolkit and cuDNN set up. I will cover how to set up CUDA in a separate post.

 [global]  
 floatX = float32  
 device = gpu  
 [nvcc]  
 compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin  
 [dnn]  
 enabled = True  
 [blas]  
 ldflags=-LC:\Dev_Tools\openblas\bin -lopenblas  

[global] device = gpu - tells Theano to use the GPU instead of the CPU.
[nvcc] flags=-LC:\Users\Thimira\Anaconda3 - point this to your Python installation (I'm using Anaconda Python).
[nvcc] compiler_bindir=C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\bin - point this to the bin directory of your Visual Studio installation (Note: CUDA only worked with Visual Studio 2013 for me).
[dnn] enabled = True - this enables cuDNN.
[lib] cnmem=0.75 - sets the limit on how much GPU memory Theano can use. Here it's set to 75% of the GPU memory.
[blas] ldflags=-LC:\Dev_Tools\openblas\bin -lopenblas - point this to your OpenBLAS installation. Refer to my earlier post, Getting Theano working with OpenBLAS on Windows.

With only CPU support

Not everyone has a compatible Nvidia GPU for CUDA, so here is a CPU-only configuration.

 [global]  
 floatX = float32  
 device = cpu  
 [blas]  
 ldflags=-LC:\Dev_Tools\openblas\bin -lopenblas  

device = cpu tells Theano to use the CPU.
ldflags=-LC:\Dev_Tools\openblas\bin -lopenblas point to your OpenBLAS installation. Refer to my earlier post Getting Theano working with OpenBLAS on Windows
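If you would rather generate the file than type it, here is a minimal sketch that builds the CPU-only config text with Python's configparser. The function name build_theanorc is hypothetical. It uses the [global] and [blas] section names from Theano's config format and the OpenBLAS path from my own setup, so adjust both the path and the device to match yours.

```python
import configparser
import io

def build_theanorc(device, openblas_bin):
    """Return .theanorc text for the given device and OpenBLAS bin directory."""
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str   # keep 'floatX' as written; the default lower-cases keys
    config["global"] = {"floatX": "float32", "device": device}
    config["blas"] = {"ldflags": "-L%s -lopenblas" % openblas_bin}
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()

print(build_theanorc("cpu", r"C:\Dev_Tools\openblas\bin"))
```

The function only returns the text; saving it as .theanorc in your home directory is left as a separate step so nothing gets overwritten by accident.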

Thursday, October 20, 2016

Getting Theano working with OpenBLAS on Windows

I wanted to try out Machine Learning with Python, so my first choice was Keras with Theano.

I installed Theano from Git (to get the latest development version):
 pip install --upgrade --no-deps git+git://  

Then, I needed to set up Theano with OpenBLAS (otherwise, training Keras models was painfully slow).
Since I was on Windows, I had to look around for instructions on how to set up OpenBLAS properly.

Luckily, OpenBLAS provides binaries for Windows - both 32Bit and 64Bit - although, they may not be for the latest version of OpenBLAS.

Head over to the OpenBLAS download page and see which release has the binaries already built for Windows. We need both OpenBLAS and MinGW binaries.

At the time of this writing the latest version of OpenBLAS was v0.2.19, which unfortunately doesn't have the Windows binaries released.

But, going back a few releases, we find that the release v0.2.15 includes the binaries - and

Download both of the Zip files, and first extract the OpenBLAS Zip to a globally accessible location on your hard disk. (I would suggest a location such as C:\Dev_Tools\openblas\).
Then, extract the mingw Zip, and copy its contents to the bin directory of your extracted OpenBLAS directory. e.g. If you extracted OpenBLAS to C:\Dev_Tools\openblas\, then copy the contents (3 DLL files) of mingw to C:\Dev_Tools\openblas\bin\.
i.e.: The extracted openblas\bin will have libopenblas.dll in it. When you extract mingw, it will have 3 more DLLs: libgcc_s_seh-1.dll, libgfortran-3.dll, and libquadmath-0.dll. Copy those to openblas\bin as well.

Then, add the openblas\bin directory to your system path.
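Before moving on, it is worth checking that all four DLLs ended up in the right place and that the directory really is on the path. Here is a small sketch using only the standard library. The helper names missing_dlls and on_system_path are hypothetical, and the directory is the one suggested above, so change it if you extracted OpenBLAS elsewhere.

```python
import os

# The OpenBLAS DLL plus the three DLLs copied over from the mingw Zip.
EXPECTED_DLLS = ["libopenblas.dll", "libgcc_s_seh-1.dll",
                 "libgfortran-3.dll", "libquadmath-0.dll"]

def missing_dlls(bin_dir):
    """List the expected DLLs that are not present in bin_dir."""
    return [dll for dll in EXPECTED_DLLS
            if not os.path.isfile(os.path.join(bin_dir, dll))]

def on_system_path(bin_dir):
    """Return True if bin_dir appears in the PATH environment variable."""
    target = os.path.normcase(os.path.normpath(bin_dir))
    entries = os.environ.get("PATH", "").split(os.pathsep)
    return any(os.path.normcase(os.path.normpath(entry)) == target
               for entry in entries if entry)

bin_dir = r"C:\Dev_Tools\openblas\bin"
print("Missing DLLs:", missing_dlls(bin_dir))
print("On PATH:", on_system_path(bin_dir))
```

Open a new command prompt before running it, since changes to the system path are not picked up by prompts that were already open.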

Finally, edit (or create) your .theanorc file with the following settings: (assuming you extracted OpenBLAS to C:\Dev_Tools\openblas\)
Note: If you don't already have a .theanorc file, create a file named .theanorc in the home directory of your user account, e.g. C:\Users\<your user>\.theanorc

 [global]  
 floatX = float32  
 device = cpu  
 [blas]  
 ldflags=-LC:\Dev_Tools\openblas\bin -lopenblas  

Now, run your Keras/Theano program and see whether Theano picks up OpenBLAS.
