Saturday, November 5, 2016

What is the image_dim_ordering parameter in Keras, and why is it important?

Update 9/May/2017: With Keras v2, the image_dim_ordering parameter has been renamed to image_data_format. Check my updated post on how to configure it.

If you remember my earlier post about switching Keras between the TensorFlow and Theano backends, you will have seen that we also changed the image_dim_ordering parameter when switching the backend. For TensorFlow, image_dim_ordering should be "tf", while for Theano, it should be "th".

So, what is this parameter, and what does it affect?

It has to do with how each backend treats the data dimensions when working with multi-dimensional convolution layers (such as Convolution2D, Convolution3D, UpSampling2D, Cropping2D, … and any other 2D or 3D layer). Specifically, it defines where the 'channels' dimension is in the input data.

For 2D layers, both TensorFlow and Theano expect a 4-dimensional tensor as input. TensorFlow expects the 'channels' dimension last (index 3, counting from index 0), i.e. a tensor with shape (samples, rows, cols, channels). Theano expects 'channels' as the second dimension (index 1), i.e. a tensor with shape (samples, channels, rows, cols). The outputs of the convolutional layers follow the same pattern.
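To make the two layouts concrete, here is a minimal NumPy sketch that converts a batch of images from one ordering to the other. The batch size and image dimensions are made up for illustration, and NumPy is my choice here, not something Keras requires you to call yourself.

```python
import numpy as np

# A hypothetical batch of 8 RGB images, 32 rows by 48 columns,
# in TensorFlow ordering: (samples, rows, cols, channels).
batch_tf = np.arange(8 * 32 * 48 * 3, dtype="float32").reshape(8, 32, 48, 3)

# Move the channels axis from index 3 to index 1 to get
# Theano ordering: (samples, channels, rows, cols).
batch_th = np.transpose(batch_tf, (0, 3, 1, 2))

# And back again: move the channels axis from index 1 to index 3.
batch_tf_again = np.transpose(batch_th, (0, 2, 3, 1))

print(batch_tf.shape)        # TensorFlow ordering
print(batch_th.shape)        # Theano ordering
print(batch_tf_again.shape)  # TensorFlow ordering again
```

Only the axis order changes. The pixel at (sample, row, col, channel) in the first array is the same value as the one at (sample, channel, row, col) in the second.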

So, the image_dim_ordering parameter, once set in ~/.keras/keras.json, tells Keras which dimension ordering to use by default in its convolutional layers.
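If you would rather script the change than edit the file by hand, something like the following sketch could work. The helper name set_dim_ordering is hypothetical, and the other keys in the demo config ("backend", "floatx", "epsilon") are what I would expect to find in a typical Keras 1 keras.json, so treat them as assumptions. The demo writes to a throwaway temp file instead of the real ~/.keras/keras.json.

```python
import json
import os
import tempfile


def set_dim_ordering(config_path, ordering):
    """Hypothetical helper: set image_dim_ordering in a keras.json-style file."""
    if ordering not in ("tf", "th"):
        raise ValueError("ordering must be 'tf' or 'th'")
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as f:
            config = json.load(f)
    config["image_dim_ordering"] = ordering
    with open(config_path, "w") as f:
        json.dump(config, f, indent=4)
    return config


# Demo on a temporary file; the key names below are assumed, not read from Keras.
demo_path = os.path.join(tempfile.mkdtemp(), "keras.json")
with open(demo_path, "w") as f:
    json.dump({"backend": "theano", "floatx": "float32",
               "epsilon": 1e-07, "image_dim_ordering": "tf"}, f)

updated = set_dim_ordering(demo_path, "th")
print(updated["image_dim_ordering"])
```

To apply it to your real configuration, you would pass os.path.expanduser("~/.keras/keras.json") as the path. Back up the file first.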

However, if you would like to override the dimension ordering programmatically, you can do so with the dim_ordering parameter when initializing a convolutional layer:
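 from keras.models import Sequential
 from keras.layers import Convolution2D

 model = Sequential()
 # Keras 1 API. With dim_ordering='th', input_shape is (channels, rows, cols).
 model.add(Convolution2D(64, 3, 3, border_mode='same', input_shape=(3, 256, 256), dim_ordering='th'))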

The dim_ordering parameter is available in all the multi-dimensional convolution layers.
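Keep in mind that the input_shape you pass has to match the ordering you pick. As a small sketch, a helper like the one below (input_shape_for is a hypothetical name, not part of Keras) could build the right tuple for either ordering:

```python
def input_shape_for(dim_ordering, rows, cols, channels):
    """Hypothetical helper: per-sample input shape for a 2D layer."""
    if dim_ordering == "tf":
        # TensorFlow ordering: channels last.
        return (rows, cols, channels)
    if dim_ordering == "th":
        # Theano ordering: channels first.
        return (channels, rows, cols)
    raise ValueError("dim_ordering must be 'tf' or 'th'")


shape_th = input_shape_for("th", 256, 256, 3)
shape_tf = input_shape_for("tf", 256, 256, 3)
print(shape_th)
print(shape_tf)
```

The samples dimension is left out, since input_shape in Keras describes a single sample.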

Related posts:
image_data_format vs. image_dim_ordering in Keras v2

Related links:
https://keras.io/layers/convolutional/#convolution2d



2 comments:

  1. "TensorFlow expects the 'channels' dimension to be at index 4 of the tensor – i.e. tensor with shape (samples, rows, cols, channels) – Theano will expect 'channels' at index 1 – i.e. tensor with shape (samples, channels, rows, cols)"

    At the risk of sounding pedantic, it seems to me that you're using two different indexing techniques (base 1 and base 0). I believe you mean to say one of the following:

    a) TF uses index 4, and Theano uses index 2
    b) TF uses index 3, and Theano uses index 1

    1. Yes, you're right. It sounds confusing.
      I have now updated it to (hopefully) make it clear.

      Thanks,
