Showing posts with label Troubleshooting.

Tuesday, September 22, 2020

Using model.fit() instead of fit_generator() with Data Generators - TF.Keras

If you have been using data generators in Keras, such as ImageDataGenerator, to augment and load the input data, then you would be familiar with using the *_generator() methods (fit_generator(), evaluate_generator(), etc.) to pass the generators when training the model.

But recently, if you have switched to TensorFlow 2.1 or later (and tf.keras), you might have been getting a warning message such as,

Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.

Or,

Model.evaluate_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.evaluate, which supports generators.


fit_generator() Deprecation Warning

This is because in tf.keras, as well as the latest version of multi-backend Keras, the model.fit() function can take generators as well. 
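
For example, switching a typical training call over is usually just a matter of changing the method name. A minimal sketch (train_generator and val_generator are placeholder names here, assumed to be ImageDataGenerator flows or keras.utils.Sequence instances):

 # Deprecated: passing the generator through fit_generator()
 # model.fit_generator(train_generator, epochs=10, validation_data=val_generator)

 # tf.keras (TF 2.1+): pass the generator directly to fit()
 model.fit(train_generator, epochs=10, validation_data=val_generator)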

Wednesday, January 1, 2020

Fixing the KeyError: 'acc' and KeyError: 'val_acc' Errors in Keras 2.3.x

Have you been using the 'History' object returned by the fit() functions of Keras to graph or visualize the training history of your models? And have you been getting a 'KeyError' type error such as the following since a recent Keras upgrade, and wondering why?


Traceback (most recent call last):
  File "lenet_mnist_keras.py", line 163, in <module>
    graph_training_history(history)
  File "lenet_mnist_keras.py", line 87, in graph_training_history
    plt.plot(history.history['acc'])
KeyError: 'acc'

The KeyError: 'acc' when attempting to read the history object


Traceback (most recent call last):
  File "lenet_mnist_keras.py", line 163, in <module>
    graph_training_history(history)
  File "lenet_mnist_keras.py", line 88, in graph_training_history
    plt.plot(history.history['val_acc'])
KeyError: 'val_acc'

The KeyError: 'val_acc' when attempting to read the history object

Well, this is due to a breaking change introduced in Keras release 2.3.0.
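
In short, Keras 2.3.0 changed the metric keys in the History object from 'acc' and 'val_acc' to 'accuracy' and 'val_accuracy', so the plotting code needs to read the new keys. A minimal sketch (assuming the model was compiled with metrics=['accuracy']):

 import matplotlib.pyplot as plt

 # Keras 2.3.x stores the metric under its full name
 plt.plot(history.history['accuracy'])
 plt.plot(history.history['val_accuracy'])
 plt.title('Model accuracy')
 plt.legend(['train', 'validation'])
 plt.show()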

Thursday, August 23, 2018

Cleaning up your Anaconda installations

If you've been using Anaconda Python for a while, and been creating multiple environments and adding/removing packages, you may have noticed that it's starting to take up a lot of disk space (sometimes tens of GBs).

Anaconda installation can get big


One reason is that Anaconda environments are completely isolated workspaces from each other, with their own copy of Python. So, the more environments you have, the larger the space needed by Anaconda. But the other reason is that Anaconda keeps a cache of the package files, tarballs, etc. of the packages you've installed. This is great when you need to reinstall the same packages, but over time the space can add up.

So, how do we clean up this cache and regain some disk space?
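
The usual tool is conda's built-in clean command. A minimal sketch (run from a terminal or the Anaconda Prompt; the --dry-run flag lets you preview what would be removed before committing to it):

 # preview what would be removed
 conda clean --all --dry-run

 # remove the index cache, unused packages, and cached tarballs
 conda clean --all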

Tuesday, July 10, 2018

Failed Attempts at Building TensorFlow GPU from Source

For the last 3 weeks, I've been trying to build TensorFlow from source. I wanted to get TensorFlow GPU version working on Windows with CUDA 9.2 and cuDNN 7.1. Since the pre-built wheels only work with CUDA 9.0, the only way we can get it working with 9.2 is to build it ourselves from source.

The Windows build of TensorFlow is done using CMake. The official instructions are here: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/cmake

Unfortunately, as I found out after multiple attempts, the build process is not as simple as it sounds. Every attempt I have made to build it has failed so far.

But, I decided to post the steps I took - which didn't work - so that you may be able to use them as a reference if you decide to try it out yourselves. Again, note that these steps did not work.

First, I started with gathering all the dependencies to build on Windows 10:
  • Visual Studio 2015 Community Edition With Update 3 (14.0.25431.01) with C++
  • Anaconda Python 3.6.5
  • Git for Windows 2.18.0
  • Swigwin 3.0.12
  • CUDA Toolkit 9.2
  • cuDNN 7.1
  • CMake 3.11.3

Monday, May 14, 2018

Fixing the Matplotlib PyPlot import errors

About a week back, I was reinstalling Keras, TensorFlow and all the other libraries after a reformat of my PC. When I started verifying the library installations, I came across a strange error. When I tried to run a simple deep learning model, the Python runtime crashed. As soon as I executed the script, I was getting the "python.exe has stopped working" error message (I'm using Windows 10).

Matplotlib is an important part of my deep learning workflows
A little bit of debugging narrowed down the error to the following line in my script.

import matplotlib.pyplot as plt

(I was using matplotlib to graph the training history of the model. See "How to Graph Model Training History in Keras")

The error did not occur if I simply imported matplotlib. It only occurred when specifically importing the pyplot module.

import matplotlib
# no errors

import matplotlib.pyplot as plt
# crash!!!

This is a known issue due to some library conflicts in the installation, which should hopefully be fixed in a future release. Until then, if you're getting this error, you can fix it by following the steps below.

Wednesday, September 27, 2017

Migrating a Model to Keras 2.0

Keras v2.0 has been released for a couple of months now - v2.0.0 was released on 5th May, 2017, while the latest version is 2.0.8 at the time of this writing. It brought in a lot of new features and improvements, but also made some syntax changes. Trying to run code with the old syntax may result in anything from a flood of deprecation warnings to not being able to run the code at all. Since there are many code examples online which use the older syntax - including some older posts on Codes of Interest - it's better to know how to get such older-syntax models to work on the 2.0 API.

The complete list of changes in Keras v2.0 is extensive, but the following list will help you narrow down the majority of the changes.

The most prominent change is the renaming of the image_dim_ordering parameter to image_data_format, and of its associated values from "tf" and "th" to "channels_last" and "channels_first". We talked about this change in detail in our earlier post "What is the image_data_format parameter in Keras, and why is it important".

Likewise, in all the places where "dim_ordering" argument/parameter was used, it has been changed to "data_format".

All of the Convolution* layers have now been renamed to Conv*.
E.g. Convolution2D is renamed to Conv2D, as shown in the sketch below.
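
For example, a convolutional layer written against the Keras 1.x API would be updated roughly like this (a minimal sketch; the filter count, kernel size, and input shape are arbitrary example values):

 from keras.models import Sequential
 from keras.layers import Conv2D

 model = Sequential()

 # Keras 1.x syntax (deprecated):
 # model.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(28, 28, 1)))

 # Keras 2.x equivalent: the layer is renamed to Conv2D and the kernel size is a tuple
 model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))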

Saturday, September 9, 2017

What is the image_data_format parameter in Keras, and why is it important

We've talked about the image_dim_ordering parameter in Keras and why it is important. But since Keras v2 changed the name of the parameter, I thought of bringing this up again.

As you know, Keras is a higher-level neural networks library for Python, which is capable of running on top of TensorFlow, CNTK (Microsoft Cognitive Toolkit), or Theano (and with limited support for MXNet and Deeplearning4j), which Keras refers to as 'Backends'.

The 'image_data_format' parameter in the keras.json file
Which backend Keras should use is defined in the keras.json file, which is located at ~/.keras/keras.json in Linux and Mac OS, and at %USERPROFILE%\.keras\keras.json on Windows.

The default keras.json file (default set to TensorFlow) would look like this,
 {  
   "epsilon": 1e-07,  
   "floatx": "float32",  
   "image_data_format": "channels_last",  
   "backend": "tensorflow"  
 }  
The "backend" parameter should either be "tensorflow", "cntk", or "theano". When switching the backend, make sure to switch the "image_data_format" parameter too. For "tensorflow "or "cntk" backends, it should be “channels_last”. For “theano”, it should be “channels_first”.

Tuesday, May 9, 2017

image_data_format vs. image_dim_ordering in Keras v2

If you have been using Keras for some time, then you would probably know the image_dim_ordering parameter of Keras, especially if you switch between the TensorFlow and Theano backends frequently.

When I first started using Keras for image classification, most of my experiments failed because I had set the image_dim_ordering incorrectly. Learning from my mistakes, last year I did a post on what image_dim_ordering is and why it is important.

The keras.json file houses the configuration options for Keras


In short, image_dim_ordering instructed Keras to properly rearrange the image data structure when passing it to the backend: both TensorFlow and Theano expect 4D tensors of image data as input, but while TensorFlow expects the structure/shape to be (samples, rows, cols, channels), Theano expects it to be (samples, channels, rows, cols). So, setting image_dim_ordering to 'tf' made Keras use the TensorFlow ordering, while setting it to 'th' made it use the Theano ordering.
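
To make the difference concrete, here is a small NumPy sketch showing the same batch of images in the two orderings (the batch and image sizes are arbitrary example values):

 import numpy as np

 # TensorFlow ordering: (samples, rows, cols, channels)
 batch_tf = np.zeros((32, 64, 64, 3))

 # Theano ordering: (samples, channels, rows, cols)
 batch_th = np.transpose(batch_tf, (0, 3, 1, 2))
 print(batch_th.shape)  # (32, 3, 64, 64)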

At least, that's how it used to work.

But recently, if you have updated to the latest version of Keras, you might have run into issues with the dimension ordering, even if you're sure that you set the image_dim_ordering correctly.

You may have gotten errors like,
 ValueError: The shape of the input to "Flatten" is not fully defined (got (0, 7, 50). Make sure to pass a complete "input_shape" or "batch_input_shape" argument to the first layer in your model.

It may seem to you that Keras has started to ignore your image_dim_ordering setting.

And you're right.

Sunday, February 26, 2017

How to solve CNMEM_STATUS_OUT_OF_MEMORY error with Theano on CUDA

Have you come across the CNMEM_STATUS_OUT_OF_MEMORY error when using Theano with CUDA and Keras? You might have been trying to train a slightly larger model, and just when the training starts, it throws this error and fails.

The CNMEM_STATUS_OUT_OF_MEMORY thrown in Theano with CUDA

The full error stack looks something like this,

Tuesday, February 14, 2017

How to solve Scikit-learn Deprecation Warning on cross_validation

When using the Scikit-learn library and trying out various examples found over the web, have you come across a DeprecationWarning for the cross_validation module?

The DeprecationWarning on cross_validation
This most commonly happens when the code you're trying to run utilizes the train_test_split() function - a handy function used to quickly split the training and test datasets from a main dataset. The full warning message is something like this,

 C:\Users\Thimira\Anaconda3\envs\tensorflow12\lib\site-packages\sklearn\cross_validation.py:44: DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
   "This module will be removed in 0.20.", DeprecationWarning)

So, how to solve this?
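
The fix is to switch the import from the deprecated cross_validation module to model_selection, which provides the same train_test_split() function. A minimal sketch (assuming X and y hold your features and labels):

 # old, deprecated import (removed in scikit-learn 0.20):
 # from sklearn.cross_validation import train_test_split

 # new import:
 from sklearn.model_selection import train_test_split

 X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)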