Keras – 7. Introducing CIFAR-10 and the Convolutional and Pooling Layers

Welcome to CS With James

In this tutorial I will introduce the CIFAR-10 dataset and explain how we are going to use it while learning machine learning.

MNIST is a great dataset to start with: it is a simple classification task on grayscale images, which makes learning easier. However, with just a few Dense layers and the ReLU activation function, we already achieved more than 90% accuracy.

Therefore, we need something more complicated.

CIFAR-10 consists of 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. Each image is 32×32 pixels, only slightly larger than MNIST's 28×28, but the images are in color, so each one is actually 32x32x3.
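To make the size difference concrete, here is a small sketch that counts how many values one image from each dataset contains once it is flattened. The helper name `values_per_image` is made up for this illustration and is not part of Keras.

```python
def values_per_image(height, width, channels):
    """Number of values a Dense network receives once the image is flattened."""
    return height * width * channels

mnist_values = values_per_image(28, 28, 1)  # grayscale: 1 channel
cifar_values = values_per_image(32, 32, 3)  # RGB: 3 channels

print(mnist_values)  # 784
print(cifar_values)  # 3072
```

So a CIFAR-10 image carries roughly four times as many input values as an MNIST image, even though it is only 4 pixels wider and taller.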

CIFAR 10 Website

Code for CIFAR10 with Dense Layers

Pretty much everything is the same as before except the dataset.

After training the network, here is the result:

33.59% is a very bad result. It is better than the 10% that random guessing would give for 10 classes, but the network is basically not working.

Now we know that Dense layers alone struggle to classify complex color images, so let's introduce the convolutional and pooling layers.

A convolutional layer extracts features from the image, and a pooling layer reduces the size of the resulting output. We are going to use the Conv2D and MaxPooling2D layers, which work well with images.
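To give a feel for what "extracting a feature" means, here is a minimal pure-Python sketch of a single 3x3 filter sliding over a tiny grayscale image. This is only an illustration, not how Keras implements Conv2D: the function name `conv2d_valid` and the hand-written vertical-edge filter are my own assumptions, and there is no padding here. In a real Conv2D layer the filter values are learned during training.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and
    sum the element-wise products at every position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output

# A 4x6 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# A hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The output is 0 over the flat regions and 3 where the window covers the edge, so this one filter has turned the image into a map of "where is there a vertical edge".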

Code for Conv, Pooling Layer

Now we are going to feed the data in as an image, so we no longer flatten it, and that part of the code is removed. Instead, because this is a color image, the input has a channel dimension, which is 3 because this dataset uses RGB.

model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

This is the code that adds the convolutional and pooling layers. The two layers work well as a pair.

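Conv2D(32, (3, 3), padding='same', activation='relu')
32 is the number of filters, which is also the number of channels in the output
(3, 3) is the filter size
padding='same' keeps the output height and width the same as the input's
activation='relu' applies the ReLU activation function to the output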

For example, if the input is 32x32x3, the output of the Conv2D layer is 32x32x32 because there are 32 filters. From this point on the data is no longer an image; it is a stack of feature maps stored in a data structure called a tensor.
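The shape arithmetic above can be checked with a short sketch. The helper names `conv2d_output_shape` and `conv2d_param_count` are hypothetical, written only for this post, and the sketch assumes a stride of 1 and one bias per filter (the Keras defaults).

```python
def conv2d_output_shape(height, width, filters, kernel_size=3, padding="same"):
    """Output shape of a stride-1 convolution as (height, width, channels)."""
    if padding == "same":
        # The input is zero-padded so height and width are preserved.
        return (height, width, filters)
    if padding == "valid":
        # No padding: the filter must fit entirely inside the input.
        return (height - kernel_size + 1, width - kernel_size + 1, filters)
    raise ValueError("padding must be 'same' or 'valid'")

def conv2d_param_count(in_channels, filters, kernel_size=3):
    """Each filter has kernel_size * kernel_size * in_channels weights plus 1 bias."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

same_shape = conv2d_output_shape(32, 32, filters=32, padding="same")
valid_shape = conv2d_output_shape(32, 32, filters=32, padding="valid")
params = conv2d_param_count(in_channels=3, filters=32)

print(same_shape)   # (32, 32, 32)
print(valid_shape)  # (30, 30, 32)
print(params)       # 896
```

Without padding='same' the image would shrink by 2 pixels per 3x3 convolution, which is why 'same' is convenient when stacking several of these layers.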

model.add(MaxPooling2D(pool_size=(2, 2)))
pool_size=(2, 2) halves the height and width of its input

Because of the previous Conv2D layer, the input to this layer is 32x32x32. The pooling layer halves the height and width, so the output tensor is 16x16x32. The number of channels (one per filter) is not reduced.
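Here is a minimal pure-Python sketch of what 2x2 max pooling does to a single channel. The function name `max_pool_2x2` is my own, and it assumes non-overlapping 2x2 windows, which is what MaxPooling2D does with pool_size=(2, 2) and default strides. In the real layer the same operation is applied to each of the 32 channels independently, which is why the channel count stays at 32.

```python
def max_pool_2x2(feature_map):
    """Keep the largest value from each non-overlapping 2x2 window."""
    pooled = []
    for i in range(0, len(feature_map) - 1, 2):
        row = []
        for j in range(0, len(feature_map[0]) - 1, 2):
            window = [
                feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1],
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled

# One 4x4 channel of a feature map.
feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 1, 5, 6],
    [2, 2, 7, 8],
]

pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 2], [2, 8]]
```

The 4x4 input becomes 2x2, just as 32x32 becomes 16x16 in the network, while the strongest response in each small region is kept.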

Convolutional layers work well with images, but they require a lot of computational power, so training takes a long time on a CPU. If you have an Nvidia GPU, it is best to use it for the computation. I have posted a tutorial on how to install the tensorflow-gpu version on your Mac.

Here is the result of training with the convolutional layer:

The result is 66.85% accuracy, which is not perfect, but compared to 33.59% it is a big improvement.

