MNIST Handwritten Digit Classification
Project Title
MNIST Handwritten Digit Classification using
Convolutional Neural Networks
Presented by:
Raina Srivastava
(1712010072)
Under the guidance of
Rahul Chakravorty (Assistant Professor)
Department of Computer Science and Engineering
Institute of Technology and Management, Gida, Gorakhpur
Date: 01/11/2020
• In the CNN model, training and testing are performed by passing the input image through a
series of convolution layers with filters. These filters are also known as kernels.
• This is followed by pooling layers for feature extraction, the dense layers, and finally a
‘softmax’ function that classifies the object with probability values ranging from 0 to 1.
Convolutional Neural Networks
• Convolution Layer
The convolution layer is the first layer of the CNN architecture and starts extracting
features from the input. It operates on two inputs: the image matrix and the filter, or kernel.
The convolution layer preserves the relationship between the original image pixels
by learning the features of the image over small square patches.
In the figure, we have an image matrix of size 5x5 and a kernel of size 3x3, with image
values of 0 and 1. To obtain the ‘feature map’, the kernel slides over the image and, at each
position, the overlapping values are multiplied element-wise and summed.
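As a rough illustration of this operation (the image and kernel values below are placeholders, not the ones from the figure), each entry of the feature map is the sum of the element-wise product of the kernel with one 3x3 patch of the image:
import numpy as np

# A 5x5 binary image and a 3x3 kernel (illustrative values only)
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

# With no padding, a 5x5 image and a 3x3 kernel give a 3x3 feature map
feature_map = np.zeros((3, 3), dtype=int)
for i in range(3):
    for j in range(3):
        # multiply the 3x3 patch element-wise with the kernel and sum
        feature_map[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)
print(feature_map)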
Convolutional Neural Networks
Full padding – padding the image so that the output dimensions become larger than the
input, which is why it is less used in convolutional neural networks.
Same padding – padding so that the output size stays equal to the input size. It is quite
common in terms of application.
Valid padding – no padding at all; it produces an output with reduced dimensions.
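A minimal sketch of how ‘same’ and ‘valid’ padding affect the output size in Keras (the filter count of 32 here is arbitrary, chosen only for illustration):
from keras.models import Sequential
from keras.layers import Conv2D

# The same 3x3 convolution on a 28x28x1 input, with two padding modes
same_model = Sequential([Conv2D(32, kernel_size=3, padding='same', input_shape=(28, 28, 1))])
valid_model = Sequential([Conv2D(32, kernel_size=3, padding='valid', input_shape=(28, 28, 1))])
print(same_model.output_shape)   # (None, 28, 28, 32) - size preserved
print(valid_model.output_shape)  # (None, 26, 26, 32) - size reduced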
Convolutional Neural Networks
After running the entire set of iterations for a specified number of epochs, the model
becomes able to distinguish between the important and the trivial features of an image.
In the end, since this is a multiclass classification problem, we apply a “Softmax” function
to convert the output vector into a probability distribution with the same number of
probabilities as there are elements in the vector.
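A small sketch of what the softmax computation looks like in isolation (the score values below are made up for illustration):
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([1.2, 0.3, 4.5, 0.1, 0.0, 2.2, 0.7, 0.4, 1.1, 0.2])
probs = softmax(scores)
print(probs)        # ten probabilities, one per digit
print(probs.sum())  # they sum to 1.0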
Author Name: Ming Wu (2010)
Description: The goal is to train and test a set of classifiers for pattern analysis in solving
handwritten digit recognition problems, using the MNIST database. Direction features are
extracted for dimensionality reduction.
Parameters:
• Classification of the MNIST database for pattern analysis in solving the handwritten digit
recognition problem
Continue...
Author Name: Wan Zhu (2012)
Description: In this paper the experimental results show that, to a certain extent, the
backpropagation neural network can be used to solve classification problems in the real
world. However, the accuracy rate cannot be guaranteed and potential reasons are
discussed. Then we try to modify the structure of the original neural network into a CNN.
Parameters:
• Backpropagation neural network setup
• Evaluation methods
• Modifying the original method with an autoencoder

Author Name: Fathma Siddique (2013)
Description: The goal of this paper is to create a model that will be able to identify and
determine the handwritten digit from its image with better accuracy. We aim to achieve this
by using the concepts of CNN and the MNIST dataset. Through this work, the aim is to learn
and practically apply the concepts of CNN.
Parameters:
• Modeling of a Convolutional Neural Network to classify handwritten digits
Continue...
Author Name: Md. Anwar Hossain (2019)
Description: The goal of this paper is to create a model that will be able to identify and
determine the handwritten digit from its image with better accuracy. We aim to achieve this
by using the concepts of CNN and the MNIST dataset. Through this work, the aim is to learn
and practically apply the concepts of Convolutional Neural Networks.
Parameters:
• The architecture of the proposed model
• Explanation of the model
Methodology
• We have prepared the handwritten digit recognition model with Convolutional Neural Networks, using
the ‘TensorFlow’ module, version 2.3.0, and all the pre-processing was performed through ‘Keras’ version
2.4.3.
• At first, we import all the necessary modules required to build the model. Keras provides us
with many predefined datasets; the essential step is to import the dataset which we
have used in our model, i.e., the “mnist” dataset of handwritten digits.
import cv2
import numpy as np
import matplotlib.pyplot as plt          # used below to display sample digits
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras.utils import to_categorical
Methodology
• We segregate the imported dataset into training and test datasets for
evaluation.
(X_train, y_train), (X_test, y_test) = mnist.load_data()
• The dataset consists of a total of 70000 images, of which 60000 were used for training the
model and the remaining 10000 were kept as the test dataset.
• Then we display a few random images (digits between 0 and 9) from the dataset to get an idea of
how the images look. We use the following code snippet for this purpose:
x = np.random.randint(0, len(X_train))  # pick a random training image
plt.imshow(X_train[x], cmap="gray")
plt.show()
print(y_train[x])                       # print the corresponding label
• As we can make out, the displayed images are of the digits ‘5’ and ‘2’,
and print(y_train[x]) clearly gives the original labels.
Methodology
• Next, we need to make sure that we shape our input datasets so that there is complete
uniformity before training of the model (this step is sketched at the end of this slide). We have
already established that the training set has 60000 images and the test set 10000.
• We can also see the dimensions of each image as 28x28x1, which signifies that the images
are square, with the same height and width, and the 1 specifies that they are grayscale images.
• One-hot encoding – the next task is to convert the labels into a categorical format so that
it is easier to classify the digits. The final layer (dense layer) of the CNN model consists of
10 nodes, one for each of the digits ranging from 0 to 9. When the model is fed an
image, it returns the probabilities of the digits according to these nodes.
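A minimal sketch of the reshaping step described above, using the variable names from the earlier snippets (scaling the pixel values to the 0-1 range is an assumption, a common preprocessing choice not stated explicitly here):
# Add the channel dimension expected by Conv2D and cast to float
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype("float32") / 255
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype("float32") / 255
print(X_train.shape)  # (60000, 28, 28, 1)
print(X_test.shape)   # (10000, 28, 28, 1)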
Methodology
• The predicted output returned to the user corresponds to the node with the highest
probability. For example, if the first node has the highest probability then the output will be
‘0’; similarly, if the last node has the highest probability then the output will be ‘9’.
• For the one-hot encoded labels, since we have a total of ten possible outputs, the label
vector contains a ‘1’ at the position of the true digit and a ‘0’ everywhere else. For
example, a label of 3 is encoded as [0,0,0,1,0,0,0,0,0,0].
• To implement this, we have used the following code snippet:
from keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
Methodology
• The Keras model type which we have used to build the model is ‘Sequential’. It allows us to
build the model one layer after another, using the add() function to add the layers
one by one. ‘Sequential’ can be imported using
from keras.models import Sequential
• The first two layers of the model are convolution layers. Since the images to be dealt
with are in the form of 2D matrices, the layers are ‘Conv2D’ layers.
• The first layer has 32 nodes and the second layer has 64 nodes, chosen on the
basis of the model. The kernel size is set to 3, which means a 3x3 filter
matrix will be used for the convolution.
• The activation function which has been used is ReLU. This function returns 0 if the input
is negative and returns the input unchanged otherwise, i.e., ReLU(x) = max(0, x).
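A minimal sketch of a model matching this description; the exact position of the MaxPooling, Dropout, Flatten and Dense layers (all mentioned elsewhere in these slides) is an assumption:
model = Sequential()
model.add(Conv2D(32, kernel_size=3, activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, kernel_size=3, activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))   # 2x2 max pooling, as stated in the conclusion
model.add(Dropout(0.15))                    # dropout value taken from the training slide
model.add(Flatten())
model.add(Dropout(0.15))
model.add(Dense(10, activation='softmax'))  # one output node per digit 0-9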
Methodology
• The dense layer uses an activation function called “Softmax”, which produces the outputs in
the form of probabilities ranging from 0 to 1. While producing an output, the algorithm
therefore picks the class with the maximum probability for the input data and provides that
as the prediction.
• Then we compile the model with the parameter “optimizer” set to “adam”; it is a very good
optimiser as it combines ideas from momentum-based stochastic gradient descent and
RMSProp.
• The loss function has been taken as “categorical_crossentropy”, one of the most widely
used functions for classification problems. Metrics has been set to “accuracy” to monitor
the validation accuracy during training of the model.
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
Methodology
• Next comes the training of the model with all the characteristics that we have specified.
The model is trained on the X_train, y_train arrays and validated on the
X_test, y_test arrays.
• After multiple experiments with the model, the optimum number of epochs was found
to be 15, which has been specified in the model.fit() parameters.
• By varying the dropout values, we made sure that the model is not
overfitted: a run of 60 epochs with no dropout showed that the model was heavily
overfitted.
• Therefore both dropout layers were set to 0.15 and the results were
taken after 15 epochs.
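A minimal sketch of this training call; the batch size is an assumption, as it is not stated in the slides:
history = model.fit(X_train, y_train,
                    validation_data=(X_test, y_test),
                    epochs=15,       # optimum found experimentally, as described above
                    batch_size=128)  # assumed value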
Results
• The final validation accuracy we obtained is 0.9894, i.e., 98.94%. Now we will
predict some images and see how well our model is performing.
• We have used a combination of three outputs together. In the first case, we print all
the probabilities which the program has calculated. We have ten different classes of digits,
and the one which has the highest probability should be the correct prediction.
• In the second case, the probabilities are converted again: the value with the highest
probability is set to “1” while the other probabilities are set to “0”. And in the third
output, the corresponding image is shown.
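A minimal sketch of how these three outputs could be produced for one test image (the index i and the exact conversion used are assumptions):
i = 0                                       # index of a test image, illustrative only
probs = model.predict(X_test[i:i+1])[0]
print(probs)                                # case 1: the ten class probabilities
print((probs == probs.max()).astype(int))   # case 2: 1 at the highest probability, 0 elsewhere
plt.imshow(X_test[i].reshape(28, 28), cmap="gray")
plt.show()                                  # case 3: the corresponding image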
Results
We take single handwritten digits to check how well our model is performing. The images
were hand-drawn in Microsoft Paint and saved in PNG format. The image shown is of the
digit ‘3’.
Original Input
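A minimal sketch of how such a hand-drawn PNG could be fed to the model (the file name is a placeholder, and the resizing and colour inversion are assumptions made so the input matches the MNIST format of white digits on a black background):
img = cv2.imread("digit_three.png", cv2.IMREAD_GRAYSCALE)  # placeholder file name
img = cv2.resize(img, (28, 28))           # match the 28x28 MNIST resolution
img = 255 - img                           # invert: MNIST digits are white on black
img = img.reshape(1, 28, 28, 1).astype("float32") / 255
print("Predicted digit:", np.argmax(model.predict(img)))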
Conclusion
• The MNIST dataset is a widely used dataset for digit recognition in the field of data
science. We have used this dataset to train a deep learning model to recognize the ten
digits of the decimal number system.
• The dataset consists of 70000 grayscale images, each of size 28x28 pixels. These images were
reshaped to dimensions 28x28x1 and used as sample data.
• The deep learning model which has been used is a Convolutional Neural Network. We have
used a kernel size of 3 with MaxPooling of size 2x2 for feature extraction. We have
used the “softmax” function in the dense layer of the model to get the probabilities of the
predicted output.
• We have used handwritten digits of two types for testing the model. The initial
examples are digitized handwritten images prepared in Microsoft Paint, and then we
used a real handwritten image. In both cases the results were accurate and the model
predicted correctly.
References
1. Andrea Giuliodori, “Handwritten Digit Classification”, Departamento de Estadística,
Universidad Carlos III de Madrid, Working Paper 11-17, June 2011.
2. Cheng-Lin Liu, Kazuki Nakashima, Hiroshi Sako, Hiromichi Fujisawa, “Handwritten Digit Recognition:
Benchmarking of State-of-the-art Techniques”, Pattern Recognition (the journal of the Pattern
Recognition Society), 36:2271–2285, 2003.
3. Wan Zhu, “Classification of MNIST Handwritten Digit Database using Neural Network”, Australian
National University, Acton, ACT 2601, Australia, u6148896@anu.edu.au.
4. Md. Anwar Hossain, “Recognition of Handwritten Digit using Convolutional Neural Network (CNN)”,
Global Journal of Computer Science and Technology: D Neural & Artificial Intelligence, Volume 19,
Issue 2, Version 1.0, 2019, Online ISSN: 0975-4172, Print ISSN: 0975-4350.
5. A. B. Siddique, M. M. R. Khan, R. B. Arif, and Z. Ashrafi, “Study and Observation of the Variations of
Accuracies for Handwritten Digits Recognition with Various Hidden Layers and Epochs using Neural
Network Algorithm”, International Conference on Electrical Engineering and Information &
Communication Technology (iCEEiCT), IEEE, 2018, pp. 118–123.
Certificate