
Mnist Handwritten Digit Classification

This document presents a project on classifying handwritten digits from the MNIST dataset using convolutional neural networks. The presentation includes sections on introduction to machine learning and deep learning, convolutional neural networks, motivation for using the MNIST dataset, literature review of previous approaches, methodology, and conclusion. It will be presented by Raina Srivastava under the guidance of Rahul Chakravorty at the Institute of Technology and Management, Gida, Gorakhpur on November 1st, 2020.

Uploaded by

Raina Srivastav

A Presentation on

Project Title
MNIST Handwritten Digit Classification using
Convolution Neural Networks
Presented by
Raina Srivastava
(1712010072)
Under the guidance of
Rahul Chakravorty(Assistant Professor)
Department of Computer Science and Engineering
Institute of Technology and Management, Gida, Gorakhpur
Date: 01/11/2020

Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh, Lucknow


Content
1. Introduction
2. Goals
3. Motivation
4. Various approaches
5. Literature review
6. Identification of research gap and problem
7. Methodology
8. Conclusion
9. References
Introduction
• Machine Learning (ML) is the field of study that gives computers the capability to
learn without being explicitly programmed. Machine learning is actively being used
today, perhaps in many more places than one would expect.

• Deep learning is a subset of machine learning in which multi-layer neural networks,
loosely inspired by the workings of the human brain, process data and learn patterns
for use in decision making. Such networks can learn from labelled data, as in this
project, and can also learn from data that is unstructured or unlabelled. Deep
learning is also known as deep neural learning or deep neural networks.
Convolution Neural Networks
• A Convolution Neural Network (CNN) is a deep learning method which is broadly used for image
classification, image recognition, object detection, etc.
• In this project, we will see how CNNs can be used for image classification. The model
takes an image as input, processes it and classifies it under a certain category. An
image is a collection of pixels, described by its height, width and number of
colour channels.
• For example, an image with dimensions 6x6x3 is an RGB (red, green, blue) colour
image, while an image with dimensions 6x6x1 is a grayscale image.
Convolution Neural Networks
• In the figure, we have a 4x4x3 RGB image which has
been separated into its three color planes: Red, Green,
and Blue. The height and width of the image are both 4. In a
grayscale image, the number of color channels is reduced to 1.
• There are a number of such color spaces for images, such as
Grayscale, RGB, HSV, CMYK, etc.

• In the CNN model, an input image is passed, during both training and testing, through a
series of convolution layers with filters. These filters are also known as kernels.

• These are followed by pooling layers, which downsample the extracted feature maps, and
then by the dense layers. Finally, the ‘softmax’ function is used to classify the object,
giving each class a probability between 0 and 1.
Convolution Neural Networks
• Convolution Layer
➢ The convolution layer is the first layer of the CNN model architecture and starts to
extract features from the input. It operates on two inputs: the image matrix and the
filter, or kernel.
➢ The convolution layer preserves the spatial relationship between the original image
pixels by learning the features of the image from small square patches.

➢ In the figure, we have an image matrix of size 5x5 and a kernel of size
3x3. The image values are 0 and 1. To obtain a ‘feature map’, we slide the kernel over
the image and, at each position, multiply the overlapping values element by element and
sum the products.
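The feature-map computation can be sketched in a few lines of NumPy. The 5x5 image and 3x3 kernel values below are hypothetical (they are not taken from the slide's figure), and the helper name `feature_map` is an assumption made for illustration. The sketch uses stride 1 and no padding.

```python
import numpy as np

def feature_map(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1), dtype=image.dtype)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            patch = image[r:r + kh, c:c + kw]   # region currently under the kernel
            out[r, c] = np.sum(patch * kernel)  # element-wise multiply, then sum
    return out

# Hypothetical binary image and kernel, chosen only for illustration.
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

fmap = feature_map(image, kernel)
print(fmap)  # 3x3 feature map: [[4 3 4] [2 4 3] [2 3 4]]
```

A 5x5 image and a 3x3 kernel give a 3x3 feature map, because the kernel fits in three positions along each axis.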
Convolution Neural Networks
➢ Full padding – the image is padded so that the output has larger dimensions than the
input, which is why it is less used in convolution neural networks.
➢ Same padding – the image is padded so that the output size is kept equal to the input
size. It is quite common in practice.
➢ Valid padding – no padding is applied, so the output has smaller dimensions than the input.
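The effect of the three padding modes on the output size can be checked with a small helper. This is a sketch that assumes stride 1 and an odd kernel size; the function name `conv_output_size` is hypothetical.

```python
def conv_output_size(n, k, padding):
    """Output length along one axis for input size n and kernel size k (stride 1)."""
    if padding == "valid":
        pad = 0                # no padding: output shrinks
    elif padding == "same":
        pad = (k - 1) // 2     # assumes odd k: output equals input
    elif padding == "full":
        pad = k - 1            # every partial overlap counted: output grows
    else:
        raise ValueError(f"unknown padding mode: {padding}")
    return n + 2 * pad - k + 1

for mode in ("valid", "same", "full"):
    print(mode, conv_output_size(5, 3, mode))  # valid 3, same 5, full 7
```

For a 28x28 MNIST image and a 3x3 kernel, the same helper gives 26 for valid padding and 28 for same padding.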
Convolution Neural Networks
➢ After training for the specified number of epochs, the model becomes able to
distinguish between the important and the trivial features of an image. In the end,
since this is a multiclass classification problem, we apply a “Softmax” function to
convert the vector of values into a probability distribution consisting of the same
number of probabilities as there are elements in the vector.
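A minimal NumPy sketch of the softmax function follows. The input scores are hypothetical, and subtracting the maximum before exponentiating is a common numerical-stability step rather than something stated in the slides.

```python
import numpy as np

def softmax(z):
    """Convert a vector of raw scores into a probability distribution."""
    z = np.asarray(z, dtype=float)
    shifted = z - np.max(z)   # avoids overflow; does not change the result
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([1.0, 2.0, 0.5])  # hypothetical raw outputs for three classes
probs = softmax(scores)
print(probs, probs.sum())           # one probability per score; they sum to 1
```

The output has the same number of entries as the input, each between 0 and 1, and the largest score receives the largest probability.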

Architecture of a CNN model


Motivation
□ The Modified National Institute of Standards and Technology (MNIST)
dataset is a large set of 70,000 images of handwritten digits. This dataset
was created by modifying the original NIST dataset. In the original NIST
dataset, the training set contained digits handwritten by American Census
Bureau employees, while the testing set contained digits written by high
school students. Because the two sets came from different groups of writers,
the dataset was poorly suited to machine learning experiments. The dataset
was therefore modified so that both the training set and the testing set
contain digits written by Census Bureau employees and by high school students.
Literature Review
Author Name: Andrea Giuliodori (2011)
Description: The purpose of this paper is to present alternative classification methods based on statistical techniques. We show a comparison between a multivariate and a probabilistic approach, concluding that both methods provide similar results in terms of test-error rate.
Parameters:
• Binarization of database
• Construction of variable for classification
• Probabilistic Approach

Author Name: Ming Wu (2010)
Description: The goal is to train and test a set of classifiers for pattern analysis in solving handwritten digit recognition problems, using MNIST database. Direction features are extracted for dimensionality reduction.
Parameters:
• Classification of MNIST database for pattern analysis in solving the handwritten digit recognition problem

Author Name: Wan Zhu (2012)
Description: In this paper the experimental results show that to a certain extent, the back-propagation neural network can be used to solve classification problem in the real world. However, the accuracy rate cannot be guaranteed and potential reasons are discussed. Then we try to modify the structure of the original neural network into a CNN.
Parameters:
• Backpropagation neural network setting up
• Evaluation Methods
• Modify original method with autoencoder

Author Name: Fathma Siddique (2013)
Description: The goal of this paper work will be to create a model that will be able to identify and determine the handwritten digit from its image with better accuracy. We aim to complete this by using the concepts of CNN and MNIST dataset. Through this work, the aim to learn and practically apply the concepts of CNN.
Parameters:
• Modeling of Convolutional Neural Network to classify HANDWRITTEN DIGITS

Author Name: Md. Anwar Hossain (2019)
Description: The goal of this paper work will be to create a model that will be able to identify and determine the handwritten digit from its image with better accuracy. We aim to complete this by using the concepts of CNN and MNIST dataset. Through this work, the aim to learn and practically apply the concepts of Convolutional Neural Networks.
Parameters:
• The Architecture of the proposed model
• Explanation of the Model
Methodology
• We have prepared the handwritten digit recognition model using Convolution Neural Networks, with the
‘TensorFlow’ module, version 2.3.0; all the pre-processing was performed through ‘Keras’, version
2.4.3.

• First, we import all the modules required to build the model. Keras provides many predefined
datasets; the one used in our model is the “mnist” dataset of handwritten digits.

import cv2
import matplotlib.pyplot as plt  # used later as plt to display sample digits
import numpy as np
from keras.datasets import mnist
# layer types used by the model (convolution, pooling, dropout, dense)
from keras.layers import Conv2D, Dense, Dropout, Flatten, MaxPooling2D
from keras.models import Sequential
from keras.utils import to_categorical
Methodology
• We load the imported dataset as separate training and test datasets for
evaluation.
(X_train, y_train), (X_test, y_test) = mnist.load_data()

• The dataset consists of 70,000 images in total: 60,000 images are used for training the model and the
remaining 10,000 images are kept as the test dataset.
• We then display a few randomly chosen images from the training set to get an idea of how the digits
look. We use the following code snippet, where x is the index of the image to show:
plt.imshow(X_train[x], cmap="gray")
plt.show()
• As we can make out, the images shown are of the digits ‘5’ and ‘2’,
and print(y_train[x]) gives the corresponding labels.
Methodology
• Next, we reshape the input datasets so that they are completely uniform before the model is
trained. The training set contains 60,000 images and the test set contains 10,000.
• Each image has dimensions 28x28x1: the images are square, with the same height and width,
and the 1 specifies that they are grayscale images with a single channel.

X_train = X_train.reshape(60000, 28, 28, 1)
X_test = X_test.reshape(10000, 28, 28, 1)

• One-hot encoding – the next task is to convert the labels into categorical format so that
it is easier to classify the digits. The final layer (dense layer) of the CNN model consists of
10 nodes, one for each of the digits 0 to 9. When the model is fed an image, it returns a
probability for each of these nodes.
Methodology
• The predicted output returned to the user is the digit whose node has the highest
probability. For example, if the first node has the highest probability then the output
will be ‘0’; similarly, if the last node has the highest probability then the output will
be ‘9’.

• In the one-hot encoded labels, since we have a total of ten possible outputs, the
position of the correct digit is set to ‘1’ and the rest to ‘0’. For example, a label
of 3 is encoded as [0,0,0,1,0,0,0,0,0,0].

• To implement this, we have used to_categorical, imported with from keras.utils import
to_categorical:

y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
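What one-hot encoding does to the labels can be illustrated with plain NumPy. This is a sketch of the idea, not the Keras implementation; the helper name `one_hot` and the sample labels are hypothetical.

```python
import numpy as np

def one_hot(labels, num_classes=10):
    """Return one row per label, with a 1 at the label's index and 0 elsewhere."""
    labels = np.asarray(labels)
    return np.eye(num_classes, dtype=int)[labels]

encoded = one_hot([3, 0, 9])     # hypothetical labels
print(encoded[0])                # [0 0 0 1 0 0 0 0 0 0]

# Taking the index of the 1 in each row recovers the original digits.
decoded = encoded.argmax(axis=1)
print(decoded)                   # [3 0 9]
```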
Methodology
• The Keras model type which we have used to build the model is ‘Sequential’. It helps to
build the model, one layer after the other, by using the add() function for adding the layers
one by one. ‘Sequential’ can be imported by using
from keras.models import Sequential

• The first two layers of the model are convolution layers. Since the images are in the
form of 2D matrices, the layers are ‘Conv2D’ layers.
• The first layer has 32 filters and the second layer has 64 filters, values chosen for
this model. The kernel size is set to 3, which means a 3x3 filter matrix is used for the
convolution.
• The activation function used is ReLU. This function returns 0 if the input is negative
and returns the input unchanged otherwise, i.e., ReLU(input) = max(0, input).
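ReLU applied element-wise can be written in one line of NumPy. The input values below are hypothetical.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x): negatives become 0, other values pass through."""
    return np.maximum(0, x)

values = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])  # hypothetical pre-activation values
activated = relu(values)
print(activated)
```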
Methodology
• The dense layer uses the “Softmax” activation function, which gives the outputs in the
form of probabilities ranging from 0 to 1. The digit with the maximum probability is
then returned as the prediction for the input image.
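The slides do not list the exact layer order, so the following is only a sketch of how the tensor shape and parameter count would evolve under one plausible arrangement: two 3x3 convolutions with 32 and 64 filters, a 2x2 max-pooling step (mentioned in the Conclusion), then flattening into a 10-node dense layer. Valid padding, stride 1 and this ordering are assumptions, and the helper names are hypothetical.

```python
def conv_shape(shape, filters, kernel=3):
    """Output shape of a valid-padding, stride-1 convolution."""
    h, w, _ = shape
    return (h - kernel + 1, w - kernel + 1, filters)

def conv_params(in_channels, filters, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def pool_shape(shape, size=2):
    """Output shape of non-overlapping max pooling."""
    h, w, c = shape
    return (h // size, w // size, c)

inputs = (28, 28, 1)
conv1 = conv_shape(inputs, 32)             # (26, 26, 32)
conv2 = conv_shape(conv1, 64)              # (24, 24, 64)
pooled = pool_shape(conv2)                 # (12, 12, 64)
flat = pooled[0] * pooled[1] * pooled[2]   # 9216 values fed to the dense layer

params = {
    "conv1": conv_params(1, 32),           # 320
    "conv2": conv_params(32, 64),          # 18496
    "dense": flat * 10 + 10,               # 92170
}
print(conv1, conv2, pooled, flat, params)
```

Under these assumptions most of the parameters sit in the final dense layer, not in the convolution layers.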

• Then we compile the model with the “optimizer” parameter set to “adam”, which
combines ideas from stochastic gradient descent with momentum and from RMSProp.

• The loss function is “categorical_crossentropy”, which is widely used for multiclass
classification problems. The metric is “accuracy”, so that the validation accuracy can
be checked during training of the model.

model.compile(optimizer='adam',loss='categorical_crossentropy',
metrics=['accuracy'])
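For a single image, the categorical cross-entropy loss reduces to the negative log of the probability assigned to the correct digit. The NumPy sketch below illustrates the formula only; it is not the Keras implementation, and the label and predicted probabilities are hypothetical.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    y_pred = np.clip(y_pred, eps, 1.0)   # guard against log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

# Hypothetical example: the true digit is 3 and the model gives it probability 0.91.
y_true = np.array([0, 0, 0, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([0.01] * 3 + [0.91] + [0.01] * 6)

loss = categorical_crossentropy(y_true, y_pred)
print(loss)   # equals -log(0.91)
```

The loss approaches 0 as the probability of the correct digit approaches 1, and grows as that probability falls.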
Methodology
• Next comes the training of the model with all the settings that we have specified.
The model is trained on the X_train, y_train arrays and validated on the
X_test, y_test arrays.

• After several experiments with the model, the best number of epochs was found
to be 15, which is passed as the epochs argument of model.fit().

model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=15)

• By changing the dropout layer values, we made sure that the model is not
overfitted. A run of 60 epochs with no dropout showed that the model was highly
overfitted.

• Therefore both the dropout layers were set to 0.15 and the results were
taken after 15 epochs.
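What a dropout rate of 0.15 does during training can be sketched with NumPy. The sketch uses the "inverted dropout" formulation, in which surviving activations are scaled by 1/(1 - rate). The function name, the random seed and the all-ones input are assumptions made for illustration.

```python
import numpy as np

def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the survivors."""
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)

rng = np.random.default_rng(0)   # fixed seed so the sketch is repeatable
acts = np.ones(1000)             # hypothetical layer activations
dropped = dropout(acts, 0.15, rng)

print(np.mean(dropped == 0))     # fraction of units dropped, close to 0.15
```

Because a different random subset of units is silenced at each step, the network cannot rely on any single unit, which is what reduces overfitting.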
Results
• The final validation accuracy which we have obtained is 0.9894, i.e., 98.94%. Now we
predict some images to see how well our model is performing.
• We show a combination of three outputs together. The first output prints all the
probabilities which the program has calculated. There are ten different classes of digits,
and the one which has the highest probability should be the correct prediction.
• In the second output, the probabilities are converted to a one-hot vector: the highest
probability is set to “1” while the other probabilities are set to “0”. The third
output shows the corresponding image.
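The first two outputs can be reproduced from a single probability vector as follows. The probabilities are hypothetical stand-ins for what the model would return for one image.

```python
import numpy as np

# Hypothetical softmax output for one image (ten probabilities, one per digit).
probs = np.array([0.01, 0.0, 0.02, 0.9, 0.0, 0.03, 0.0, 0.02, 0.01, 0.01])

predicted = int(np.argmax(probs))    # index of the highest probability

# Second output: 1 at the predicted digit, 0 everywhere else.
as_one_hot = np.zeros(probs.shape, dtype=int)
as_one_hot[predicted] = 1

print(probs)
print(as_one_hot, "-> predicted digit:", predicted)
```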
Results
We take single handwritten digits to check how well our model is performing. The images
were hand-drawn in Microsoft Paint and saved in PNG format. The image shown is of the
digit “3”.

Original Input Contour identification

• “Softmax” probability values of the image, where we can see that the probability
value corresponding to “3” is the maximum.
Results
• Now we take multiple inputs: the new image consists of 4 handwritten digits.
• We took the sample picture in JPEG format with the digits 0, 5, 2, 2.

Original Input
Conclusion
• The MNIST dataset is a widely used dataset for digit recognition in the field of data
science. We have used this dataset to train a deep learning model to recognize the ten
different digits of the number system.

• The dataset consists of 70,000 grayscale images, each of size 28x28 pixels. These images
were reshaped to dimensions 28x28x1 and used as sample data.

• The deep learning model which has been used is a Convolution Neural Network. We have
used a kernel size of 3 with MaxPooling of size 2x2 for extracting the features. We have
used the “softmax” function in the dense layer of the model to get the probabilities of the
predicted output.

• We have used handwritten digits of two types for testing the model. The initial
examples are digital handwritten images prepared in Microsoft Paint; after that we
used a real handwritten image. In both cases, the model predicted the digits
correctly.
References
1. Andrea Giuliodori, “Handwritten Digit Classification”, Working Paper 11-17, Departamento de
Estadística, Universidad Carlos III de Madrid, June 2011.

2. Cheng-Lin Liu, Kazuki Nakashima, Hiroshi Sako, and Hiromichi Fujisawa, “Handwritten Digit
Recognition: Benchmarking of State-of-the-Art Techniques”, Pattern Recognition (The Journal of the
Pattern Recognition Society), 36:2271–2285, 2003.

3. Wan Zhu, “Classification of MNIST Handwritten Digit Database using Neural Network”, Australian
National University, Acton, ACT 2601, Australia, u6148896@anu.edu.au.

4. Md. Anwar Hossain, “Recognition of Handwritten Digit using Convolutional Neural Network (CNN)”,
Global Journal of Computer Science and Technology: D Neural & Artificial Intelligence, Volume 19,
Issue 2, Version 1.0, 2019, ISSN: 0975-4172, Print ISSN: 0975-4350.

5. A. B. Siddique, M. M. R. Khan, R. B. Arif, and Z. Ashrafi, “Study and Observation of the Variations of
Accuracies for Handwritten Digits Recognition with Various Hidden Layers and Epochs using Neural
Network Algorithm”, International Conference on Electrical Engineering and Information &
Communication Technology (iCEEiCT), IEEE, 2018, pp. 118-123.
Certificate
