JOURNAL OF CRITICAL REVIEWS

ISSN- 2394-5125 VOL 7, ISSUE 18, 2020

DEEP LEARNING BASED ACCURATE HAND GESTURE RECOGNITION USING ENHANCED CNN MODEL

Hitesh Kumar Sharma1, Punit Kumar2, Prashant Ahlawat3, Nishank Parashar4, Rakesh Soni5, Yash Manchanda6
1,4,5,6 School of Computer Science, University of Petroleum & Energy Studies (UPES), Energy Acres, Bidholi, Dehradun- 248007, Uttarakhand, India
2,3 Dept. of Computer Science, Manipal University Jaipur

Received: 16 March 2020 Revised and Accepted: 16 June 2020

Abstract
In the 21st century, Machine Learning (ML) is one of the most powerful tools for solving a variety of real-life problems. It is useful for differently abled people as well as for the general population. ML is commonly used for tasks such as regression, prediction, detection and classification, and it can automate a process by learning from given data. The core idea of ML is to use data to build a learned model that can produce a result for action: a correct answer for a new input, or a prediction consistent with known data. With the power of Deep Learning, a subset of ML, it has become possible to learn human hand gestures from an image dataset and to identify the actions performed by a hand in real time. Advances in hand gesture recognition open the door to automatic control systems in various sectors. Because of the complexity and noise present in hand posture images, there is still a need for improvement. A deep learning based CNN model helps researchers reduce these issues and improve the image classification mechanism.
In this research work, we propose a deep learning based CNN model, with some improvements over the basic CNN, for hand gesture recognition. The proposed model produced accurate results and recognized real-time hand gestures with 99.7% accuracy.

Keywords: Convolutional Neural Network (CNN), Differently Abled, Machine Learning, Deep Learning,
HCI, Confusion Matrix.

1. Introduction
Hand gesture recognition is a challenging field in Human Computer Interaction (HCI) and computer vision. It is an enabler in many sectors and helps make them controllable from remote places. However, the diversity in hand shape, size, capturing angle, lighting conditions and background distortion makes recognition difficult. Healthcare, gaming and animation, automobiles and smart classrooms are some of the core technical sectors in which an accurate hand gesture recognition system can automate tasks and eliminate bottlenecks to enhance performance. One challenging area is providing an easy-to-use system that lets differently abled people access computing devices as any other person can, so that these users can not only interact with modern computers but also express their thoughts to other people. We aim to propose a deep learning based model that can be used in these core sectors for better performance. Past implementations have limited functionality and require additional hardware components: gloves and sensors are used to implement these complex and expensive techniques. Another limitation of previous methods is the restriction on background, since a specific background is used to avoid noise in the captured images. The execution platform was also limited to GPUs, which not all users can afford. Together, these limitations widen the gap, noted above, between differently abled people and modern computing technology.
The core objective of this work is to implement a low-complexity system with very few restrictions. It should work on limited computing resources such as a CPU. The solution is implemented in Python using the TensorFlow Keras library.

2. Literature Review
Human Computer Interaction is an active field in which researchers help people, especially differently abled persons, access computing systems as any other person does. Recognition of hand gestures plays an important role in this goal, in nonverbal communication and in natural human-computer interaction (HCI). It is a very active field of research in Artificial Intelligence, Machine Learning and computer vision [2]. Deep Learning has shown that neural networks such as CNNs perform better than traditional artificial neural networks (ANNs). Adding further deep layers to the network architecture has improved performance on image datasets such as the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) [4], [9]. A 3D Convolutional Neural Network (3D-CNN) was proposed [7] for recognizing drivers' hand gestures on the VIVA dataset. It was designed to recognize hand gestures against complex backgrounds with varying illumination and other sources of noise. The authors reported that the 3D-CNN achieves 77.5% correct classification against complex backgrounds. A Six-CNN model was proposed [3] to recognize three classes of hand gesture: 1. Close, 2. Open and 3. Unknown. Six different architectures were implemented, varying in depth and hyperparameters, to study their behavior. The accuracy claimed by the authors was 73.7% for classification. Kim et al. [5] in 2018 proposed a combined CNN and WFMM (Weighted Fuzzy Min-Max) model for dynamic hand gesture recognition. They claimed that this model minimizes the impact of temporal and spatial variation in the feature set. An adapted CNN architecture, ADCNN, was proposed [6] in 2018 for hand gesture recognition. The authors claimed that ADCNN achieves an accuracy of 97.73%, compared with 95.73% for a baseline CNN.

3. Proposed CNN model for Hand Gesture Recognition

In this section we describe the basic architecture of a CNN, the dataset used and the architecture of the improved CNN model.

3.1. Convolutional Neural Network (CNN)

We have developed a simple deep learning model based on a Convolutional Neural Network (CNN). For simplicity, the concept of linear regression is used, so that the model can be represented by the following equation.

y = a*x + b ….. (1)

Here a and b (the slope and intercept) are the two parameters that we are trying to find. Once these two parameters are known, we can predict the value of y for any value of x. The same idea, with more complex calculations across multiple hidden layers, is applied in the CNN based model for the hand gesture recognition problem. A ConvNet/CNN (Convolutional Neural Network) is a Deep Learning algorithm that takes an image as input and assigns learnable weights and biases to various features of that image. Based on the input images and the learned parameter values, the CNN is able to differentiate one gesture from another. Compared with other ML classification algorithms, a CNN is faster and requires the least data pre-processing.
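As a small illustration of equation (1), the sketch below finds the slope a and intercept b by ordinary least squares and then predicts y for a new x. The data points and the helper names fit_line and predict are illustrative assumptions, not part of the paper's implementation.

```python
def fit_line(xs, ys):
    """Return (a, b) minimizing the squared error of y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for one input variable.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def predict(a, b, x):
    """Apply equation (1) to a new input value."""
    return a * x + b


# Illustrative points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
a, b = fit_line(xs, ys)
print(a, b, predict(a, b, 4.0))
```

A CNN learns far more parameters than these two, but the principle is the same: parameters are chosen so that predictions match the training data.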

A very basic example of a simple multi-layer convolutional neural network model is shown in figure 1. It contains an input layer, an output layer and two hidden layers. Deep hidden layers are used for extracting and learning the features of a given dataset. As we go deeper into the layers, more features are extracted by the model. CNNs are among the most suitable models for image classification. A CNN applies a series of filters to the raw pixel data of an image to extract and learn various features. Learning from these features helps the CNN classify the images.
The layers in a CNN model fall into three main categories:

• Convolutional Layer
• Pooling Layer
• Dense or Fully connected Layer

Figure 1 – Hidden layers in Deep Learning based CNN Model.

Convolution layers apply specified filters, also known as kernels, to an image. At each position of the filter, a set of mathematical operations is performed to produce a single value of the feature map. ReLU is the most popular activation function used in this layer. The pooling layer is used to downsample the feature map produced by the convolution layer. It reduces the dimensions in order to minimize processing time. Max-pooling is the most commonly used algorithm in this layer. The dense or fully connected layer is finally used for classification. The multi-dimensional array is flattened to a 1-D array, and one output is produced for each class required.
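The pooling and flattening steps described above can be sketched in a few lines. This is a minimal illustration on a hand-written 4x4 feature map, assuming a 2x2 window with stride 2 and no padding; the function names are hypothetical.

```python
def max_pool_2x2(feature_map):
    """Downsample a 2-D feature map by keeping the maximum of each 2x2 block."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1])
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


def flatten(feature_map):
    """Turn a 2-D feature map into the 1-D list a dense layer expects."""
    return [value for row in feature_map for value in row]


# Illustrative 4x4 feature map.
fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 3],
]
pooled = max_pool_2x2(fmap)  # 2x2 result: [[4, 5], [6, 7]]
flat = flatten(pooled)       # [4, 5, 6, 7]
print(pooled, flat)
```

Each pooling step halves both spatial dimensions, so the dense layer receives a quarter as many values per feature map.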

The convolution layer uses a matrix multiplication operation.

A filter matrix is applied over the image matrix to find features in this layer. A brief description is given below.

• An image matrix (volume) of dimension (h x w x d)
• A filter of dimension (fh x fw x d)
• Outputs a volume of dimension (h – fh + 1) x (w – fw + 1) x 1
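The output dimension listed above can be checked with a naive implementation. The sketch below assumes a single-channel image (d = 1), stride 1 and no padding, and applies the filter without flipping it, as deep learning libraries commonly do. The image and kernel values are illustrative only.

```python
def conv_output_shape(h, w, fh, fw):
    """Output volume of one filter: (h - fh + 1) x (w - fw + 1) x 1."""
    return (h - fh + 1, w - fw + 1, 1)


def convolve_valid(image, kernel):
    """Slide the kernel over the image and sum the element-wise products."""
    h, w = len(image), len(image[0])
    fh, fw = len(kernel), len(kernel[0])
    out = []
    for r in range(h - fh + 1):
        row = []
        for c in range(w - fw + 1):
            total = 0
            for i in range(fh):
                for j in range(fw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)  # one value of the feature map per position
        out.append(row)
    return out


image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
]
kernel = [
    [1, 0],
    [0, 1],
]
feature_map = convolve_valid(image, kernel)
print(feature_map)                         # 2 rows x 3 columns
print(conv_output_shape(120, 320, 5, 5))   # 5x5 filter on a 120 x 320 input
```

For a 120 x 320 input and a 5x5 filter this gives 116 x 316, which matches the spatial size of the first convolution layer in the model summary of Section 4.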

3.2. Hand Gesture Recognition Dataset Specifications

For this work, we have used the Hand Gesture Recognition Database provided by T. Mantecón, C.R. del Blanco, F. Jaureguizar and N. García in their paper [1]. This dataset is available on Kaggle. It consists of 20000 images of different hand gestures performed by different people: 10 hand gestures from each of 10 people. Each person is treated as a subject in this dataset; there are 5 male and 5 female subjects. The images were captured using the Leap Motion hand tracking device.

Table 1 - Classification used for every hand gesture.

Hand Gesture         Label Used
Thumb down           00
Palm (Horizontal)    01
L                    02
Fist (Horizontal)    03
Fist (Vertical)      04
Thumbs up            05
Index                06
Ok                   07
Palm (Vertical)      08
C                    09

Due to limited computation power, we have used the set of 2000 images covering the 10 hand gestures of one person, instead of all 10 persons. The same algorithm can be applied to the other 9 persons if the model is trained on a high-performance computing platform. Using 1/10 of the complete dataset, 2000 images were sufficient to achieve 97.5-99.5% accuracy.

3.3. Conventional CNN Model Architecture

The layered configuration of a conventional CNN is given in Table 2. The conventional CNN design contains two Conv2D layers, two Max-Pooling layers and two Dense layers [6]. Three dropout layers are also added in between, as listed in Table 2.

Table 2 Layer Configuration of Conventional CNN [ref. 6]

Basic CNN Model Configuration

Layers               Layer Shape Details
optimizer            adam
loss                 sparse_categorical_crossentropy
metrics              ['accuracy']
Conv2D Layer         32 filters, 5x5 Filter size and ReLU Activation
Max-Pooling Layer    2x2 kernel size
Dropout              20%
Conv2D Layer         32 filters, 3x3 Filter Size and ReLU Activation
Max-Pooling Layer    2x2 kernel
Dropout              20%
Flatten layer        800 Neurons
Dense Layer          128 Neurons
Dropout              20%
Dense Layer          64 Neurons
Output layer         Softmax 6 classes

3.4. Proposed CNN Model Architecture

The flow diagram and layered configuration of the proposed architecture are shown in Fig 2 and Table 3. Compared with the basic CNN, it has different numbers of filters and no dropout layers.

Figure 2: Proposed CNN model for Hand Gesture Recognition

As shown in figure 2, images are pre-processed before being fed into the model for training. Each image is first converted from the RGB color space to grey level. Its dimensions are then reduced from 640*240 to 320*120 for faster processing. Finally, the pre-processed image dataset is passed to the network, which learns features and classifies the images into the 10 classes provided in the data. The model accuracy is checked using 30% of the data as test data, and an accuracy of up to 99.5% was found.
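The two pre-processing steps can be sketched as follows. The paper does not state which grey-level weights or which resampling method were used, so the luminance weights (0.299, 0.587, 0.114) and the 2x2 block averaging below are assumptions chosen for illustration, and the input frame is synthetic.

```python
def rgb_to_grey(image):
    """Convert rows of (r, g, b) tuples to grey levels (assumed luminance weights)."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in image
    ]


def halve(image):
    """Halve width and height by averaging each 2x2 block (assumed method)."""
    rows, cols = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]


# Synthetic 640*240 RGB frame standing in for one dataset image.
WIDTH, HEIGHT = 640, 240
frame = [[(x % 256, y % 256, 128) for x in range(WIDTH)] for y in range(HEIGHT)]

grey = rgb_to_grey(frame)
small = halve(grey)
print(len(small[0]), len(small))  # width and height after pre-processing
```

In practice an image library would perform both steps; the point here is only that the result has 320 columns and 120 rows with a single channel.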

Our CNN model is a sequential model that consists of several categories of layers with different functions. The first layer, closest to the input, is a Conv2D layer, which performs 2-D convolution and extracts some basic features of the image. The 2-D convolution operator is a mathematical operation whose weights are learned so as to give the best feature extraction. Layers early in the network learn fewer filters, while layers deeper in the network learn more filters and find more features in the input image.
Tuning the number of filters is necessary for better classification. The common practice is to use filter counts that are powers of 2, commonly 32, 64, 128, 256, 512 and 1024. The other parameter required by a Conv2D layer is the filter size. Filters of dimension N*N are used, where N is an odd number. Commonly used filter sizes are 1*1, 3*3, 5*5 and 7*7. For an image size of 128*128, a filter size of 3*3 or greater is recommended. High-dimension filters such as 7*7 are rarely used.
Another category of layer in the model is the Max Pooling layer, which is used for reducing the dimensions of the image. Reducing the dimensions improves performance by reducing processing time. Typically, as the number of filters increases in deeper layers, the dimensions are reduced and performance is enhanced. More features are extracted by increasing the number of filters in successive layers.
Finally, the ReLU (Rectified Linear Unit) activation function is applied to the Conv2D layer output. It is piecewise linear: it outputs 0 for negative inputs and passes positive inputs through unchanged. ReLU uses a simple mathematical function, which reduces the time needed to train and test the model. ReLU also mitigates the vanishing gradient problem, to which other activation functions such as sigmoid or tanh are generally prone.
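To make the comparison with sigmoid concrete, the sketch below evaluates ReLU, its gradient and the sigmoid gradient at a few sample inputs. The sample values are arbitrary and the function names are illustrative.

```python
import math


def relu(x):
    """Return 0 for negative inputs and the input itself otherwise."""
    return max(0.0, x)


def relu_grad(x):
    """Gradient of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Gradient of sigmoid; it shrinks toward 0 as |x| grows."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-5.0, 0.5, 5.0):
    print(x, relu(x), relu_grad(x), round(sigmoid_grad(x), 4))
```

For large positive inputs the ReLU gradient stays at 1 while the sigmoid gradient is close to 0 (it never exceeds 0.25), which is why stacking many sigmoid layers tends to shrink gradients during training.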
To find the best parameter values and tune the weights for accurate results, many iterations of trial and error were performed.
The layered configuration is shown in table 3 below.

Table 3 Layer Configuration of Proposed CNN

Basic CNN Model Configuration

Layers               Layer Shape Details
optimizer            adam
loss                 sparse_categorical_crossentropy
metrics              ['accuracy']
Convolution Layer    32 filters, 5x5 kernel and ReLU
Max-Pooling          2x2 kernel
Convolution          64 filters, 3x3 kernel and ReLU
Max-Pooling          2x2 kernel
Conv2D Layer         64 filters, 3x3 kernel and ReLU
Fully connected      128 Neurons
Fully connected      10 Neurons
Output layer         Softmax 10 classes

There is no need for dropout because the model does not suffer from overfitting. We have also added one more Conv2D layer for better accuracy. The final (output) layer contains 10 classes, as per our image categorization.
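The layer stack in Table 3 can be checked arithmetically. The sketch below walks a 120 x 320 x 1 input (the 320*120 grey-level image, written as height x width x channels) through the layers and counts weights and biases. It assumes valid convolutions with stride 1, and it assumes a third 2x2 max-pooling layer after the last convolution, which appears in the model summary of Section 4 but is not listed in Table 3.

```python
def conv2d(shape, filters, kernel):
    """Valid convolution, stride 1: return (output shape, parameter count)."""
    h, w, d = shape
    out = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * d * filters + filters  # weights + biases
    return out, params


def max_pool(shape, size=2):
    """2x2 max pooling halves height and width and has no parameters."""
    h, w, d = shape
    return (h // size, w // size, d), 0


def dense(units_in, units_out):
    """Fully connected layer: return (output units, parameter count)."""
    return units_out, units_in * units_out + units_out


shape = (120, 320, 1)  # assumed input shape (height, width, channels)
total_params = 0
shapes = []
for kind, filters, kernel in [
    ("conv", 32, 5), ("pool", None, None),
    ("conv", 64, 3), ("pool", None, None),
    ("conv", 64, 3), ("pool", None, None),  # third pool: taken from the summary
]:
    if kind == "conv":
        shape, params = conv2d(shape, filters, kernel)
    else:
        shape, params = max_pool(shape)
    shapes.append(shape)
    total_params += params

units = shape[0] * shape[1] * shape[2]  # flatten
flat_units = units
for units_out in (128, 10):
    units, params = dense(units, units_out)
    total_params += params

print(shapes)
print(flat_units, total_params)
```

Under these assumptions the flattened vector has 31616 values and the total is 4,104,522 parameters, the same figures reported in the model summary. Almost all of them belong to the first fully connected layer.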

4. Implementation of Proposed CNN Model

In this section we describe the implementation of our proposed model. The dataset used for learning hand gestures is taken from Kaggle [1]. The image dataset was captured using a Leap Motion sensor and consists of infrared images. The specifications of the image dataset are given in table 4:

Table 4 – LeapGestRecog dataset specification

Total Images in dataset                  20000
Total object (persons) participated      10 (05 Male & 05 Female)
Total Hand Gesture Types recorded        10 (given in table 1)
Total Classes                            10
Raw Image Dimension                      640*240
Total Images used for our Model          2010
No. of Images used for Training          1407
No. of Images used for Testing           603
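The training and testing counts in Table 4 correspond to a 70/30 split of the 2010 images. A minimal sketch of such a split is shown below. The shuffling, the seed and the function name split_indices are assumptions, since the paper does not describe how the split was drawn.

```python
import random


def split_indices(n_items, test_percent=30, seed=0):
    """Shuffle item indices and reserve test_percent of them for testing."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed for a repeatable split
    n_test = n_items * test_percent // 100
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(2010)
print(len(train_idx), len(test_idx))
```

Integer arithmetic is used for the test-set size so that the counts do not depend on floating-point rounding.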

Due to resource limitations, 10% of the images (2000 sample images) are used for training our model. Some randomly selected images are shown in figure 3.

Figure 3: Randomly selected images from training sample

The summary of the model trained on the sampled images is given here. The layer configuration details are explained in table 3.

Model: "sequential_1"
_________________________________________________________________
Layer (type)                    Output Shape             Param #
=================================================================
conv2d_1 (Conv2D)               (None, 116, 316, 32)     832
_________________________________________________________________
max_pooling2d_1 (MaxPooling2    (None, 58, 158, 32)      0
_________________________________________________________________
conv2d_2 (Conv2D)               (None, 56, 156, 64)      18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2    (None, 28, 78, 64)       0
_________________________________________________________________
conv2d_3 (Conv2D)               (None, 26, 76, 64)       36928
_________________________________________________________________
max_pooling2d_3 (MaxPooling2    (None, 13, 38, 64)       0
_________________________________________________________________
flatten_1 (Flatten)             (None, 31616)            0
_________________________________________________________________
dense_1 (Dense)                 (None, 128)              4046976
_________________________________________________________________
dense_2 (Dense)                 (None, 10)               1290
=================================================================
Total params: 4,104,522
Trainable params: 4,104,522
Non-trainable params: 0

The proposed model contains 4,104,522 parameters and using these parameters our model is able to produce good accuracy.

5. Experimental Results

In this section, we present the results produced by our proposed model. The following statistical parameters are evaluated to test the actual performance of the model and to compare it with the conventional model.

Precision: also known as PPV (Positive Predictive Value), it is the fraction of correctly predicted positive values out of all values predicted as positive, as given in equation (2).

Precision = TP / (TP+FP) ……. (2)

Recall: as given in equation (3), it is the ratio of correct positive predictions to the total number of actual positive class values. It is also referred to as sensitivity.

Recall = TP / (TP+FN) …… (3)

F1-score: the F1-score is calculated by equation (4). It is the harmonic mean of recall and precision, so it increases with both: if recall and precision are high, then the F1-score is also high. The value of F1 lies in the range [0,1]. The closer the F1-score is to 1, the better the model's classification results.

F1 = 2 * [(Precision * Recall) / (Precision + Recall)] …. (4)

Accuracy: as given in equation (5), it is the percentage of correct classification predictions.

Accuracy = 100 * [(TP+TN) / (TP+TN+FP+FN)] .. (5)

where TP = True Positive, TN = True Negative, FP = False Positive and FN = False Negative.

TP, TN, FP and FN are determined from the following matrix.

                          Actual Positive (1)     Actual Negative (0)
Predicted Positive (1)    True Positive (TP)      False Positive (FP)
Predicted Negative (0)    False Negative (FN)     True Negative (TN)

The confusion matrix for our proposed model is given in table 5. Columns represent predicted results and rows represent actual images.

Table 5 – Confusion matrix for Proposed CNN Model

Predicted Images (Labels are used as per table 1)

Actual Images    00  01  02  03  04  05  06  07  08  09
00               72   0   0   0   0   0   0   0   0   0
01                0  53   0   0   0   0   0   0   0   0
02                0   0  53   0   0   0   0   0   0   0
03                0   0   0  48   0   0   0   0   0   0
04                0   0   0   0  54   0   0   0   0   0
05                0   0   0   0   0  58   0   0   0   0
06                0   0   0   0   0   0  65   0   0   0
07                0   0   0   0   0   0   0  73   0   0
08                0   0   0   0   0   0   0   0  69   0
09                0   0   0   0   0   0   0   0   0  58
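Equations (2) to (4) can be applied to each class of Table 5 by treating that class as positive and all others as negative. The sketch below does this for the matrix as printed, and computes accuracy in its usual multi-class form (correct predictions divided by all predictions). The function names are illustrative.

```python
# Diagonal of Table 5; every off-diagonal entry in the printed matrix is 0.
DIAGONAL = [72, 53, 53, 48, 54, 58, 65, 73, 69, 58]
matrix = [[DIAGONAL[i] if i == j else 0 for j in range(10)] for i in range(10)]


def per_class_metrics(cm, k):
    """Precision, recall and F1 for class k (rows: actual, columns: predicted)."""
    tp = cm[k][k]
    fp = sum(cm[i][k] for i in range(len(cm))) - tp  # predicted k, actually other
    fn = sum(cm[k]) - tp                             # actually k, predicted other
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def accuracy(cm):
    """Percentage of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return 100.0 * correct / total


total_images = sum(sum(row) for row in matrix)
print(total_images, accuracy(matrix))
for k in range(10):
    print("%02d" % k, per_class_metrics(matrix, k))
```

The entries of Table 5 sum to 603, the number of test images in Table 4, and because the printed matrix has no off-diagonal entries every class evaluates to precision, recall and F1 of 1.0.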

The performance comparison of the conventional CNN and the proposed CNN is given in table 6. We compare the values of the performance parameters listed in the first column of table 6.

Table 6 – Comparison of conventional CNN and Proposed CNN

Parameters       Conventional CNN, Ref. [6]    Proposed CNN
No. of Epoch     10                            05
Precision        0.96                          0.98
Recall           0.96                          0.99
F1 Score         0.96                          0.99
Accuracy (%)     99.73                         99.95

As shown in table 6 above, the results improve with our proposed CNN model. The accuracy and loss on the training and validation data are shown in figure 4.

Train on 1407 samples, validate on 603 samples

Epoch 1/5 - 90s - loss: 8.8307 - accuracy: 0.7342 - val_loss: 0.0164 - val_accuracy: 0.7832

Epoch 2/5 - 88s - loss: 0.0065 - accuracy: 0.9993 - val_loss: 0.0016 - val_accuracy: 0.8235

Epoch 3/5 - 88s - loss: 0.0013 - accuracy: 0.9993 - val_loss: 0.0015 - val_accuracy: 0.9352

Epoch 4/5 - 87s - loss: 2.7636e-04 - accuracy: 1.0000 - val_loss: 0.0017 - val_accuracy: 0.9573

Epoch 5/5 - 93s - loss: 1.4641e-04 - accuracy: 1.0000 - val_loss: 0.0020 - val_accuracy: 0.9983

(a)

(b) [Plot: "Proposed Model Accuracy", accuracy versus epochs, with series for training accuracy and validation accuracy (val_Acc).]

(c) [Plot: "Proposed Model Loss", loss versus epochs.]

Figure 4: (a) Model training output (b) Accuracy of proposed CNN on train data and validation data (c) Loss in proposed CNN model

The results shown in figure 4 (a) and (b) demonstrate the improvement over the conventional CNN. The accuracy is almost 100% and the loss is very close to 0.
To test the prediction output, we gave 12 images as test input to the new model and matched the output against the labeled data. The results are shown in figure 5. The output generated by the trained model is plotted using the matplotlib library of Python.

Figure 5: Hand Gesture prediction using Proposed CNN Model

As we can see, the title given to each image matches the label assigned to that input image. All 12 test images are predicted correctly (100%).

6. Application Areas

A hand gesture recognition model will help the technical community control automated machines with hand movements. There are many sectors where such a system can be used:

• Healthcare Sector
• Gaming and Animation Sector
• Automobile Sector
• Education Sector for Smart Classes
• Gesture based Mobile/Computer accessible systems for differently abled people
• Defense Sector
• Etc.
In figure 6 (a) and 6 (b) we show some real-time applications of hand gesture recognition based control systems.

Fig. 6. (a) Image depth is used for recognition of a surgeon’s hand gestures. Image source [8]. Background image is from [10].
(b) Example of hand gesture recognition using Kinect gestures (Microsoft, 2019). DOI: 10.7717/peerjcs.218/fig-5 [13]

In figure 6 (a), a doctor uses hand gestures to control an X-ray machine installed at a remote location. This application eliminates the need for the doctor to be physically present at every site: a doctor can control remotely installed medical equipment from a single place. It increases the availability of a doctor across many places at the same time and reduces unnecessary travel time.
Figure 6 (b) demonstrates the use of a hand recognition system in gaming and animation. The movement of game characters or gaming equipment can be controlled using hand gestures, so the player feels physically present as a character in the game.
There are many more application areas where hand recognition systems have proved successful and helped eliminate various bottlenecks. They will become more widely applicable as better models are developed.

7. Conclusion

Hand gesture based sign language is a popular and much-needed language in many fields. For much of the differently abled community it is the only means of communication, which hampers their interaction with computers. This system has the potential to minimize this communication barrier by working as a mediator between the computer and the person, converting sign language into a computer-understandable language. With higher accuracy, the gesture recognition model can be used to control the whole computer.
In this paper we have proposed some improvements to the architecture of the basic CNN model. We have used the ReLU and Softmax activation functions for effective classification. The accuracy achieved by the proposed model is almost 100%. The confusion matrix shown in table 5 shows that the proposed model classified all tested images to their respective labels. The model recognized the hand gestures passed to it in the test data. It is fast and takes less processing time than the basic CNN.
Future work will extend the model to more types of hand gestures. Different datasets will be used for more types of gesture classification. The work will also be extended to provide accurate results on noisy gesture images affected by intensity fluctuations, capturing angle and other noise factors. Due to resource limitations we have currently used a small number of images (10% of the whole dataset), but in future we will implement this on high-performance cloud computing resources for the complete dataset.

8. References

[1] T. Mantecón, C.R. del Blanco, F. Jaureguizar, N. García, “Hand Gesture Recognition using Infrared Imagery Provided by Leap Motion Controller”, Int. Conf. on Advanced Concepts for Intelligent Vision Systems, ACIVS 2016, Lecce, Italy, pp. 47-57, 24-27 Oct. 2016. (doi: 10.1007/978-3-319-48680-2_5).
[2] S. Mitra, T. Acharya, “Gesture Recognition: A Survey”, IEEE Trans. Systems Man and Cybernetics Part C: Applications and Rev., vol. 37, no. 3, pp. 311-324, May 2007.
[3] Kumar Sharma, H., Kshitiz, K., Shailendra, “NLP and Machine Learning Techniques for Detecting Insulting Comments on Social Networking Platforms”, ICACCE 2018.
[4] A. Krizhevsky, I. Sutskever, and G. Hinton, “Imagenet classification with deep convolutional neural networks”, In NIPS, 2012.
[5] H.-J. Kim, J. S. Lee, and J.-H. Park, “Dynamic hand gesture recognition using a CNN model with 3D receptive fields”, 2008 International Conference on Neural Networks and Signal Processing, 2008.
[6] Ali A. Alani et al., “Hand Gesture Recognition Using an Adapted Convolutional Neural Network with Data Augmentation”, 24th IEEE International Conference on Information Management, 2018.
[7] P. Molchanov, S. Gupta, K. Kim, and J. Kautz, “Hand gesture recognition with 3D convolutional neural networks”, 2015 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2015.
[8] Ebrahim Nasr-Esfahani et al., “Hand Gesture Recognition for Contactless Device Control in Operating Rooms”, http://arxiv.org/abs/1611.04138
[9] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions”, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
[10] Progesoftware | Proge- Software, http://www. Proge software.it/
[11] HK Sharma, A Shastri, R Biswas, “A Framework for Automated Database Tuning Using Dynamic SGA Parameters and Basic Operating System Utilities”, Database System Journal, Dec 2013.
[12] HK Sharma, A Shastri, R Biswas, “Auto-selection and management of dynamic SGA parameters in RDBMS”, Database System Journal, Dec 2015.
[13] Mais Yasen, Shaidah Jusoh, “A systematic review on hand gesture recognition techniques, challenges and applications”, PeerJ Computer Science, Sep 2019.
[14] https://www.electronicdesign.com/markets/automotive/article/21807672/automotive-gesture-recognitionthe-next-level-in-road-safety
[15] Nico Zengeler et al., “Hand Gesture Recognition in Automotive Human–Machine Interaction Using Depth Cameras”, Sensors 2019, 19(1), 59; https://doi.org/10.3390/s19010059

No ratings yet
Mamdani and Sugeno
9 pages
Design & Train Neural Network For AND, OR Gate Using Perceptron.
0% (1)
Design & Train Neural Network For AND, OR Gate Using Perceptron.
5 pages
Fuzzy Logic Seminar
No ratings yet
Fuzzy Logic Seminar
22 pages
Kochar Inderkumar Asst. Professor MPSTME, Mumbai
No ratings yet
Kochar Inderkumar Asst. Professor MPSTME, Mumbai
66 pages
Applicationof Deep Learningusing Convolutional Neural
No ratings yet
Applicationof Deep Learningusing Convolutional Neural
8 pages
U18Ini5600 - Engineering Cilincs - V Project Report
No ratings yet
U18Ini5600 - Engineering Cilincs - V Project Report
14 pages
Machine Learning - Advanced Concepts
From Everand
Machine Learning - Advanced Concepts
Derrick Mwiti
No ratings yet
Semester V (2022-25)
No ratings yet
Semester V (2022-25)
1 page
Giancoli Chap 3 Vectors Kinematics in 2 Dimensions
No ratings yet
Giancoli Chap 3 Vectors Kinematics in 2 Dimensions
37 pages
Status and Prospects For Helicopter Apus in Russia Gavrilov V.V., Ponomarev B.A
No ratings yet
Status and Prospects For Helicopter Apus in Russia Gavrilov V.V., Ponomarev B.A
16 pages
Royal Park Property Development Limited
No ratings yet
Royal Park Property Development Limited
7 pages
Modul Session 12 Akuntasi Feb
No ratings yet
Modul Session 12 Akuntasi Feb
26 pages
Blown Film
0% (1)
Blown Film
4 pages
Climate Change
No ratings yet
Climate Change
5 pages
Dr. Richard Felder and Dr. Rebecca Brent Part 3
No ratings yet
Dr. Richard Felder and Dr. Rebecca Brent Part 3
3 pages
Incredible India Quiz Prelims: Avant-Garde Essence 2018
No ratings yet
Incredible India Quiz Prelims: Avant-Garde Essence 2018
86 pages
ERQ Marked Samples Discuss The Role That One Cultural Dimension May Have On Behaviour.
No ratings yet
ERQ Marked Samples Discuss The Role That One Cultural Dimension May Have On Behaviour.
4 pages
5 6089131777291453670
100% (1)
5 6089131777291453670
70 pages
What Is Capacity Planning
No ratings yet
What Is Capacity Planning
6 pages
Pas 7 - Statement of Cash Flows
No ratings yet
Pas 7 - Statement of Cash Flows
8 pages
Chapter 3 Abstract
No ratings yet
Chapter 3 Abstract
19 pages
Silent 50HZ 128KW 160kva Cummins 6cta8.3 G1 Engine Diesel Generator Set
No ratings yet
Silent 50HZ 128KW 160kva Cummins 6cta8.3 G1 Engine Diesel Generator Set
4 pages
NRC, Logistics Officer, Cover Letter & CV, Elhamfrotan.
No ratings yet
NRC, Logistics Officer, Cover Letter & CV, Elhamfrotan.
4 pages
Cimplicity 1
No ratings yet
Cimplicity 1
119 pages
GSTR1 Excel Workbook Template V1.4
No ratings yet
GSTR1 Excel Workbook Template V1.4
84 pages
Fis2014 Fish Biology Individual Assignment (50 Marks)
No ratings yet
Fis2014 Fish Biology Individual Assignment (50 Marks)
2 pages
Sabino 2017
No ratings yet
Sabino 2017
15 pages
Ankita Saikia Resume
No ratings yet
Ankita Saikia Resume
2 pages
Project: K028-Fayyhealth-Polyclinic Risk Assessment For Slab Coring Work
No ratings yet
Project: K028-Fayyhealth-Polyclinic Risk Assessment For Slab Coring Work
5 pages
Probability Althea
No ratings yet
Probability Althea
8 pages
ĐỀ THI THỬ SỐ 10 - Khóa Đề
No ratings yet
ĐỀ THI THỬ SỐ 10 - Khóa Đề
6 pages
FSSAI - Internship Portal
No ratings yet
FSSAI - Internship Portal
3 pages
Epic Failures in DevSecOps V1
No ratings yet
Epic Failures in DevSecOps V1
156 pages
Digital Electronics and Communication Systems: Curriculum
No ratings yet
Digital Electronics and Communication Systems: Curriculum
83 pages
Railway Traning Report
100% (2)
Railway Traning Report
45 pages

Note: This service is not intended for secure transactions such as banking, social media, email, or purchasing. Use at your own risk. We assume no liability whatsoever for broken pages.

Deep Learning Based Accurate Hand Gesture Recognition Using Enhanced CNN Model

JOURNAL OF CRITICAL REVIEWS

ISSN- 2394-5125 VOL 7, ISSUE 18, 2020

Received: 16 March 2020 Revised and Accepted: 16 June 2020

3. Proposed CNN model for Hand Gesture Recognition

3.1. Convolutional Neural Network (CNN)

y = a*x + b    (1)

Figure 1 – Hidden layers in Deep Learning based CNN Model.

This uses a mathematical operation for matrix multiplication.

- An image matrix (volume) of dimension (h x w x d)
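
To make the (h x w x d) notation concrete, the following is a minimal sketch of how one filter could be slid over an image volume. The function name `conv2d_valid`, the toy sizes, the stride of 1 and the absence of padding are assumptions made for illustration; they are not taken from the paper. As in most CNN libraries, the filter is applied without flipping.

```python
def conv2d_valid(image, kernel):
    """Slide one (fh x fw x d) kernel over an (h x w x d) image.

    Assumes stride 1 and no padding, so the output is
    (h - fh + 1) x (w - fw + 1).
    """
    h, w, d = len(image), len(image[0]), len(image[0][0])
    fh, fw = len(kernel), len(kernel[0])
    out_h, out_w = h - fh + 1, w - fw + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            total = 0.0
            # Multiply the kernel with the image patch element by element
            # and sum over height, width and depth.
            for u in range(fh):
                for v in range(fw):
                    for c in range(d):
                        total += image[i + u][j + v][c] * kernel[u][v][c]
            output[i][j] = total
    return output


# Toy 4 x 5 x 1 image of ones and a 3 x 3 x 1 kernel of ones (assumed values).
image = [[[1.0] for _ in range(5)] for _ in range(4)]
kernel = [[[1.0] for _ in range(3)] for _ in range(3)]
feature_map = conv2d_valid(image, kernel)
print(len(feature_map), len(feature_map[0]))  # 2 3
```

Each output value here is 9.0, because every 3 x 3 patch of ones is summed against a kernel of ones.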

3.2. Hand Gesture Recognition Dataset Specifications

Table 1 - Classification used for every hand gesture.

3.3. Conventional CNN Model Architecture

Table 2 – Layer Configuration of Conventional CNN [ref. 6]

Basic CNN Model Configuration

Max-Pooling Layer 2x2 kernel size
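
The effect of a 2x2 max-pooling layer can be sketched in a few lines. The helper name `max_pool_2x2` and the stride of 2 are assumptions (a stride equal to the pool size is a common default, but the table only gives the kernel size), and the input values are made up for the example.

```python
def max_pool_2x2(feature_map):
    """Apply 2x2 max pooling with stride 2 to a 2-D list.

    A trailing odd row or column is dropped.
    """
    out_h, out_w = len(feature_map) // 2, len(feature_map[0]) // 2
    return [
        [
            max(
                feature_map[2 * i][2 * j],
                feature_map[2 * i][2 * j + 1],
                feature_map[2 * i + 1][2 * j],
                feature_map[2 * i + 1][2 * j + 1],
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Made-up 4 x 4 feature map.
fm = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [7, 0, 3, 3],
    [1, 6, 2, 8],
]
pooled = max_pool_2x2(fm)
print(pooled)  # [[4, 5], [7, 8]]
```

Pooling halves each spatial dimension here, which is why it reduces the computation needed by later layers.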

3.4. Proposed CNN Model Architecture

Figure 2: Proposed CNN model for Hand Gesture Recognition

Table 3 – Layer Configuration of Proposed CNN

Basic CNN Model Configuration

Layers Layer Shape Details

Convolution 64 filters, 3x3 kernel and ReLU

Conv2D 64 filters, 3x3 kernel and ReLU
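
As a rough guide to what two stacked "64 filters, 3x3 kernel" layers imply, the sketch below computes the output shape and the number of trainable parameters of such a layer. The helper `conv2d_summary`, the 64 x 64 x 1 input size, the stride of 1 and the lack of padding are assumptions for illustration only; the source rows shown here do not state them.

```python
def conv2d_summary(input_shape, filters=64, kernel_size=3):
    """Return (output_shape, parameter_count) for one convolution layer.

    Assumes stride 1, no padding, and one bias per filter.
    """
    h, w, d = input_shape
    out_shape = (h - kernel_size + 1, w - kernel_size + 1, filters)
    # Each filter has kernel_size * kernel_size * d weights plus one bias.
    params = (kernel_size * kernel_size * d + 1) * filters
    return out_shape, params


# Hypothetical grayscale input of 64 x 64 pixels.
shape1, params1 = conv2d_summary((64, 64, 1))
shape2, params2 = conv2d_summary(shape1)
print(shape1, params1)  # (62, 62, 64) 640
print(shape2, params2)  # (60, 60, 64) 36928
```

The ReLU activation adds no parameters, so the counts depend only on the kernel size, the input depth and the number of filters.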

4. Implementation of Proposed CNN Model

Table 4 – LeapGestRecog dataset specification

Total Images in dataset: 20000

Raw Image Dimension: 640 x 240

No. of Images used for Training: 1407

No. of Images used for Testing: 603
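
The training and testing counts in Table 4 add up to 2010 images and correspond to a 70% / 30% split. The sketch below reproduces those counts; the random shuffle, the seed and the helper name `split_indices` are assumptions, since the listing does not say how the images were selected or divided.

```python
import random


def split_indices(n_images, train_percent=70, seed=0):
    """Shuffle image indices and split them into train and test lists."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding of the split point.
    n_train = n_images * train_percent // 100
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(1407 + 603)
print(len(train_idx), len(test_idx))  # 1407 603
```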

Figure 3: Randomly selected images from training sample

Precision = TP / (TP + FP)    (2)

Recall = TP / (TP + FN)    (3)

F1 = 2 * [(Precision * Recall) / (Precision + Recall)]    (4)

Accuracy = 100 * [(TP + TN) / (TP + TN + FP + FN)]    (5)
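
Equations (2) to (5) translate directly into code. The following is a minimal sketch; the function name `classification_metrics` and the counts passed to it are hypothetical and are not results from the paper. It does not guard against division by zero, which would occur if a class is never predicted or never present.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics of equations (2)-(5) from the four counts."""
    precision = tp / (tp + fp)                            # equation (2)
    recall = tp / (tp + fn)                               # equation (3)
    f1 = 2 * (precision * recall) / (precision + recall)  # equation (4)
    accuracy = 100 * (tp + tn) / (tp + tn + fp + fn)      # equation (5), in percent
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts chosen only to exercise the formulas.
metrics = classification_metrics(tp=90, tn=85, fp=10, fn=15)
print(metrics["accuracy"])  # 87.5
```

Note that accuracy is expressed as a percentage, while precision, recall and F1 stay in the range 0 to 1, matching the way the equations are written.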

TP, TN, FP and FN are determined from the following confusion matrix.

Table 5 – Confusion matrix for Proposed CNN Model
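
For a multi-class problem such as gesture recognition, TP, TN, FP and FN are usually read off the confusion matrix one class at a time. The sketch below shows one common way to do this; the convention that rows are true classes and columns are predicted classes, the helper names, and the toy labels are all assumptions and do not come from the paper.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Build a matrix where entry [t][p] counts samples of true class t predicted as p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def per_class_counts(matrix, cls):
    """Return (TP, TN, FP, FN) for one class, treating all other classes as negative."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[cls][cls]
    fn = sum(matrix[cls]) - tp                 # true class cls, predicted as something else
    fp = sum(row[cls] for row in matrix) - tp  # predicted cls, but true class differs
    tn = total - tp - fn - fp
    return tp, tn, fp, fn


# Toy labels for three gesture classes (made-up data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
print(cm)                       # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(per_class_counts(cm, 0))  # (1, 3, 1, 1)
```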

Table 6 – Comparison of conventional CNN and Proposed CNN

Recall: 0.96 (conventional CNN), 0.99 (proposed CNN)

Train on 1407 samples, validate on 603 samples

[Plots: Proposed Model Accuracy; Proposed Model Loss]

Figure 5: Hand Gesture prediction using Proposed CNN Model
