A New Benchmark On American Sign Language Recognition Using Convolutional Neural Network
All content following this page was uploaded by Md Moklesur Rahman on 17 April 2020.
Abstract—The hearing-impaired (deaf) people use a set of signs, called a sign language, instead of speech for communication among themselves. However, it is very challenging for non-signers to communicate with this community using signs. It is therefore necessary to develop applications that recognize the gestures or actions of sign languages to ease communication between the hearing and the deaf communities. American Sign Language (ASL) is one of the most widely used sign languages in the world, and despite its importance, the existing methods for ASL recognition achieve only limited accuracy. The objective of this study is to propose a novel model that enhances the accuracy of the existing methods for ASL recognition. The study has been performed on the alphabet and numerals of four publicly available ASL datasets. After preprocessing, the images of the alphabet and numerals were fed to a newly proposed convolutional neural network (CNN) model, and the performance of this model in recognizing the numerals and alphabet of these datasets was evaluated. The proposed CNN model significantly (by 9%) improves the recognition accuracy of ASL reported by some existing prominent methods.

Index Terms—Hand gesture, American Sign Language, Convolutional neural network, Recognition, ASL.

I. INTRODUCTION

According to the World Health Organization (WHO) [1], the number of people with a hearing disability increased from 278 million in 2005 to 466 million in early 2018, and this number is expected to increase further by 2050 [1]. This deaf community uses a set of signs to express their language (called a sign language), which differs from nation to nation. In other words, a sign language (SL) is a nonverbal communication language that utilizes visual sign patterns made with the hands or other parts of the body, used primarily by people with a hearing disability. Sign languages (SLs) are full-fledged natural languages with their own lexicon and grammar. Different SLs, such as American Sign Language (ASL), Australian Sign Language, British Sign Language (BSL), Danish Sign Language, French Sign Language, and many others, have been developed for deaf communities. Although there are some striking similarities among the SLs, they are not mutually intelligible or universal. For example, ASL and BSL are different, even though both communities share the same verbal language. Hearing people find it extremely difficult to understand even the sign language of their own nation. Hence, trained SL interpreters are needed during medical and legal appointments, educational and training sessions, etc. The automatic recognition of an SL and its translation into a natural language can establish a proper communication interface between the hearing-impaired and hearing people.

ASL also predominates as a second language in the deaf communities of the United States and Canada [2]. According to the National Association of the Deaf (NAD) [3] in the United States of America, ASL is accepted by many high schools, colleges, and universities in fulfillment of modern and foreign language academic degree requirements. Besides North America, ASL is also used in many countries across the world, including parts of Southeast Asia and much of West Africa.

There are some works [4]–[6] already reported in the literature for the automatic recognition of ASL. Some of these methods have been studied on sample datasets with only a few samples, and some use the traditional shallow neural network approach for classification. Shallow neural networks require manual identification and selection of relevant features. The use of deep learning (DL) techniques for machine learning problems has significantly improved on the performance of traditional shallow neural networks, especially for image recognition and computer vision problems. DL is a subfield of machine learning in artificial intelligence (AI). It is a set of algorithms and models with high-level abstractions, realized through architectures composed of multiple nonlinear transformations. DL algorithms utilize huge amounts of data to extract features automatically, aiming to emulate the human brain's ability to learn, analyze, observe, and make inferences, especially for extremely difficult problems. DL architectures create relationships beyond immediate neighbors in the data, generate learning patterns, and extract representations directly from the data without human intervention. There are different deep
In this paper, four separate datasets of ASL have been utilized to analyze the performance of the proposed method. The Massey University Gesture dataset [6], called MU HandImages ASL, contains standard ASL hand gestures and consists of 2425 images in PNG format from 5 individuals. The sign language digit dataset [15] was collected from 218 students of Ankara Ayranci Anadolu High School, Turkey. There are 10 samples of each digit collected from each subject. The third dataset that we considered in this study is the ASL finger spelling dataset [5], collected by the Center for Vision, Speech and Signal Processing group at the University of Surrey, UK. The samples of this dataset are divided into color images and depth images. In our work, we consider only the color images, consisting of 24 static signs of the ASL alphabet (excluding the letters J and Z), which were acquired from 5 individuals in different sessions with similar lighting and backgrounds. The dataset contains over 65,000 images of the ASL alphabet. The fourth and last dataset that we considered is the ASL Alphabet dataset [16], which consists of 87,000 samples of 29 classes (26 for the letters A–Z and three special classes: delete, nothing, space). A few samples from this dataset are shown in Fig. 1.

In this study, the raw images are transformed into grayscale images. The gray levels of the input images are normalized by the maximum value of the gray-level range. The use of low-resolution images provides faster training without too much impact on the recognition rate. The images are resized to 64×64 pixels.

B. CNN Model Description

The architectural design of a CNN contributes to optimal performance through a proper selection of the convolution layers and the number of neurons. There are no universally accepted standard guidelines for selecting the number of neurons and convolution layers. Here, we have proposed a CNN architecture (which we call SLRNet-8) that maximizes the recognition accuracy. Our proposed SLRNet-8 consists of six convolution layers, three pooling layers, and a fully connected layer besides the input and output layers. The major steps of the proposed ASL recognition method are described in Fig. 3.
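The preprocessing described above (grayscale conversion, normalization by the maximum gray level, and resizing to 64×64) can be sketched in NumPy. The luminance weights and nearest-neighbour resizing below are our assumptions for illustration; the paper does not specify which conversion or resizing method was used:

```python
import numpy as np

def preprocess(image: np.ndarray, size: int = 64) -> np.ndarray:
    """Convert an RGB image to a normalized size x size grayscale image."""
    # Grayscale via standard luminance weights (assumed, not from the paper).
    gray = image[..., :3] @ np.array([0.299, 0.587, 0.114])
    # Normalize by the maximum of the gray-level range (255 for 8-bit images).
    gray = gray / 255.0
    # Nearest-neighbour resize to size x size (assumed resizing method).
    rows = (np.arange(size) * gray.shape[0] / size).astype(int)
    cols = (np.arange(size) * gray.shape[1] / size).astype(int)
    return gray[np.ix_(rows, cols)]

img = np.random.randint(0, 256, (100, 120, 3), dtype=np.uint8)
x = preprocess(img)  # shape (64, 64), values in [0, 1]
```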
Input Layer
The pre-processed images are fed directly to the network through its input layer. There are 4096 nodes in the input layer, each corresponding to one pixel of the image at resolution 64×64.
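As a trivial sanity check on these sizes (not code from the paper), a 64×64 image maps onto exactly 4096 input nodes:

```python
import numpy as np

# A preprocessed 64x64 grayscale image: one input node per pixel.
image = np.zeros((64, 64))
input_nodes = image.reshape(-1)
print(input_nodes.size)  # 4096
```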
Convolution Layer
Convolution is the first layer for extracting features from the input data and serves as a basic building block of the CNN. In the convolution layer, the kernels extract the salient features from the input data through forward and backward propagation. In our study, this operation is performed by shifting filters of dimension 3×3 and 5×5 over the input data matrix. At every shift, it executes element-wise matrix multiplications and then aggregates the results into a feature map.

[Fig. 2: Methodological steps of the proposed ASL recognition system.]

The number of kernels used in the convolutional layers may affect the performance of a CNN model. There are no standard guidelines for choosing the number of kernels in a convolution layer. In this work, we performed experiments using different numbers of kernels from 32 to 512 at different step sizes, and finally selected the combination which maximized the accuracy. A batch normalization (BN) [17] layer, which is responsible for accelerating the training process and reducing the internal covariate shift, follows some of the convolution layers.

Activation Function
In a CNN architecture, activation functions decide which nodes should fire at a given time. We applied the ReLU [18] activation function, which substitutes all negative values with 0 and leaves the positive values unchanged. The selection of ReLU was motivated by the learning time of the model. In training, ReLUs tend to be several times faster [19] than their equivalents (softplus, tanh, sigmoid), and ReLU can diminish the vanishing gradient problem. The ReLU function is expressed by:

ReLU(y) = max(0, y),   (1)

where y refers to the input to a neuron.

Pooling Layer
Pooling is a significant concept in deep learning. It makes the training of a CNN faster and reduces the memory size of the network by reducing the linkages between the convolutional layers. Here, we have used the max-pooling operation for this purpose. Max-pooling slides a window across the input space and outputs the largest value within that window. We opted for a 2×2 window size for the max-pooling operation. To avoid overlapping windows, the stride was set to 2. The resulting dimension of the max-pooling operation can be calculated by the following equation:

Nout = floor((Nin − F) / S) + 1,   (2)

where Nin, F, and S refer to the size of the input image, kernel, and stride, respectively.

Global Average Pooling Layer
The global average pooling (GAP) layer is very similar to the max-pooling layer. The only difference is that the entire area is replaced by its average value instead of the maximum value. The GAP layer drastically reduces the dimension: a tensor of size height×width×depth is reduced to 1×1×depth.

Fully Connected Layer
The feature map generated by the GAP layer is fed into the fully connected (FC) layer. In an FC layer, the neurons in one layer are connected to the neurons in another layer. The FC layer also behaves like a convolution layer with a filter of size 1×1.

Dropout Layer
Dropout is a regularization technique that sets input elements to zero with a given probability in a random manner. The over-fitting problem occurs when a model's training accuracy is much higher than its testing accuracy. In CNN models, a dropout layer following the FC layer helps prevent the over-fitting problem and enhances the performance [20], [21] by setting activations to zero at random during the training process. The dropout probability used in this study was 0.5.

Output Layer
The output of the classification model, i.e., the prediction of a class with a certain probability, is obtained at this layer. Note that the target class should have the highest probability. We set the number of neurons in the output layer equal to the number of categories. In the case of a multiclass classification problem, the Softmax function returns the probability of each class, where the target class will have the highest probability. The mathematical expression for the Softmax function is given by:

σ(x)_i = exp(x_i) / Σ_{k=1}^{K} exp(x_k),  for i = 1, 2, ..., K,   (3)

where x_i are the inputs to the Softmax layer from the previous FC layer and K is the number of classes.

V. TRAINING DETAILS

A. Data Augmentation
Data augmentation means increasing the number of samples as well as adding variations to the samples for better training. Traditional data augmentation techniques [22], [23] include rotation, scaling, shifting, and flipping. To keep the computational burden small, data augmentation is performed here by randomly changing the angle between −10° and 10°, zooming by 10%, and shifting by 10% in height and width. These parameters were chosen on a trial-and-error basis to provide optimum accuracy.

B. CNN Training
There were no distinct train or test sets for any of the datasets considered in this work. In this study, every dataset was randomly partitioned into K = 10 folds; K − 1 of them were used for training the model, and the remaining one was used for testing its performance. This process was repeated 10 times, and finally the average of all the accuracies was reported as the accuracy of the model. Here, we chose the cross-entropy cost function [24], and a gradient descent-based Adam optimizer [25] with a learning rate of 0.001 was selected. Our SLRNet-8 model was trained for up to 200 epochs with 64 steps per epoch and a batch size of 128. If the validation accuracy did not improve in six consecutive epochs, the learning rate was reduced to 75% of its previous value. We allowed early stopping, and training was halted if the validation loss did not improve for thirty consecutive epochs. The network weights were initialized with very small real numbers drawn from a normal distribution, with a weight decay rate of 1 × 10^−6. The model
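As a compact sanity check, ReLU (1), the pooling output size (2), and the Softmax (3) can be sketched in NumPy. This is a minimal illustration of the formulas, not the authors' code:

```python
import numpy as np

def relu(y):
    # Eq. (1): zero out negatives, keep positives unchanged.
    return np.maximum(0.0, y)

def pool_out(n_in, f, s):
    # Eq. (2): N_out = floor((N_in - F) / S) + 1
    return (n_in - f) // s + 1

def softmax(x):
    # Eq. (3); subtracting the max is a standard numerical-stability trick.
    e = np.exp(x - np.max(x))
    return e / e.sum()

# Three 2x2 max-poolings with stride 2, as in SLRNet-8:
# a 64-pixel side shrinks 64 -> 32 -> 16 -> 8.
n = 64
for _ in range(3):
    n = pool_out(n, 2, 2)
print(n)  # 8
```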
TABLE I: PERFORMANCE OF THE PROPOSED MODEL

Dataset               Category   Accuracy (%)
MU HandImages ASL     Digit      100
MU HandImages ASL     Alphabet   99.95
Sign Language Digits  Digit      99.90
ASL Alphabet          Alphabet   100
Finger Spelling       Alphabet   99.99
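The 10-fold evaluation protocol described in Sec. V-B can be sketched as follows. Here `evaluate` is a hypothetical stand-in for training SLRNet-8 on the training indices and measuring accuracy on the held-out fold; it is not part of the paper:

```python
import numpy as np

def kfold_mean_accuracy(n_samples, evaluate, k=10, seed=0):
    """Shuffle the sample indices, split them into k folds, train on
    k-1 folds, test on the remaining one, and average the accuracies."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    accs = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        accs.append(evaluate(train_idx, test_idx))
    return float(np.mean(accs))

# With a dummy evaluator, the protocol itself can be exercised:
mean_acc = kfold_mean_accuracy(2425, lambda train_idx, test_idx: 0.99)
```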