Vision-based continuous sign language recognition using multimodal sensor fusion

  • Original Paper
  • Evolving Systems

Abstract

Sign language is an indispensable means of communication for deaf people. Because most hearing people are unfamiliar with the language used by deaf people, an interpretation system that eases communication between deaf people and their social environment appears necessary. The main challenge in such a system is identifying each sign in continuous sign language video. This work therefore presents a computer-vision-based system for recognizing signs in continuous sign language video. The system has two main phases: sign word extraction and sign classification. The most challenging task in this process is separating sign words from the video sequence. For this purpose, we present a new algorithm that detects accurate word boundaries in continuous sign language video. Using hand shape and motion features, the algorithm extracts isolated signs from the video, and it shows better efficiency than other works in the literature. In the recognition phase, the extracted signs are classified using a Hidden Markov Model (HMM), which was adopted after testing other approaches such as Independent Bayesian Classifier Combination (IBCC). The system achieves promising performance, with recognition accuracy of 95.18% for one-hand gestures and 93.87% for two-hand gestures. Compared with systems that use only manual features, the proposed framework improves by 2.24% and 2.9% on one- and two-hand gestures respectively when head pose and eye gaze features are also employed. These results were obtained on a dataset containing 33 isolated signs.
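The recognition phase described above, in which each extracted sign is classified with an HMM, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the HMM topology, the feature encoding, or the training procedure. The sketch assumes that each segmented sign has already been quantized into a sequence of discrete symbols, that one HMM is kept per sign, and that the predicted sign is the one whose model gives the highest likelihood. The names `forward_log_likelihood`, `classify`, `MODELS`, `SIGN_A` and `SIGN_B`, and all probability values, are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log-likelihood of a non-empty discrete observation sequence under
    one HMM, computed with the scaled forward algorithm."""
    n_states = len(start)
    # Initialise with the start distribution and the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step through the transitions, then weight by
            # the probability of emitting the current symbol.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states))
                * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow and accumulate the log of the scale.
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def classify(obs, models):
    """Return the label whose HMM assigns the highest likelihood to obs."""
    return max(
        models,
        key=lambda label: forward_log_likelihood(obs, *models[label]),
    )


# Toy two-state left-to-right models over two symbols (0 and 1).
# Each entry is (start, transition matrix, emission matrix); values are
# illustrative assumptions, not parameters from the paper.
MODELS = {
    "SIGN_A": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "SIGN_B": ([1.0, 0.0], [[0.7, 0.3], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}

if __name__ == "__main__":
    print(classify([0, 0, 1, 1], MODELS))
    print(classify([1, 1, 0, 0], MODELS))
```

In a real system the symbols would come from the hand shape, motion, head pose and eye gaze features mentioned in the abstract, and the model parameters would be estimated from training data rather than written by hand.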

Author information

Corresponding author

Correspondence to Mohammed Jemni.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Jebali, M., Dakhli, A. & Jemni, M. Vision-based continuous sign language recognition using multimodal sensor fusion. Evolving Systems 12, 1031–1044 (2021). https://doi.org/10.1007/s12530-020-09365-y

  • DOI: https://doi.org/10.1007/s12530-020-09365-y
