A Feature Fusion Network for Skeleton-Based Gesture Recognition

You, Xiaowen; Gao, Qing; Gao, Hongwei; Ju, Zhaojie

doi:10.1007/978-981-99-6486-4_6

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 14268)

Included in the following conference series:

International Conference on Intelligent Robotics and Applications

Abstract

As human-computer interaction matures, expectations of it have grown: beyond the traditional requirement of precision, naturalness and comfort are increasingly sought. Gesture is an innate, highly intuitive mode of human communication and can serve as an effective means of natural human-computer interaction. This paper investigates dynamic gestures using their 3D skeletal information. Different cropping boxes are placed to generate a global dataset and a local dataset, according to whether they depend on the motion trajectory of the gesture. By analyzing the geometric features of skeletal sequences, a dual-stream 3D CNN framework (Double_C3D) is proposed that fuses the two streams at the feature level; it relies on 3D heat map video streams, which serve as the input to the network. Finally, the Double_C3D framework was evaluated on the SHREC dynamic gesture recognition dataset and the JHMDB dynamic behavior recognition dataset, achieving accuracies of 91.72% and 70.54%, respectively.
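The cropping idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the construction, so everything here is an assumption made for illustration: the "global" box is taken to be one box enclosing every joint across the whole clip (so the motion trajectory stays visible), the "local" box is taken to be a per-frame box that follows the hand (so the trajectory is removed), and joints are rendered as Gaussian heat maps. The function names, the 2D toy coordinates, and the `size` and `sigma` parameters are hypothetical and are not taken from the paper.

```python
import math

def bounding_box(points):
    """Return (x_min, y_min, x_max, y_max) enclosing a list of 2D points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))

def global_box(sequence):
    """Assumed 'global' crop: one box enclosing all joints in all frames."""
    return bounding_box([joint for frame in sequence for joint in frame])

def local_boxes(sequence):
    """Assumed 'local' crop: one box per frame, following the hand."""
    return [bounding_box(frame) for frame in sequence]

def joint_heatmap(joint, box, size=8, sigma=1.0):
    """Render one joint as a size x size Gaussian heat map inside `box`."""
    x_min, y_min, x_max, y_max = box
    # Guard against degenerate (zero-width or zero-height) boxes.
    width = max(x_max - x_min, 1e-6)
    height = max(y_max - y_min, 1e-6)
    # Map the joint into pixel coordinates of the cropped window.
    cx = (joint[0] - x_min) / width * (size - 1)
    cy = (joint[1] - y_min) / height * (size - 1)
    return [
        [math.exp(-((c - cx) ** 2 + (r - cy) ** 2) / (2 * sigma ** 2))
         for c in range(size)]
        for r in range(size)
    ]

# Toy clip: a two-joint "hand" translating to the right over three frames.
sequence = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(2.0, 0.0), (3.0, 1.0)],
    [(4.0, 0.0), (5.0, 1.0)],
]

g_box = global_box(sequence)
l_boxes = local_boxes(sequence)

# The same joint in frame 1, rendered under each cropping scheme.
global_map = joint_heatmap(sequence[1][0], g_box)
local_map = joint_heatmap(sequence[1][0], l_boxes[1])
```

Under these assumptions, the heat map peak moves across the global crop as the hand translates, while in the local crop the same joint stays at the same pixel in every frame, so only the hand shape changes. Stacking the per-frame maps over time would give the kind of heat map volume a 3D CNN stream could consume.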

The authors would like to acknowledge the support from the National Natural Science Foundation of China (62006204, 52075530), the Guangdong Basic and Applied Basic Research Foundation (2022A1515011431), and the Shenzhen Science and Technology Program (RCBS20210609104516043, JSGG20210802154004014). This work is also partially supported by the AiBle project co-financed by the European Regional Development Fund.

Author information

Authors and Affiliations

School of Automation and Electrical Engineering, Shenyang Ligong University, Shenyang, 110159, China
Xiaowen You & Hongwei Gao
School of Electronics and Communication Engineering, Sun Yat-Sen University, Shenzhen, 518107, China
Qing Gao
School of Computing, University of Portsmouth, Portsmouth, PO1 3HE, UK
Zhaojie Ju

Corresponding authors

Correspondence to Qing Gao or Zhaojie Ju.

Editor information

Editors and Affiliations

Zhejiang University, Hangzhou, China
Huayong Yang
Harbin Institute of Technology, Shenzhen, China
Honghai Liu
Zhejiang University, Hangzhou, China
Jun Zou
Huazhong University of Science and Technology, Wuhan, China
Zhouping Yin
Shenyang Institute of Automation, Shenyang, Liaoning, China
Lianqing Liu
Zhejiang University, Hangzhou, China
Geng Yang
Zhejiang University, Hangzhou, China
Xiaoping Ouyang
Harbin Institute of Technology, Shenzhen, China
Zhiyong Wang

Cite this paper

You, X., Gao, Q., Gao, H., Ju, Z. (2023). A Feature Fusion Network for Skeleton-Based Gesture Recognition. In: Yang, H., et al. Intelligent Robotics and Applications. ICIRA 2023. Lecture Notes in Computer Science, vol 14268. Springer, Singapore. https://doi.org/10.1007/978-981-99-6486-4_6

DOI: https://doi.org/10.1007/978-981-99-6486-4_6
Published: 10 October 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6485-7
Online ISBN: 978-981-99-6486-4
eBook Packages: Computer Science, Computer Science (R0)
