Deep Spatio-Temporal Decision Fusion Network for Facial Expression Recognition

Chen, Xuanchi; Yang, Heng; Zhang, Xia; Zheng, Xiangwei; Li, Wei

doi:10.1007/978-3-031-20102-8_9

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13657)

Included in the following conference series:

International Conference on Machine Learning for Cyber Security

Abstract

Facial expression recognition (FER) has become a research focus in affective computing, as it already plays an important role in public security applications such as urban safety management and driving safety assistance systems. Modeling the spatiotemporal information of facial expression sequences in a targeted manner, and integrating and using it appropriately, remains challenging. In this paper, a facial expression recognition method based on a spatial-temporal decision fusion network (STDFN) is proposed. First, the facial expression sequences are divided into four sub-sequences according to face regions, and a BiLSTM is applied to each sub-sequence to extract local temporal features. This captures the local morphological features of facial expressions in more detail and makes fuller use of the temporal features of dynamic facial expressions. Then, VGG19 is used to extract the shallow spatial features of the peak expression frame, and a squeeze-and-excitation module assigns channel weights to these spatial features to obtain weighted spatial features. This allows valid spatial features to be purposefully retained to avoid overfitting. Finally, the temporal features and the spatial features are used separately to compute expression classification results, and a decision-level fusion module fuses the two results to obtain the final FER result. Extensive experiments on three FER datasets, CK+, Oulu-CASIA and MMI, show that STDFN achieves 98.83%, 89.31% and 82.86% accuracy, respectively, demonstrating that STDFN effectively improves FER recognition accuracy.
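
The final step described in the abstract, fusing the temporal-branch and spatial-branch classification results at the decision level, can be illustrated with a minimal sketch. The abstract does not state the fusion rule, so the weighted average of class probabilities used here is an assumption, not the authors' method. The function name `fuse_decisions`, the weight `alpha`, and the toy scores are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_decisions(temporal_logits, spatial_logits, alpha=0.5):
    """Weighted average of the two branches' class probabilities.

    alpha is a hypothetical weight on the temporal branch. The abstract
    does not specify the fusion rule, so this is only one plausible choice.
    """
    if len(temporal_logits) != len(spatial_logits):
        raise ValueError("both branches must score the same classes")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    p_t = softmax(temporal_logits)
    p_s = softmax(spatial_logits)
    fused = [alpha * t + (1.0 - alpha) * s for t, s in zip(p_t, p_s)]
    # Predicted class is the index with the highest fused probability.
    label = max(range(len(fused)), key=fused.__getitem__)
    return label, fused


# Toy scores for three classes (made-up numbers, not results from the paper).
temporal = [2.0, 0.5, 0.1]
spatial = [0.2, 1.5, 0.3]
label, probs = fuse_decisions(temporal, spatial, alpha=0.5)
```

With `alpha=1.0` the sketch reduces to the temporal branch alone and with `alpha=0.0` to the spatial branch alone, which makes it easy to compare each branch against the fused decision.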

Acknowledgments

This work is supported by the Natural Science Foundation of Shandong Province (No. ZR2020LZH008, ZR2021MF118, ZR2019MF071) and the Shandong Provincial Key Research and Development Program (Major Scientific and Technological Innovation Project) (No. 2021CXGC010506, No. 2021SFGC0104).

Author information

Authors and Affiliations

School of Information Science and Engineering, Shandong Normal University, Jinan, China
Xuanchi Chen, Heng Yang & Xiangwei Zheng
Shandong Provincial Key Laboratory for Distributed Computer Software Novel Technology, Jinan, China
Xuanchi Chen & Xiangwei Zheng
Internet Diagnosis and Treatment Center, Taian City Central Hospital, Taian, China
Xia Zhang
Shandong Normal University Library, Shandong Normal University, Jinan, China
Wei Li

Corresponding authors

Correspondence to Xiangwei Zheng or Wei Li.

Editor information

Editors and Affiliations

School of Computing and Informatics, University of Louisiana at Lafayette, Lafayette, LA, USA
Yuan Xu
Institute of Artificial Intelligence and Blockchain, Guangzhou University, Guangzhou, China
Hongyang Yan
Institute of Artificial Intelligence and Blockchain, Guangzhou University, Guangzhou, China
Huang Teng
Guangdong Polytechnic Normal University, Guangzhou, China
Jun Cai
Institute of Artificial Intelligence and Blockchain, Guangzhou University, Guangzhou, China
Jin Li

About this paper

Cite this paper

Chen, X., Yang, H., Zhang, X., Zheng, X., Li, W. (2023). Deep Spatio-Temporal Decision Fusion Network for Facial Expression Recognition. In: Xu, Y., Yan, H., Teng, H., Cai, J., Li, J. (eds) Machine Learning for Cyber Security. ML4CS 2022. Lecture Notes in Computer Science, vol 13657. Springer, Cham. https://doi.org/10.1007/978-3-031-20102-8_9

DOI: https://doi.org/10.1007/978-3-031-20102-8_9
Published: 13 January 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20101-1
Online ISBN: 978-3-031-20102-8
eBook Packages: Computer Science, Computer Science (R0)
