Dense Face Network: A Dense Face Detector Based on Global Context and Visual Attention Mechanism

Song, Lin; Yang, Jin-Fu; Shang, Qing-Zhen; Li, Ming-Ai

doi:10.1007/s11633-022-1327-2

Research Article
Published: 29 March 2022

Volume 19, pages 247–256 (2022)

Machine Intelligence Research

Lin Song¹ (ORCID: orcid.org/0000-0003-2289-7325), Jin-Fu Yang¹,², Qing-Zhen Shang¹ & Ming-Ai Li¹

Abstract

Face detection has made tremendous strides thanks to convolutional neural networks. However, dense face detection remains an open challenge because of large variation in face scale, tiny faces, and severe occlusion. This paper presents a robust dense face detector that uses global context and visual attention mechanisms to significantly improve detection accuracy. Specifically, a global context fusion module with top-down feedback is proposed to improve the detection of tiny faces. Moreover, a visual attention mechanism is employed to address occlusion. Experimental results on the public face datasets WIDER FACE and FDDB demonstrate the effectiveness of the proposed method.
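
The abstract does not spell out either module, so the following is only a rough, generic illustration of the two ideas it names, not the authors' architecture. It assumes that top-down feedback can be approximated by adding an upsampled coarse feature map to a finer one, and that attention can be approximated by a squeeze-and-excitation-style channel gate with no learned weights. All function names (`upsample2x`, `top_down_fuse`, `channel_gate`) and the toy data are hypothetical.

```python
import math


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def top_down_fuse(fine, coarse):
    """Add the upsampled coarse (context) map to the fine map element-wise.

    Assumption: `fine` is exactly twice the height and width of `coarse`.
    """
    up = upsample2x(coarse)
    return [[f + u for f, u in zip(frow, urow)] for frow, urow in zip(fine, up)]


def channel_gate(channels):
    """Scale each channel by the sigmoid of its global average.

    A real squeeze-and-excitation block passes the pooled vector through
    learned fully connected layers; they are omitted here for brevity.
    """
    gated = []
    for ch in channels:
        mean = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        weight = 1.0 / (1.0 + math.exp(-mean))
        gated.append([[weight * v for v in row] for row in ch])
    return gated


# Toy data: a 4x4 fine map and a 2x2 coarse map (hypothetical values).
fine = [[1.0] * 4 for _ in range(4)]
coarse = [[10.0, 20.0], [30.0, 40.0]]
fused = top_down_fuse(fine, coarse)

# One channel whose global average is zero, so its gate weight is 0.5.
gated = channel_gate([[[1.0, -1.0], [2.0, -2.0]]])
```

In this toy run, each fine-map cell receives the value of the coarse cell that covers it, which is one simple way coarse context can reach the locations where tiny faces would appear; the gate then rescales a whole channel according to its pooled response.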

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61973009).

Author information

Authors and Affiliations

Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
Lin Song, Jin-Fu Yang, Qing-Zhen Shang & Ming-Ai Li
Beijing Key Laboratory of Computational Intelligence and Intelligent Systems, Beijing, 100124, China
Jin-Fu Yang

Corresponding author

Correspondence to Lin Song.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Lin Song received the B.Sc. degree in measurement and control technology and instrumentation from Yantai University, China in 2019. She is now a master's student in the Department of Control Science and Engineering, Beijing University of Technology, China.

Her research interests include deep learning and computer vision.

Jin-Fu Yang received the Ph.D. degree in pattern recognition and intelligent systems from the National Laboratory of Pattern Recognition, Chinese Academy of Sciences, China in 2006. He is now a professor with the Faculty of Information Technology and the Beijing Key Laboratory of Computational Intelligence and Intelligent Systems, Beijing University of Technology, China.

His research interests include pattern recognition, computer vision and robot navigation.

Qing-Zhen Shang received the M.Eng. degree in mathematics from Hebei University, China in 2017. She is a Ph.D. candidate in the Department of Control Science and Engineering, Beijing University of Technology, China.

Her research interests include deep learning and computer vision.

Ming-Ai Li received the Ph.D. degree from Beijing University of Technology, China in 2006. She is now a professor with the Faculty of Information Technology and the Beijing Key Laboratory of Computational Intelligence and Intelligent Systems, Beijing University of Technology, China.

Her research interests include brain-computer interfaces, intelligent control, pattern recognition, and the implementation of autonomous learning control technology for flexible two-wheeled upright robots.

About this article

Cite this article

Song, L., Yang, JF., Shang, QZ. et al. Dense Face Network: A Dense Face Detector Based on Global Context and Visual Attention Mechanism. Mach. Intell. Res. 19, 247–256 (2022). https://doi.org/10.1007/s11633-022-1327-2

Received: 30 August 2021
Accepted: 01 March 2022
Published: 29 March 2022
Issue Date: June 2022
DOI: https://doi.org/10.1007/s11633-022-1327-2
