HAPNet: a head-aware pedestrian detection network associated with the affinity field

Ding, Jiali; Liu, Tie; Zhao, Yun; Yuan, Zejian; Shang, Yuanyuan

doi:10.1007/s11432-021-3300-2

Research Paper
Published: 07 April 2022

Volume 65, article number 160102, (2022)
Science China Information Sciences

Abstract

Pedestrian detection has made great progress with the rapid development of deep learning, but detection under occlusion remains a challenge. Some prior work addresses occlusion by relying on visible parts, such as the body, but there is still substantial room for improvement because these approaches introduce additional computation. To improve detection performance under occlusion, this paper proposes a head-aware pedestrian detection network (HAPNet) that exploits the inherent structural relationship between the human body and head. The postprocessing stage is redesigned to include a scoring module and an augmented non-maximum suppression (NMS) algorithm. Specifically, HAPNet detects the head and body simultaneously through different layers of a feature map. We then propose a head-side affinity model, which represents the association between the head and body sides. Detection and affinity prediction are implemented as different branches of HAPNet. In the scoring module, the head and body detection scores are fused to match head-body pairs and improve detection performance. On this basis, an enhanced NMS algorithm is proposed that balances reducing false positives against avoiding missed detections. The experimental results verify the effectiveness of this method for pedestrian detection under occlusion.
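The abstract does not state the fusion rule or the suppression criterion, so the following Python sketch is only one plausible reading of the postprocessing idea, not the authors' method. It assumes each candidate already has a matched body box, head box, and per-part scores. The weighted-average fusion, the function names `fuse_scores` and `head_aware_nms`, and both thresholds are hypothetical choices made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_scores(body_score: float, head_score: float, weight: float = 0.5) -> float:
    """Hypothetical fusion: weighted average of body and head confidences."""
    return (1.0 - weight) * body_score + weight * head_score


def head_aware_nms(
    bodies: List[Box],
    heads: List[Box],
    scores: List[float],
    body_thr: float = 0.5,  # assumed threshold, not from the paper
    head_thr: float = 0.3,  # assumed threshold, not from the paper
) -> List[int]:
    """Greedy NMS that suppresses a candidate only when BOTH its body box
    and its head box overlap an already kept detection."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        suppressed = any(
            iou(bodies[i], bodies[k]) > body_thr
            and iou(heads[i], heads[k]) > head_thr
            for k in keep
        )
        if not suppressed:
            keep.append(i)
    return keep


# Two pedestrians whose bodies overlap heavily but whose heads do not,
# plus an exact duplicate of the first pedestrian.
bodies = [(0, 0, 10, 20), (2, 0, 12, 20), (0, 0, 10, 20)]
heads = [(3, 0, 7, 4), (8, 0, 12, 4), (3, 0, 7, 4)]
scores = [fuse_scores(0.9, 0.8), fuse_scores(0.8, 0.7), fuse_scores(0.6, 0.5)]
keep = head_aware_nms(bodies, heads, scores)  # keep == [0, 1]
```

In this toy case, body-only NMS at an IoU threshold of 0.5 would discard the second pedestrian, since the two body boxes have an IoU of about 0.67. Requiring head overlap as well keeps both people and still removes the duplicate, which illustrates the trade-off between false positives and missed detections that the abstract describes.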

Acknowledgements

This work was supported by the Beijing Natural Science Foundation (Grant No. L201022) and in part by the National Natural Science Foundation of China (Grant Nos. 61876112, 61976170).

Author information

Authors and Affiliations

College of Information Engineering, Capital Normal University, Beijing, 100048, China
Jiali Ding, Tie Liu & Yuanyuan Shang
College of Artificial Intelligence, Xi’an Jiaotong University, Xi’an, 710049, China
Yun Zhao & Zejian Yuan

Corresponding authors

Correspondence to Tie Liu or Zejian Yuan.

About this article

Cite this article

Ding, J., Liu, T., Zhao, Y. et al. HAPNet: a head-aware pedestrian detection network associated with the affinity field. Sci. China Inf. Sci. 65, 160102 (2022). https://doi.org/10.1007/s11432-021-3300-2

Received: 25 March 2021
Revised: 20 May 2021
Accepted: 30 June 2021
Published: 07 April 2022
DOI: https://doi.org/10.1007/s11432-021-3300-2
