HAPNet: a head-aware pedestrian detection network associated with the affinity field

  • Research Paper
  • Science China Information Sciences

Abstract

Pedestrian detection has made great progress with the rapid development of deep learning, but detection under occlusion remains a challenge. To address occlusion, several approaches rely on visible parts, such as the body, but they leave substantial room for improvement because they introduce additional computation. To improve detection performance under occlusion, this paper proposes a head-aware pedestrian detection network (HAPNet) that exploits the inherent structural relationship between the human body and head. The postprocessing stage is redesigned to include a scoring module and an augmented non-maximum suppression (NMS) algorithm. Specifically, HAPNet detects the head and body simultaneously through different layers of a feature map. We then propose a head-side affinity model that represents the association between the head and the body sides. The detection and affinity prediction tasks are implemented through different branches of HAPNet. In the scoring module, the head and body detection scores are fused to match head-body pairs and thereby improve detection performance. On this basis, an enhanced NMS algorithm is proposed that achieves a good balance between reducing false positives and reducing missed detections. The experimental results verify the effectiveness of this method for pedestrian detection under occlusion.
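The abstract does not give the formulas for the scoring module or the enhanced NMS, so the following is only a minimal sketch of the general idea of fusing head and body scores before suppression. The convex-combination rule in `fuse_scores`, its `head_weight` parameter, and the use of standard greedy NMS are assumptions made for illustration; they are not HAPNet's actual method.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_scores(body_score: float, head_score: float,
                head_weight: float = 0.5) -> float:
    """Hypothetical fusion rule (assumption): convex combination of scores."""
    return (1.0 - head_weight) * body_score + head_weight * head_score


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Standard greedy NMS (not the paper's enhanced variant).

    Returns the indices of the kept boxes, highest score first.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap strongly with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Toy example: two heavily overlapping body boxes and one separate box.
body_boxes = [(0, 0, 10, 20), (1, 0, 11, 20), (30, 0, 40, 20)]
body_scores = [0.9, 0.8, 0.7]
head_scores = [0.2, 0.9, 0.6]  # score of the head matched to each body

fused = [fuse_scores(b, h) for b, h in zip(body_scores, head_scores)]
kept = greedy_nms(body_boxes, fused)
print(kept)
```

In this toy input, the second box has the lower body score but a confident matched head, so after fusion it outranks the first box and survives suppression. This shows how head evidence can change which of two overlapping candidates is retained.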


Acknowledgements

This work was supported by the Beijing Natural Science Foundation (Grant No. L201022) and in part by the National Natural Science Foundation of China (Grant Nos. 61876112, 61976170).

Author information

Corresponding authors

Correspondence to Tie Liu or Zejian Yuan.

About this article

Cite this article

Ding, J., Liu, T., Zhao, Y. et al. HAPNet: a head-aware pedestrian detection network associated with the affinity field. Sci. China Inf. Sci. 65, 160102 (2022). https://doi.org/10.1007/s11432-021-3300-2

  • DOI: https://doi.org/10.1007/s11432-021-3300-2
