
PSS: Point Semantic Saliency for 3D Object Detection

  • Conference paper

Artificial Intelligence (CICAI 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13069)

Abstract

Efficient fusion of LiDAR and camera data for 3D object detection is a challenging task. Although RGB images provide rich texture and semantic features, some of these features are unrelated to the targeted objects and are useless, or even misleading, for the detection task. In this paper, point semantic saliency (PSS) is proposed for precise fusion. In this scheme, a physical receptive field (PRF) constraint is built to relate 2D and 3D receptive fields through the camera projection model. To increase the saliency of pixels belonging to targeted objects, PSS extracts salient point features under the guidance of RGB and semantic segmentation images, providing 2D supplementary information for 3D detection. Comparison results and ablation studies demonstrate that PSS improves detection performance in both localization and classification. Among current single-stage detectors, our method improves AP by \(0.62\%\) at the hard difficulty level and mAP by \(0.31\%\).

This work is supported by the National Natural Science Foundation of China (61991412).
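
The abstract's two ingredients map onto standard operations: the PRF constraint ties a 3D point to a 2D receptive field through the pinhole camera projection, and PSS samples per-point saliency from a 2D semantic segmentation map to guide the point features. Below is a minimal NumPy sketch of both steps. The KITTI-style calibration matrices (Tr_velo_to_cam, R0_rect, P2), the function names, and the multiplicative re-weighting rule with its alpha parameter are illustrative assumptions, not the paper's exact formulation.

    import numpy as np

    def project_lidar_to_image(pts_lidar, Tr_velo_to_cam, R0_rect, P2):
        # Project LiDAR points (N, 3) into the image plane using KITTI-style
        # calibration: Tr_velo_to_cam (3x4), R0_rect (3x3), P2 (3x4).
        n = pts_lidar.shape[0]
        pts_h = np.hstack([pts_lidar, np.ones((n, 1))])    # homogeneous (N, 4)
        pts_rect = R0_rect @ (Tr_velo_to_cam @ pts_h.T)    # rectified camera frame (3, N)
        uvw = P2 @ np.vstack([pts_rect, np.ones((1, n))])  # image plane (3, N)
        uv = (uvw[:2] / uvw[2]).T                          # pixel coordinates (N, 2)
        in_front = pts_rect[2] > 0                         # mask points behind the camera
        return uv, in_front

    def pss_decorate(pts_lidar, pt_feats, seg_scores, calib, alpha=1.0):
        # Re-weight per-point features (N, C) by the semantic saliency of
        # the pixel each point projects to; seg_scores is an (H, W)
        # foreground-probability map from a 2D segmentation network.
        h, w = seg_scores.shape
        uv, in_front = project_lidar_to_image(pts_lidar, *calib)
        u = np.clip(np.round(uv[:, 0]).astype(int), 0, w - 1)
        v = np.clip(np.round(uv[:, 1]).astype(int), 0, h - 1)
        saliency = seg_scores[v, u] * in_front             # zero for points behind the camera
        # Hypothetical fusion rule: amplify features of points landing on
        # salient (object) pixels. The paper's actual weighting may differ.
        return pt_feats * (1.0 + alpha * saliency)[:, None]

In a full detector the decorated features would replace the raw point features fed to the 3D backbone; the sketch only shows how image-space saliency can be attached to each LiDAR point.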



Author information

Correspondence to Jie Ma.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Cen, J., An, P., Chen, G., Liang, J., Ma, J. (2021). PSS: Point Semantic Saliency for 3D Object Detection. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science (LNAI), vol 13069. Springer, Cham. https://doi.org/10.1007/978-3-030-93046-2_35


  • DOI: https://doi.org/10.1007/978-3-030-93046-2_35

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93045-5

  • Online ISBN: 978-3-030-93046-2

  • eBook Packages: Computer Science, Computer Science (R0)
