Abstract
Efficient fusion of LiDAR and camera data for 3D object detection is a challenging task. Although RGB images provide rich texture and semantic features, many of these features are unrelated to the target objects and are therefore useless, or even misleading, for the detection task. In this paper, point semantic saliency (PSS) is proposed for precise fusion. In this scheme, a physical receptive field (PRF) constraint is built to relate the 2D and 3D receptive fields through the camera projection model. To increase the saliency of pixels belonging to target objects, PSS extracts salient point features under the guidance of RGB and semantic segmentation images, which provides 2D supplementary information for 3D detection. Comparison results and ablation studies demonstrate that PSS improves detection performance in both localization and classification. Among current single-stage detectors, our method improves AP by \(0.62\%\) at the hard difficulty level and mAP by \(0.31\%\).
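The PRF constraint rests on the standard pinhole camera projection model that maps 3D LiDAR points to 2D pixels. A minimal sketch of that projection is below; the intrinsic matrix `K` and the identity extrinsics `R`, `t` are illustrative placeholders, not the paper's calibration.

```python
import numpy as np

def project_to_image(points_lidar, K, R, t):
    """Project (N, 3) LiDAR-frame points to (N, 2) pixel coordinates."""
    pts_cam = points_lidar @ R.T + t          # LiDAR frame -> camera frame
    pts_img = pts_cam @ K.T                   # apply camera intrinsics
    return pts_img[:, :2] / pts_img[:, 2:3]   # perspective divide by depth

# Illustrative intrinsics (focal length 700 px, principal point (620, 190))
K = np.array([[700.0,   0.0, 620.0],
              [  0.0, 700.0, 190.0],
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.zeros(3)                 # identity extrinsics for the demo

# A point 10 m ahead on the optical axis projects to the principal point.
uv = project_to_image(np.array([[0.0, 0.0, 10.0]]), K, R, t)
print(uv)
```

Once each 3D point has a pixel coordinate, image-domain cues (RGB values, semantic segmentation scores) can be gathered at those coordinates and attached to the point features, which is the kind of 2D-to-3D association the fusion scheme relies on.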
This work is supported by the National Natural Science Foundation of China (61991412).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Cen, J., An, P., Chen, G., Liang, J., Ma, J. (2021). PSS: Point Semantic Saliency for 3D Object Detection. In: Fang, L., Chen, Y., Zhai, G., Wang, J., Wang, R., Dong, W. (eds) Artificial Intelligence. CICAI 2021. Lecture Notes in Computer Science, vol 13069. Springer, Cham. https://doi.org/10.1007/978-3-030-93046-2_35
Print ISBN: 978-3-030-93045-5
Online ISBN: 978-3-030-93046-2