Abstract
To reconstruct 3D models of specific targets in real time, this paper proposes a 3D reconstruction algorithm based on multi-sensor data fusion. The network takes a camera image and lidar point-cloud data as inputs, processes each modality separately in an RGB channel and a lidar channel, and fuses their outputs into a dense depth map of the targets. In the RGB channel, a transformer network rather than a CNN (convolutional neural network) extracts multi-scale image features with a global receptive field at high resolution, and produces a monocular depth estimate, a guidance map, and a semantic segmentation. In the lidar channel, the sparse lidar data is fused with the guidance map to generate the final dense depth prediction. In testing, the algorithm achieved a high ranking on the leaderboard. In application, at equal reconstruction quality, the proposed method reconstructs 3D models five times faster than the traditional image-based method.
Y. Ma (1981) is an associate professor at Beihang University, researching M&S (modeling and simulation) theory and practice and intelligent behavior modeling.
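The abstract describes a two-branch pipeline: a transformer-based RGB channel that predicts monocular depth, a guidance map, and a semantic segmentation, and a lidar channel that fuses sparse lidar depth with the guidance map into a dense depth map. The following PyTorch sketch is a minimal, hypothetical illustration of that data flow only; the module names, the small convolutional encoder standing in for the transformer backbone, and the guidance-weighted blending rule are assumptions of this sketch, not the authors' implementation.

import torch
import torch.nn as nn

class RGBChannel(nn.Module):
    """Stand-in for the transformer branch: maps an RGB image to monocular
    depth, a guidance map, and semantic logits (hypothetical heads)."""
    def __init__(self, num_classes: int = 19, dim: int = 64):
        super().__init__()
        # A real implementation would use a vision-transformer backbone;
        # a tiny conv encoder keeps this sketch self-contained.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        self.depth_head = nn.Conv2d(dim, 1, 1)          # monocular depth
        self.guidance_head = nn.Conv2d(dim, 1, 1)       # guidance map
        self.seg_head = nn.Conv2d(dim, num_classes, 1)  # semantic logits

    def forward(self, rgb):
        feat = self.encoder(rgb)
        return (self.depth_head(feat),
                torch.sigmoid(self.guidance_head(feat)),
                self.seg_head(feat))

class LidarChannel(nn.Module):
    """Fuses sparse lidar depth with the guidance map to predict dense depth."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, 1, 1),
        )

    def forward(self, sparse_depth, mono_depth, guidance):
        # Where lidar returns are valid, trust them; elsewhere fall back on
        # the guidance-weighted monocular estimate, then refine the blend.
        # (This blending rule is an assumption, not the paper's fusion.)
        valid = (sparse_depth > 0).float()
        blended = valid * sparse_depth + (1 - valid) * guidance * mono_depth
        return self.refine(torch.cat([blended, sparse_depth, guidance], dim=1))

# Usage on dummy data: a 3xHxW image and a projected sparse lidar depth map.
rgb = torch.rand(1, 3, 128, 256)
sparse = torch.rand(1, 1, 128, 256) * (torch.rand(1, 1, 128, 256) > 0.95)
mono, guide, seg = RGBChannel()(rgb)
dense = LidarChannel()(sparse, mono, guide)
print(dense.shape)  # torch.Size([1, 1, 128, 256])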
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Zhou, Y., Lv, J., Ma, Y., Ma, X. (2022). A 3D Reconstruction Network Based on Multi-sensor. In: Fan, W., Zhang, L., Li, N., Song, X. (eds) Methods and Applications for Modeling and Simulation of Complex Systems. AsiaSim 2022. Communications in Computer and Information Science, vol 1712. Springer, Singapore. https://doi.org/10.1007/978-981-19-9198-1_44