Abstract
Depth completion aims to predict a dense depth map from a sparse depth input. Acquiring dense ground-truth annotations for depth completion is difficult, and at the same time a significant domain gap between real LiDAR measurements and synthetic data has prevented successful training of models in virtual settings. We propose a domain adaptation approach for sparse-to-dense depth completion that is trained on synthetic data, without annotations in the real domain or additional sensors. Our approach simulates the real sensor noise in an RGB + LiDAR set-up and consists of three modules: simulating the real LiDAR input in the synthetic domain via projections, filtering the real noisy LiDAR for supervision, and adapting the synthetic RGB image with a CycleGAN approach. We extensively evaluate these modules against the state of the art on the KITTI depth completion benchmark and show significant improvements.
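The first module named in the abstract, simulating the real LiDAR input in the synthetic domain via projections, can be illustrated with a short sketch. The Python code below is a minimal illustration under assumed inputs (a dense synthetic depth map and pinhole intrinsics), not the authors' implementation: it back-projects valid pixels to 3D, quantises them onto the ring/azimuth firing pattern of a hypothetical 64-beam spinning scanner, and keeps one return per cell, yielding a KITTI-like sparse depth map.

    import numpy as np

    def simulate_sparse_lidar(dense_depth, K, n_rings=64, n_azimuth=2000):
        # Hypothetical sketch of LiDAR simulation by projection.
        # Assumed inputs: dense_depth is an (H, W) metric depth map from
        # the simulator; K is the (3, 3) pinhole intrinsic matrix.
        v, u = np.nonzero(dense_depth > 0)
        z = dense_depth[v, u]

        # Back-project valid pixels to 3D camera coordinates.
        x = (u - K[0, 2]) * z / K[0, 0]
        y = (v - K[1, 2]) * z / K[1, 1]

        # Spherical angles: elevation selects the laser ring,
        # azimuth the firing slot within a sweep.
        r = np.sqrt(x**2 + y**2 + z**2)
        elevation = np.arcsin(y / r)
        azimuth = np.arctan2(x, z)

        # Quantise both angles onto the discrete scan pattern.
        ring = np.digitize(elevation,
                           np.linspace(elevation.min(), elevation.max(), n_rings))
        slot = np.digitize(azimuth,
                           np.linspace(azimuth.min(), azimuth.max(), n_azimuth))

        # Keep a single return per (ring, slot) cell to emulate beams.
        _, keep = np.unique(ring * (n_azimuth + 2) + slot, return_index=True)

        sparse = np.zeros_like(dense_depth)
        sparse[v[keep], u[keep]] = z[keep]
        return sparse

The resulting sparse map would stand in for a real Velodyne scan when training on synthetic data; a real pipeline would also model sensor noise on the kept returns, which is the converse of the second module, filtering noise out of the real LiDAR before using it as supervision.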
Acknowledgements
This research was supported by the UK EPSRC IPALM project EP/S032398/1.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Lopez-Rodriguez, A., Busam, B., Mikolajczyk, K. (2021). Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data. In: Ishikawa, H., Liu, C.L., Pajdla, T., Shi, J. (eds.) Computer Vision – ACCV 2020. Lecture Notes in Computer Science, vol. 12622. Springer, Cham. https://doi.org/10.1007/978-3-030-69525-5_20
DOI: https://doi.org/10.1007/978-3-030-69525-5_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69524-8
Online ISBN: 978-3-030-69525-5
eBook Packages: Computer Science, Computer Science (R0)