Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data

  • Conference paper
Computer Vision – ACCV 2020 (ACCV 2020)

Abstract

Depth completion aims to predict a dense depth map from a sparse depth input. Dense ground-truth annotations for depth completion are difficult to acquire, and a significant domain gap between real LiDAR measurements and synthetic data has so far prevented the successful training of models in virtual settings. We propose a domain adaptation approach for sparse-to-dense depth completion that is trained on synthetic data, without annotations in the real domain or additional sensors. Our approach simulates the real sensor noise in an RGB + LiDAR set-up and consists of three modules: simulating the real LiDAR input in the synthetic domain via projections, filtering the noisy real LiDAR for supervision, and adapting the synthetic RGB image using a CycleGAN approach. We extensively evaluate these modules against the state of the art on the KITTI depth completion benchmark, showing significant improvements.
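The first module, simulating a LiDAR-like input through projections, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `simulate_sparse_lidar`, the shared pinhole intrinsics `K`, the pure-translation offset `t_lidar_to_cam`, and the boolean sampling mask `keep_mask` are all assumptions made for illustration. The sketch reprojects a depth map rendered at a hypothetical virtual LiDAR pose into the camera view without any occlusion check, which is one plausible way that projection artifacts could enter a synthetic sparse input.

```python
import numpy as np

def simulate_sparse_lidar(depth_lidar, K, t_lidar_to_cam, keep_mask):
    """Reproject depth seen from a virtual LiDAR pose into the camera image.

    Assumptions (illustrative only): both views share the intrinsics K,
    the two poses differ by a pure translation, and keep_mask is a boolean
    array selecting which pixels act as LiDAR returns.
    """
    h, w = depth_lidar.shape
    v, u = np.nonzero(keep_mask & (depth_lidar > 0))
    z = depth_lidar[v, u]

    # Back-project the selected pixels to 3D points in the LiDAR frame.
    pix = np.stack([u, v, np.ones_like(u)], axis=0).astype(np.float64)
    pts_lidar = (np.linalg.inv(K) @ pix) * z

    # Move the points into the camera frame (translation only).
    pts_cam = pts_lidar + np.asarray(t_lidar_to_cam, dtype=np.float64).reshape(3, 1)
    in_front = pts_cam[2] > 0
    pts_cam = pts_cam[:, in_front]

    # Project into the camera image and round to the nearest pixel.
    proj = K @ pts_cam
    u_c = np.round(proj[0] / proj[2]).astype(int)
    v_c = np.round(proj[1] / proj[2]).astype(int)
    inside = (u_c >= 0) & (u_c < w) & (v_c >= 0) & (v_c < h)

    # No z-buffer: points hidden from the camera are still written, and
    # pixel collisions are resolved by whichever point is written last.
    sparse = np.zeros((h, w), dtype=np.float64)
    sparse[v_c[inside], u_c[inside]] = pts_cam[2, inside]
    return sparse

# Usage with made-up values: a flat wall 10 m away, every fourth row sampled.
K = np.array([[100.0, 0.0, 32.0], [0.0, 100.0, 24.0], [0.0, 0.0, 1.0]])
dense = np.full((48, 64), 10.0)
rows = np.zeros((48, 64), dtype=bool)
rows[::4] = True
sparse_input = simulate_sparse_lidar(dense, K, np.array([0.5, 0.0, 0.0]), rows)
```

Zeros in the returned map mark pixels without a measurement. A real scan pattern, sensor rotation, and dropout behaviour would need to be modelled separately.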

Acknowledgements

This research was supported by UK EPSRC IPALM project EP/S032398/1.

Author information

Corresponding author

Correspondence to Adrian Lopez-Rodriguez.

Electronic supplementary material

Supplementary material 1 (PDF, 209 KB)

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Lopez-Rodriguez, A., Busam, B., Mikolajczyk, K. (2021). Project to Adapt: Domain Adaptation for Depth Completion from Noisy and Sparse Sensor Data. In: Ishikawa, H., Liu, CL., Pajdla, T., Shi, J. (eds) Computer Vision – ACCV 2020. ACCV 2020. Lecture Notes in Computer Science, vol 12622. Springer, Cham. https://doi.org/10.1007/978-3-030-69525-5_20

  • DOI: https://doi.org/10.1007/978-3-030-69525-5_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-69524-8

  • Online ISBN: 978-3-030-69525-5

  • eBook Packages: Computer Science, Computer Science (R0)
