
Learning Non-volumetric Depth Fusion using Successive Reprojections


Given a set of input views, multi-view stereopsis techniques estimate depth maps to represent the 3D reconstruction of the scene; these are fused into a single, consistent reconstruction -- most often a point cloud. In this work we propose to learn an auto-regressive depth refinement directly from data. While deep learning has improved the accuracy and speed of depth estimation significantly, learned MVS techniques remain limited to the plane-sweeping paradigm. We refine a set of input depth maps by successively reprojecting information from neighbouring views to leverage multi-view constraints. Compared to learning-based volumetric fusion techniques, an image-based representation allows significantly more detailed reconstructions; compared to traditional point-based techniques, our method learns noise suppression and surface completion in a data-driven fashion. Due to the limited availability of high-quality reconstruction datasets with ground truth, we introduce two novel synthetic datasets to (pre-)train our network. Our approach improves both the output depth maps and the reconstructed point cloud, for both learned and traditional depth estimation front-ends, on both synthetic and real data.
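The geometric operation underlying the abstract's "successive reprojections" is warping a neighbouring view's depth map into the reference camera. The following is a minimal NumPy sketch of that warping step only, not the authors' implementation; the function and variable names (reproject_depth, K_nbr, etc.) are illustrative assumptions, and extrinsics are assumed to map world to camera coordinates (x_c = R x_w + t).

import numpy as np

def reproject_depth(depth_nbr, K_nbr, R_nbr, t_nbr, K_ref, R_ref, t_ref, shape_ref):
    """Warp a neighbouring depth map into the reference view (z-buffered)."""
    h, w = depth_nbr.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], -1).reshape(-1, 3).T.astype(float)  # 3 x N
    d = depth_nbr.reshape(-1)
    keep = d > 0                        # skip pixels without a depth estimate
    pix, d = pix[:, keep], d[keep]

    # Unproject to 3D in the neighbour camera frame, then move the points
    # into the reference camera frame via the world frame.
    pts_nbr = np.linalg.inv(K_nbr) @ pix * d
    pts_world = R_nbr.T @ (pts_nbr - t_nbr.reshape(3, 1))
    pts_ref = R_ref @ pts_world + t_ref.reshape(3, 1)

    # Project into the reference image; keep points in front of the camera.
    z = pts_ref[2]
    front = z > 1e-6
    proj = K_ref @ pts_ref
    u_ref = np.round(proj[0, front] / z[front]).astype(int)
    v_ref = np.round(proj[1, front] / z[front]).astype(int)
    z = z[front]

    hr, wr = shape_ref
    inside = (u_ref >= 0) & (u_ref < wr) & (v_ref >= 0) & (v_ref < hr)
    u_ref, v_ref, z = u_ref[inside], v_ref[inside], z[inside]

    # Z-buffer: where several points land on the same pixel, keep the nearest.
    out = np.full((hr, wr), np.inf)
    np.minimum.at(out, (v_ref, u_ref), z)
    out[np.isinf(out)] = 0.0            # 0 marks pixels receiving no depth
    return out

In the paper, reprojected information from several neighbouring views is fed successively to a refinement network that learns noise suppression and surface completion; the sketch above covers only the geometric warping that makes those multi-view constraints available in the reference view.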

Author(s): Simon Donné and Andreas Geiger
Book Title: Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)
Year: 2019
Month: June
Bibtex Type: Conference Paper (inproceedings)
Event Name: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2019
Event Place: Long Beach, USA

BibTeX

@inproceedings{Donne2019CVPR,
  title = {Learning Non-volumetric Depth Fusion using Successive Reprojections},
  booktitle = {Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR)},
  abstract = {Given a set of input views, multi-view stereopsis techniques estimate depth maps to represent the 3D reconstruction of the scene; these are fused into a single, consistent reconstruction -- most often a point cloud. In this work we propose to learn an auto-regressive depth refinement directly from data. While deep learning has improved the accuracy and speed of depth estimation significantly, learned MVS techniques remain limited to the plane-sweeping paradigm. We refine a set of input depth maps by successively reprojecting information from neighbouring views to leverage multi-view constraints. Compared to learning-based volumetric fusion techniques, an image-based representation allows significantly more detailed reconstructions; compared to traditional point-based techniques, our method learns noise suppression and surface completion in a data-driven fashion. Due to the limited availability of high-quality reconstruction datasets with ground truth, we introduce two novel synthetic datasets to (pre-)train our network. Our approach improves both the output depth maps and the reconstructed point cloud, for both learned and traditional depth estimation front-ends, on both synthetic and real data.},
  month = jun,
  year = {2019},
  slug = {donne2019cvpr},
  author = {Donn{\'e}, Simon and Geiger, Andreas},
  month_numeric = {6}
}