
Joint 3D Object and Layout Inference from a Single RGB-D Image


Inferring 3D objects and the layout of indoor scenes from a single RGB-D image captured with a Kinect camera is a challenging task. Towards this goal, we propose a high-order graphical model and jointly reason about the layout, objects and superpixels in the image. In contrast to existing holistic approaches, our model leverages detailed 3D geometry using inverse graphics and explicitly enforces occlusion and visibility constraints for respecting scene properties and projective geometry. We cast the task as MAP inference in a factor graph and solve it efficiently using message passing. We evaluate our method with respect to several baselines on the challenging NYUv2 indoor dataset using 21 object categories. Our experiments demonstrate that the proposed method is able to infer scenes with a large degree of clutter and occlusions.
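The computational core stated in the abstract is MAP inference in a factor graph solved by message passing. The sketch below is an illustration only, not the paper's implementation: it runs max-sum (log-domain max-product) message passing on a toy factor graph with a single layout variable and a single object variable; the label counts and random potentials are invented purely for this example.

# Illustrative sketch (not the paper's code): MAP inference via max-sum
# message passing on a tiny tree-structured factor graph.
import numpy as np

# Two discrete variables: a "layout" hypothesis and an "object" hypothesis.
n_layout, n_object = 3, 4

rng = np.random.default_rng(0)
theta_layout = rng.normal(size=n_layout)            # unary log-potentials over layout labels
theta_object = rng.normal(size=n_object)            # unary log-potentials over object labels
theta_pair = rng.normal(size=(n_layout, n_object))  # pairwise layout-object compatibility

# Message from the pairwise factor to the object variable,
# after absorbing the layout unary (max over layout labels).
msg_to_object = np.max(theta_layout[:, None] + theta_pair, axis=0)   # shape (n_object,)
object_map = int(np.argmax(theta_object + msg_to_object))

# Back-track the maximizing layout label given the chosen object label.
layout_map = int(np.argmax(theta_layout + theta_pair[:, object_map]))

# Sanity check against brute-force MAP over the joint score.
joint = theta_layout[:, None] + theta_object[None, :] + theta_pair
assert np.isclose(joint[layout_map, object_map], joint.max())
print("MAP assignment:", {"layout": layout_map, "object": object_map})

On a tree-structured toy graph like this one, the messages recover the exact MAP assignment; the paper's high-order model over layout, objects and superpixels, with its occlusion and visibility constraints, is considerably richer, and its message-passing scheme is correspondingly more involved.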

Award: Best Paper Award
Author(s): Andreas Geiger and Chaohui Wang
Book Title: German Conference on Pattern Recognition (GCPR)
Volume: 9358
Pages: 183--195
Year: 2015
Series: Lecture Notes in Computer Science
Publisher: Springer International Publishing
Bibtex Type: Conference Paper (inproceedings)
DOI: 10.1007/978-3-319-24947-6_15
Event Place: Aachen
Award Paper: Best Paper Award
Electronic Archiving: grant_archive
ISBN: 978-3-319-24946-9

BibTeX

@inproceedings{Geiger2015GCPR,
  title = {Joint 3D Object and Layout Inference from a Single RGB-D Image},
  award_paper = {Best Paper Award},
  booktitle = {German Conference on Pattern Recognition (GCPR)},
  abstract = {Inferring 3D objects and the layout of indoor scenes from a single RGB-D image captured with a Kinect camera is a challenging task. Towards this goal, we propose a high-order graphical model and jointly reason about the layout, objects and superpixels in the image. In contrast to existing holistic approaches, our model leverages detailed 3D geometry using inverse graphics and explicitly enforces occlusion and visibility constraints for respecting scene properties and projective geometry. We cast the task as MAP inference in a factor graph and solve it efficiently using message passing. We evaluate our method with respect to several baselines on the challenging NYUv2 indoor dataset using 21 object categories. Our experiments demonstrate that the proposed method is able to infer scenes with a large degree of clutter and occlusions.},
  volume = {9358},
  pages = {183--195},
  series = {Lecture Notes in Computer Science},
  publisher = {Springer International Publishing},
  year = {2015},
  slug = {geiger2015gcpr},
  author = {Geiger, Andreas and Wang, Chaohui}
}