Autonomous Motion Article 2018

ClusterNet: Instance Segmentation in RGB-D Images


We propose a method for instance-level segmentation that uses RGB-D data as input and provides detailed information about the location, geometry and number of individual objects in the scene. This level of understanding is fundamental for autonomous robots. It enables safe and robust decision-making under the large uncertainty of the real world. In our model, we propose to use the first and second order moments of the object occupancy function to represent an object instance. We train an hourglass Deep Neural Network (DNN) where each pixel in the output votes for the 3D position of the corresponding object center and for the object's size and pose. The final instance segmentation is achieved through clustering in the space of moments. The object-centric training loss is defined on the output of the clustering. Our method outperforms the state-of-the-art instance segmentation method on our synthesized dataset. We show that our method generalizes well to real-world data, achieving visually better segmentation results.
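To make the moment-based object representation more concrete, here is a minimal sketch of how first and second order moments can be computed for one object. It assumes the occupancy function is approximated by a discrete set of 3D points, which the abstract does not specify; the function name occupancy_moments and the sample points are hypothetical and not taken from the paper. The first moment is the centroid (object center) and the second central moment is the 3x3 covariance, which captures size and orientation.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def occupancy_moments(points: List[Point]) -> Tuple[List[float], List[List[float]]]:
    """Return (first moment, second central moment) of a set of 3D points.

    Assumption: the object's occupancy is approximated by these points.
    """
    if not points:
        raise ValueError("need at least one point")
    n = len(points)
    # First-order moment: the centroid of the points.
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    # Second-order central moment: the 3x3 covariance matrix.
    cov = [
        [sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / n
         for j in range(3)]
        for i in range(3)
    ]
    return mean, cov


# Hypothetical points on a flat 2 x 1 rectangle at depth z = 1.
box_points = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0), (0.0, 1.0, 1.0), (2.0, 1.0, 1.0)]
center, spread = occupancy_moments(box_points)
```

In this sketch, center plays the role of the object position that each pixel would vote for, and spread summarizes extent and pose. How the network actually parameterizes and predicts these quantities is described in the paper itself, not here.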

Author(s): Lin Shao and Ye Tian and Jeannette Bohg
Journal: arXiv
Year: 2018
Month: September
Bibtex Type: Article (article)
State: Submitted
URL: https://arxiv.org/abs/1807.08894
Electronic Archiving: grant_archive
Note: Submitted to ICRA’19

BibTeX

@article{2019_ICRA_instSeg,
  title = {ClusterNet: Instance Segmentation in RGB-D Images},
  journal = {arXiv},
  abstract = {We propose a method for instance-level segmentation that uses RGB-D data as input and provides detailed information about the location, geometry and number of {\em individual\/} objects in the scene. This level of understanding is fundamental for autonomous robots. It enables safe and robust decision-making under the large uncertainty of the real-world. In our model, we propose to use the first and second order moments of the object occupancy function to represent an object instance. We train an hourglass Deep Neural Network (DNN) where each pixel in the output votes for the 3D position of the corresponding object center and for the object's size and pose. The final instance segmentation is achieved through clustering in the space of moments. The object-centric training loss is defined on the output of the clustering. Our method outperforms the state-of-the-art instance segmentation method on our synthesized dataset. We show that our method generalizes well on real-world data achieving visually better segmentation results.},
  month = sep,
  year = {2018},
  note = {Submitted to ICRA'19},
  slug = {2019_icra_instseg},
  author = {Shao, Lin and Tian, Ye and Bohg, Jeannette},
  url = {https://arxiv.org/abs/1807.08894},
  month_numeric = {9}
}