Perceiving Systems Conference Paper 2014

Discovering Object Classes from Activities

In order to avoid an expensive manual labeling process or to learn object classes autonomously without human intervention, object discovery techniques have been proposed that extract visually similar objects from weakly labelled videos. However, the problem of discovering small- or medium-sized objects is largely unexplored. We observe that videos with activities involving human-object interactions can serve as weakly labelled data for such cases. Since neither object appearance nor motion is distinct enough to discover objects in these videos, we propose a framework that samples from a space of algorithms and their parameters to extract sequences of object proposals. Furthermore, we model similarity of objects based on appearance and functionality, which is derived from human and object motion. We show that functionality is an important cue for discovering objects from activities and demonstrate the generality of the model on three challenging RGB-D and RGB datasets.

Author(s): Abhilash Srikantha and Juergen Gall
Book Title: European Conference on Computer Vision
Volume: 8694
Pages: 415-430
Year: 2014
Month: September
Series: Lecture Notes in Computer Science
Editors: D. Fleet and T. Pajdla and B. Schiele and T. Tuytelaars
Publisher: Springer International Publishing
Bibtex Type: Conference Paper (inproceedings)
DOI: 10.1007/978-3-319-10599-4_27
Event Name: 13th European Conference on Computer Vision
Event Place: Zürich, Switzerland

BibTeX

@inproceedings{Srik:ECCV:2014,
  title = {Discovering Object Classes from Activities},
  booktitle = {European Conference on Computer Vision},
  abstract = {In order to avoid an expensive manual labeling process or to learn object classes autonomously without human intervention, object discovery techniques have been proposed that extract visually similar objects from weakly labelled videos. However, the problem of discovering small- or medium-sized objects is largely unexplored. We observe that videos with activities involving human-object interactions can serve as weakly labelled data for such cases. Since neither object appearance nor motion is distinct enough to discover objects in these videos, we propose a framework that samples from a space of algorithms and their parameters to extract sequences of object proposals. Furthermore, we model similarity of objects based on appearance and functionality, which is derived from human and object motion. We show that functionality is an important cue for discovering objects from activities and demonstrate the generality of the model on three challenging RGB-D and RGB datasets.
  },
  volume = {8694},
  pages = {415-430},
  series = {Lecture Notes in Computer Science},
  editors = {D. Fleet  and T. Pajdla and B. Schiele  and T. Tuytelaars },
  publisher = {Springer International Publishing},
  month = sep,
  year = {2014},
  slug = {srik-eccv-2014},
  author = {Srikantha, Abhilash and Gall, Juergen},
  month_numeric = {9}
}