Empirical Inference Conference Paper 2008

Learning to Localize Objects with Structured Output Regression

Sliding window classifiers are among the most successful and widely applied techniques for object localization. However, training is typically done in a way that is not specific to the localization task. First, a binary classifier is trained using a sample of positive and negative examples, and this classifier is subsequently applied to multiple regions within test images. We propose instead to treat object localization in a principled way by posing it as a problem of predicting structured data: we model the problem not as binary classification, but as the prediction of the bounding box of objects located in images. The use of a joint-kernel framework allows us to formulate the training procedure as a generalization of an SVM, which can be solved efficiently. We further improve computational efficiency by using a branch-and-bound strategy for localization during both training and testing. Experimental evaluation on the PASCAL VOC and TU Darmstadt datasets shows that the structured training procedure improves performance over binary training as well as the best previously published scores.

Author(s): Blaschko, M. B. and Lampert, C. H.
Book Title: ECCV 2008
Journal: Computer Vision: ECCV 2008
Pages: 2-15
Year: 2008
Month: October
Editors: Forsyth, D. A., Torr, P. H. S., and Zisserman, A.
Publisher: Springer
Bibtex Type: Conference Paper (inproceedings)
Address: Berlin, Germany
DOI: 10.1007/978-3-540-88682-2_2
Event Name: 10th European Conference on Computer Vision
Event Place: Marseille, France
Language: en
Note: Best Student Paper Award
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@inproceedings{5247,
  title = {Learning to Localize Objects with Structured Output Regression},
  journal = {Computer Vision: ECCV 2008},
  booktitle = {ECCV 2008},
  abstract = {Sliding window classifiers are among the most successful and widely applied techniques for object localization. However, training is typically done in a way that is not specific to the localization task. First, a binary classifier is trained using a sample of positive and negative examples, and this classifier is subsequently applied to multiple regions within test images. We propose instead to treat object localization in a principled way by posing it as a problem of predicting structured data: we model the problem not as binary classification, but as the prediction of the bounding box of objects located in images. The use of a joint-kernel framework allows us to formulate the training procedure as a generalization of an SVM, which can be solved efficiently. We further improve computational efficiency by using a branch-and-bound strategy for localization during both training and testing. Experimental evaluation on the PASCAL VOC and TU Darmstadt datasets shows that the structured training procedure improves performance over binary training as well as the best previously published scores.},
  pages = {2--15},
  editor = {Forsyth, D. A. and Torr, P. H. S. and Zisserman, A.},
  publisher = {Springer},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Berlin, Germany},
  month = oct,
  year = {2008},
  note = {Best Student Paper Award},
  slug = {5247},
  author = {Blaschko, M. B. and Lampert, C. H.},
  month_numeric = {10}
}