Empirical Inference Technical Report 2007

Efficient Subwindow Search for Object Localization

Recent years have seen huge advances in object recognition from images. Recognition rates beyond 95% are the rule rather than the exception on many datasets. However, most state-of-the-art methods can only decide if an object is present or not. They are not able to provide information on the object location or extent within in the image. We report on a simple yet powerful scheme that extends many existing recognition methods to also perform localization of object bounding boxes. This is achieved by maximizing the classification score over all possible subrectangles in the image. Despite the impression that this would be computationally intractable, we show that in many situations efficient algorithms exist which solve a generalized maximum subrectangle problem. We show how our method is applicable to a variety object detection frameworks and demonstrate its performance by applying it to the popular bag of visual words model, achieving competitive results on the PASCAL VOC 2006 dataset.

Author(s): Blaschko, MB. and Hofmann, T. and Lampert, CH.
Number (issue): 164
Year: 2007
Month: August
Day: 0
Bibtex Type: Technical Report (techreport)
Digital: 0
Electronic Archiving: grant_archive
Institution: Max Planck Institute for Biological Cybernetics, Tübingen, Germany
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik
Links:

BibTex

@techreport{4746,
  title = {Efficient Subwindow Search for Object Localization},
  abstract = {Recent years have seen huge advances in object recognition from images. Recognition rates beyond 95% are the rule rather than the exception on many datasets. However, most state-of-the-art methods can only decide if an object is present or not. They are not able to provide information on the object location or extent within in the image.
  We report on a simple yet powerful scheme that extends many existing recognition methods to also perform localization of object bounding boxes. This is achieved by maximizing the classification score over all possible subrectangles in the image. Despite the impression that this would be computationally intractable, we show that in many situations efficient algorithms exist which solve a generalized maximum subrectangle problem.
  We show how our method is applicable to a variety object detection frameworks and demonstrate its performance by applying it to the popular bag of visual words model, achieving competitive results on the PASCAL VOC 2006 dataset.},
  number = {164},
  organization = {Max-Planck-Gesellschaft},
  institution = {Max Planck Institute for Biological Cybernetics, Tübingen, Germany},
  school = {Biologische Kybernetik},
  month = aug,
  year = {2007},
  slug = {4746},
  author = {Blaschko, MB. and Hofmann, T. and Lampert, CH.},
  month_numeric = {8}
}