Empirical Inference Conference Paper 2008

Correlational Spectral Clustering

We present a new method for spectral clustering with paired data based on kernel canonical correlation analysis, called correlational spectral clustering. Paired data are common in real-world data sources, such as images with text captions. Traditional spectral clustering algorithms assume that data can be represented either by a single similarity measure or by co-occurrence matrices that are then used in biclustering. In contrast, the proposed method uses a separate similarity measure for each data representation, and allows for projection of previously unseen data that are only observed in one representation (e.g., images but not text). We show that this algorithm generalizes traditional spectral clustering algorithms, and we show consistent empirical improvement over spectral clustering on a variety of datasets of images with associated text.
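
As a rough illustration of the pipeline the abstract describes (one kernel per representation, kernel CCA to find correlated directions, clustering in the projected space, and projection of new data seen in only one representation), the following Python sketch may help. It is not the authors' implementation. The Gaussian kernel, the (K + reg I)^2 form of regularization, the k-means step, and all function and parameter names (rbf_kernel, center_kernel, kcca, kmeans, gamma, reg) are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian similarity between rows of A (m, d) and rows of B (n, d)."""
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def center_kernel(K_eval, K_train):
    """Center K_eval (m, n) in feature space using the uncentered training
    kernel K_train (n, n). center_kernel(K, K) centers the training kernel."""
    return (K_eval
            - K_eval.mean(axis=1, keepdims=True)
            - K_train.mean(axis=0, keepdims=True)
            + K_train.mean())

def kcca(Kx, Ky, n_components, reg):
    """Regularized kernel CCA on two centered (n, n) kernels (assumed variant).

    Maximizes a' Kx Ky b subject to a'(Kx + reg I)^2 a = b'(Ky + reg I)^2 b = 1,
    which reduces to an SVD after the substitution u = (Kx + reg I) a.
    Returns dual weights alpha, beta and the regularized correlations.
    """
    eye = np.eye(Kx.shape[0])
    Rx = np.linalg.solve(Kx + reg * eye, Kx)  # (Kx + reg I)^-1 Kx
    Ry = np.linalg.solve(Ky + reg * eye, Ky)  # (Ky + reg I)^-1 Ky
    U, rho, Vt = np.linalg.svd(Rx @ Ry)
    alpha = np.linalg.solve(Kx + reg * eye, U[:, :n_components])
    beta = np.linalg.solve(Ky + reg * eye, Vt[:n_components].T)
    return alpha, beta, rho[:n_components]

def kmeans(Z, k, n_iter=50):
    """Plain Lloyd iterations with farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        labels = ((Z[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        centers = np.array([Z[labels == j].mean(0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers
```

With paired training arrays X (e.g. image features) and Y (e.g. text features), one would build Kx = rbf_kernel(X, X, gamma) and Ky likewise, center both with center_kernel, call kcca, and run kmeans on center_kernel(Kx, Kx) @ alpha. A new item observed only in the first representation can then be embedded as center_kernel(rbf_kernel(X_new, X, gamma), Kx) @ alpha and assigned to the nearest cluster center, which mirrors the one-representation projection mentioned in the abstract.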

Author(s): Blaschko, MB. and Lampert, CH.
Book Title: CVPR 2008
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008)
Pages: 1-8
Year: 2008
Month: June
Publisher: IEEE Computer Society
Bibtex Type: Conference Paper (inproceedings)
Address: Los Alamitos, CA, USA
DOI: 10.1109/CVPR.2008.4587353
Event Name: IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Event Place: Anchorage, AK, USA
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@inproceedings{5069,
  title = {Correlational Spectral Clustering},
  booktitle = {Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008)},
  abstract = {We present a new method for spectral clustering with
  paired data based on kernel canonical correlation analysis,
  called correlational spectral clustering. Paired data are
  common in real world data sources, such as images with
  text captions. Traditional spectral clustering algorithms either
  assume that data can be represented by a single similarity
  measure, or by co-occurrence matrices that are then
  used in biclustering. In contrast, the proposed method uses
  separate similarity measures for each data representation,
  and allows for projection of previously unseen data that are
  only observed in one representation (e.g. images but not
  text). We show that this algorithm generalizes traditional
  spectral clustering algorithms and show consistent empirical
  improvement over spectral clustering on a variety of
  datasets of images with associated text.},
  pages = {1--8},
  publisher = {IEEE Computer Society},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Los Alamitos, CA, USA},
  month = jun,
  year = {2008},
  slug = {5069},
  author = {Blaschko, MB. and Lampert, CH.},
  month_numeric = {6}
}