Empirical Inference Conference Paper 2005

Limits of Spectral Clustering

An important aspect of clustering algorithms is whether the partitions constructed on finite samples converge to a useful clustering of the whole data space as the sample size increases. This paper investigates this question for normalized and unnormalized versions of the popular spectral clustering algorithm. Surprisingly, the convergence of unnormalized spectral clustering is more difficult to handle than the normalized case. Even though recently some first results on the convergence of normalized spectral clustering have been obtained, for the unnormalized case we have to develop a completely new approach combining tools from numerical integration, spectral and perturbation theory, and probability. It turns out that while in the normalized case, spectral clustering usually converges to a nice partition of the data space, in the unnormalized case the same only holds under strong additional assumptions which are not always satisfied. We conclude that our analysis gives strong evidence for the superiority of normalized spectral clustering. It also provides a basis for future exploration of other Laplacian-based methods.

Author(s): von Luxburg, U. and Bousquet, O. and Belkin, M.
Book Title: Advances in Neural Information Processing Systems 17
Journal: Advances in Neural Information Processing Systems 17: Proceedings of the 2004 Conference
Pages: 857-864
Year: 2005
Month: July
Day: 0
Editors: Saul, L. K., Y. Weiss, L. Bottou
Publisher: MIT Press
Bibtex Type: Conference Paper (inproceedings)
Address: Cambridge, MA, USA
Event Name: Eighteenth Annual Conference on Neural Information Processing Systems (NIPS 2004)
Event Place: Vancouver, BC, Kanada
Digital: 0
Electronic Archiving: grant_archive
ISBN: 0-262-19534-8
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik
Links:

BibTex

@inproceedings{2775,
  title = {Limits of Spectral Clustering},
  journal = {Advances in Neural Information Processing Systems 17: Proceedings of the 2004 Conference},
  booktitle = {Advances in Neural Information Processing Systems 17},
  abstract = {An important aspect of clustering algorithms is whether the partitions constructed on finite samples converge to a useful clustering of the whole data space as the sample size increases. This paper investigates this question for normalized and unnormalized versions of the popular spectral
  clustering algorithm. Surprisingly, the convergence of unnormalized spectral clustering is more difficult to handle than the normalized case. Even though recently some first results on the convergence of normalized spectral clustering have been obtained, for the unnormalized case
  we have to develop a completely new approach combining tools from numerical integration, spectral and perturbation theory, and probability. It turns out that while in the normalized case, spectral clustering usually converges to a nice partition of the data space, in the unnormalized case
  the same only holds under strong additional assumptions which are not always satisfied. We conclude that our analysis gives strong evidence for the superiority of normalized spectral clustering. It also provides a basis
  for future exploration of other Laplacian-based methods.},
  pages = {857-864},
  editors = {Saul, L. K., Y. Weiss, L. Bottou},
  publisher = {MIT Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Cambridge, MA, USA},
  month = jul,
  year = {2005},
  slug = {2775},
  author = {von Luxburg, U. and Bousquet, O. and Belkin, M.},
  month_numeric = {7}
}