Clustering: Science or Art? | Max Planck Institute for Intelligent Systems

Empirical Inference Conference Paper 2012

Clustering: Science or Art?

Statistical Learning Theory

Ulrike von Luxburg

Professor, University of Tübingen
Max Planck Fellow

We examine whether the quality of different clustering algorithms can be compared by a general, scientifically sound procedure which is independent of particular clustering algorithms. We argue that the major obstacle is the difficulty in evaluating a clustering algorithm without taking into account the context: why does the user cluster his data in the first place, and what does he want to do with the clustering afterwards? We argue that clustering should not be treated as an application-independent mathematical problem, but should always be studied in the context of its end-use. Different techniques to evaluate clustering algorithms have to be developed for different uses of clustering. To simplify this procedure we argue that it will be useful to build a "taxonomy of clustering problems" to identify clustering applications which can be treated in a unified way, and that such an effort will be more fruitful than attempting the impossible: developing "optimal" domain-independent clustering algorithms or even classifying clustering algorithms in terms of how they work.

Author(s):	von Luxburg, U. and Williamson, R. and Guyon, I.
Book Title:	JMLR Workshop and Conference Proceedings, Volume 27
Pages:	65-79
Year:	2012
Day:	0

Bibtex Type:	Conference Paper (inproceedings)

Event Name:	Workshop on Unsupervised Learning and Transfer Learning
Event Place:	Bellevue, Washington, USA

Digital:	0
Electronic Archiving:	grant_archive
Organization:	Max-Planck-Gesellschaft
School:	Biologische Kybernetik

Links:	PDF

BibTex

@inproceedings{6331,
  title = {Clustering: Science or Art?},
  booktitle = {JMLR Workshop and Conference Proceedings, Volume 27},
  abstract = {We examine whether the quality of different clustering algorithms can be compared by a
  general, scientifically sound procedure which is independent of particular clustering algorithms.
  We argue that the major obstacle is the difficulty in evaluating a clustering algorithm without
  taking into account the context: why does the user cluster his data in the first place, and what
  does he want to do with the clustering afterwards? We argue that clustering should not be
  treated as an application-independent mathematical problem, but should always be studied
  in the context of its end-use. Different techniques to evaluate clustering algorithms have to be
  developed for different uses of clustering. To simplify this procedure we argue that it will be
  useful to build a ``taxonomy of clustering problems'' to identify clustering applications which
  can be treated in a unified way, and that such an effort will be more fruitful than attempting
  the impossible: developing ``optimal'' domain-independent clustering algorithms or even
  classifying clustering algorithms in terms of how they work.},
  pages = {65-79},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  year = {2012},
  slug = {6331},
  author = {von Luxburg, U. and Williamson, R. and Guyon, I.}
}