We introduce a family of unsupervised algorithms, numerical taxonomy clustering, to simultaneously cluster data, and to learn a taxonomy that encodes the relationship between the clusters. The algorithms work by maximizing the dependence between the taxonomy and the original data. The resulting taxonomy is a more informative visualization of complex data than simple clustering; in addition, taking into account the relations between different clusters is shown to substantially improve the quality of the clustering, when compared with state-of-the-art algorithms in the literature (both spectral clustering and a previous dependence maximization approach). We demonstrate our algorithm on image and text data.
Author(s): | Blaschko, MB. and Gretton, A. |
Book Title: | Advances in neural information processing systems 21 |
Journal: | Advances in neural information processing systems 21 : 22nd Annual Conference on Neural Information Processing Systems 2008 |
Pages: | 153-160 |
Year: | 2009 |
Month: | June |
Editors: | Koller, D., D. Schuurmans, Y. Bengio, L. Bottou |
Publisher: | Curran |
Bibtex Type: | Conference Paper (inproceedings) |
Address: | Red Hook, NY, USA |
Event Name: | Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS 2008) |
Event Place: | Vancouver, BC, Canada |
ISBN: | 978-1-605-60949-2 |
Language: | en |
Organization: | Max-Planck-Gesellschaft |
School: | Biologische Kybernetik |
BibTeX
@inproceedings{5396,
  title = {Learning Taxonomies by Dependence Maximization},
  journal = {Advances in neural information processing systems 21 : 22nd Annual Conference on Neural Information Processing Systems 2008},
  booktitle = {Advances in neural information processing systems 21},
  abstract = {We introduce a family of unsupervised algorithms, numerical taxonomy clustering, to simultaneously cluster data, and to learn a taxonomy that encodes the relationship between the clusters. The algorithms work by maximizing the dependence between the taxonomy and the original data. The resulting taxonomy is a more informative visualization of complex data than simple clustering; in addition, taking into account the relations between different clusters is shown to substantially improve the quality of the clustering, when compared with state-of-the-art algorithms in the literature (both spectral clustering and a previous dependence maximization approach). We demonstrate our algorithm on image and text data.},
  pages = {153--160},
  editor = {Koller, D. and Schuurmans, D. and Bengio, Y. and Bottou, L.},
  publisher = {Curran},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Red Hook, NY, USA},
  month = jun,
  year = {2009},
  slug = {5396},
  author = {Blaschko, MB. and Gretton, A.},
  month_numeric = {6}
}