In this paper, we present an information-theoretic approach to learning a Mahalanobis distance function. We formulate the problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the distance function. We express this problem as a particular Bregman optimization problem---that of minimizing the LogDet divergence subject to linear constraints. Our resulting algorithm has several advantages over existing methods. First, our method can handle a wide variety of constraints and can optionally incorporate a prior on the distance function. Second, it is fast and scalable. Unlike most existing methods, no eigenvalue computations or semi-definite programming are required. We also present an online version and derive regret bounds for the resulting algorithm. Finally, we evaluate our method on a recent error reporting system for software called Clarify, in the context of metric learning for nearest neighbor classification, as well as on standard data sets.
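For reference, a minimal sketch of the kind of constrained LogDet problem the abstract describes. The notation here (a prior matrix A_0, thresholds u and l, and pair sets S and D for similar/dissimilar points in d dimensions) is illustrative shorthand, not quoted from the paper itself:

\[
\begin{aligned}
\min_{A \succeq 0}\quad & D_{\ell d}(A, A_0) \;=\; \operatorname{tr}(A A_0^{-1}) - \log\det(A A_0^{-1}) - d \\
\text{s.t.}\quad & (x_i - x_j)^\top A (x_i - x_j) \le u \quad \text{for similar pairs } (i,j) \in S, \\
& (x_i - x_j)^\top A (x_i - x_j) \ge l \quad \text{for dissimilar pairs } (i,j) \in D.
\end{aligned}
\]

Each constraint is linear in the Mahalanobis matrix A, which is what allows the problem to be treated as a Bregman optimization over A without eigenvalue computations or semi-definite programming.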
Author(s): Davis, JV. and Kulis, B. and Jain, P. and Sra, S. and Dhillon, IS.
Book Title: ICML 2007
Journal: Proceedings of the 24th Annual International Conference on Machine Learning (ICML 2007)
Pages: 209-216
Year: 2007
Month: June
Editors: Ghahramani, Z.
Publisher: ACM Press
BibTeX Type: Conference Paper (inproceedings)
Address: New York, NY, USA
DOI: 10.1145/1273496.1273523
Event Name: 24th Annual International Conference on Machine Learning
Event Place: Corvallis, OR, USA
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik
BibTeX
@inproceedings{5127,
  title = {Information-theoretic Metric Learning},
  journal = {Proceedings of the 24th Annual International Conference on Machine Learning (ICML 2007)},
  booktitle = {ICML 2007},
  abstract = {In this paper, we present an information-theoretic approach to learning a Mahalanobis distance function. We formulate the problem as that of minimizing the differential relative entropy between two multivariate Gaussians under constraints on the distance function. We express this problem as a particular Bregman optimization problem---that of minimizing the LogDet divergence subject to linear constraints. Our resulting algorithm has several advantages over existing methods. First, our method can handle a wide variety of constraints and can optionally incorporate a prior on the distance function. Second, it is fast and scalable. Unlike most existing methods, no eigenvalue computations or semi-definite programming are required. We also present an online version and derive regret bounds for the resulting algorithm. Finally, we evaluate our method on a recent error reporting system for software called Clarify, in the context of metric learning for nearest neighbor classification, as well as on standard data sets.},
  pages = {209-216},
  editors = {Ghahramani, Z.},
  publisher = {ACM Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {New York, NY, USA},
  month = jun,
  year = {2007},
  slug = {5127},
  author = {Davis, JV. and Kulis, B. and Jain, P. and Sra, S. and Dhillon, IS.},
  month_numeric = {6}
}