Empirical Inference Conference Paper 2005

Hilbertian Metrics and Positive Definite Kernels on Probability Measures

We investigate the problem of defining Hilbertian metrics, and correspondingly positive definite kernels, on probability measures, continuing previous work. This type of kernel has shown very good results in text classification and has a wide range of possible applications. In this paper we first extend the two-parameter family of Hilbertian metrics of Topsoe so that it now includes all commonly used Hilbertian metrics on probability measures. This allows us to do model selection among these metrics in an elegant and unified way. Second, we further investigate our approach to incorporating similarity information of the probability space into the kernel. The analysis provides a better understanding of these kernels and in some cases gives a more efficient way to compute them. Finally, we compare all proposed kernels on two text and two image classification problems.

Author(s): Hein, M. and Bousquet, O.
Book Title: AISTATS 2005
Journal: Proceedings of AISTATS 2005
Pages: 136-143
Year: 2005
Month: January
Editors: Cowell, R. and Ghahramani, Z.
Bibtex Type: Conference Paper (inproceedings)
Event Name: Tenth International Workshop on Artificial Intelligence and Statistics (AI & Statistics 2005)
Event Place: Barbados
Digital: 0
Electronic Archiving: grant_archive
ISBN: 0-9727358-1-X
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTex

@inproceedings{3013,
  title = {Hilbertian Metrics and Positive Definite Kernels on Probability Measures},
  journal = {Proceedings of AISTATS 2005},
  booktitle = {AISTATS 2005},
  abstract = {We investigate the problem of defining Hilbertian metrics, and
  correspondingly positive definite kernels, on probability measures,
  continuing previous work. This type of kernel has shown very good
  results in text classification and has a wide range of possible
  applications. In this paper we first extend the two-parameter family of
  Hilbertian metrics of Topsoe so that it now includes all
  commonly used Hilbertian metrics on probability measures. This
  allows us to do model selection among these metrics in an elegant
  and unified way. Second, we further investigate our approach to
  incorporating similarity information of the probability space into
  the kernel. The analysis provides a better understanding of these
  kernels and in some cases gives a more efficient way to compute
  them. Finally, we compare all proposed kernels on two text and two
  image classification problems.},
  pages = {136--143},
  editor = {Cowell, R. and Ghahramani, Z.},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  month = jan,
  year = {2005},
  slug = {3013},
  author = {Hein, M. and Bousquet, O.},
  month_numeric = {1}
}