
Multiclass Multiple Kernel Learning

In many applications it is desirable to learn from several kernels. “Multiple kernel learning” (MKL) allows the practitioner to optimize over linear combinations of kernels. By enforcing sparse coefficients, it also generalizes feature selection to kernel selection. We propose MKL for joint feature maps. This provides a convenient and principled way to apply MKL to multiclass problems. In addition, we can exploit the joint feature map to learn kernels on output spaces. We show the equivalence of several different primal formulations, including different regularizers. We present several optimization methods and compare a convex quadratically constrained quadratic program (QCQP) and two semi-infinite linear programs (SILPs) on toy data, showing that the SILPs are faster than the QCQP. We then demonstrate the utility of our method by applying the SILP to three real-world datasets.
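
To make the kernel-combination idea concrete, the following is a minimal sketch assuming scikit-learn: it evaluates a fixed convex combination of base kernels with a multiclass SVM. The paper learns the mixing weights via the QCQP/SILP formulations; the dataset, base kernels, and weights below are illustrative assumptions, not taken from the paper.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics.pairwise import linear_kernel, polynomial_kernel, rbf_kernel
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Base kernel matrices K_1, ..., K_p computed on the same data.
base_kernels = [linear_kernel(X), rbf_kernel(X, gamma=0.5), polynomial_kernel(X, degree=3)]

# Sparse, nonnegative mixing weights with sum(beta) = 1. MKL would learn
# these; here they are fixed by hand purely for illustration.
beta = np.array([0.0, 0.7, 0.3])

# Combined kernel K = sum_k beta_k * K_k remains positive semidefinite.
K = sum(b * Kk for b, Kk in zip(beta, base_kernels))

# Multiclass SVM (one-vs-one) on the precomputed combined kernel.
clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))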

Author(s): Zien, A. and Ong, C. S.
Book Title: Proceedings of the 24th International Conference on Machine Learning (ICML 2007)
Pages: 1191-1198
Year: 2007
Month: June
Editors: Ghahramani, Z.
Publisher: ACM Press
Bibtex Type: Conference Paper (inproceedings)
Address: New York, NY, USA
DOI: 10.1145/1273496.1273646
Event Name: 24th International Conference on Machine Learning
Event Place: Corvallis, OR, USA
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@inproceedings{4431,
  title = {Multiclass Multiple Kernel Learning},
  booktitle = {Proceedings of the 24th International Conference on Machine Learning (ICML 2007)},
  abstract = {In many applications it is desirable to learn from several kernels. “Multiple kernel learning” (MKL) allows the practitioner to optimize over linear combinations of kernels. By enforcing sparse coefficients, it also generalizes feature selection to kernel selection. We propose MKL for joint feature maps. This provides a convenient and principled way to apply MKL to multiclass problems. In addition, we can exploit the joint feature map to learn kernels on output spaces. We show the equivalence of several different primal formulations, including different regularizers. We present several optimization methods and compare a convex quadratically constrained quadratic program (QCQP) and two semi-infinite linear programs (SILPs) on toy data, showing that the SILPs are faster than the QCQP. We then demonstrate the utility of our method by applying the SILP to three real-world datasets.},
  pages = {1191-1198},
  editor = {Ghahramani, Z.},
  publisher = {ACM Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {New York, NY, USA},
  month = jun,
  year = {2007},
  slug = {4431},
  author = {Zien, A. and Ong, C. S.},
  month_numeric = {6}
}