In many applications it is desirable to learn from several kernels. Multiple kernel learning (MKL) allows the practitioner to optimize over linear combinations of kernels. By enforcing sparse coefficients, it also generalizes feature selection to kernel selection. We propose MKL for joint feature maps. This provides a convenient and principled way to perform MKL with multiclass problems. In addition, we can exploit the joint feature map to learn kernels on output spaces. We show the equivalence of several different primal formulations, including different regularizers. We present several optimization methods and compare a convex quadratically constrained quadratic program (QCQP) and two semi-infinite linear programs (SILPs) on toy data, showing that the SILPs are faster than the QCQP. We then demonstrate the utility of our method by applying the SILP to three real-world datasets.
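As a rough illustration of the combined kernel that MKL optimizes over (a minimal sketch, not code from the paper; the base kernels, weights, and function names below are hypothetical examples):

import numpy as np

def linear_kernel(X, Y):
    # k(x, x') = <x, x'>
    return X @ Y.T

def gaussian_kernel(X, Y, gamma=0.5):
    # k(x, x') = exp(-gamma * ||x - x'||^2); gamma is an assumed example value
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def combined_kernel(X, Y, betas, kernels):
    # MKL searches over k(x, x') = sum_m beta_m * k_m(x, x') with beta_m >= 0;
    # enforcing sparsity in beta turns kernel weighting into kernel selection.
    assert all(b >= 0 for b in betas), "kernel weights must be non-negative"
    return sum(b * k(X, Y) for b, k in zip(betas, kernels))

X = np.random.randn(5, 3)
K = combined_kernel(X, X, betas=[0.7, 0.3], kernels=[linear_kernel, gaussian_kernel])

A sparse weight vector (e.g. beta = [1.0, 0.0]) discards a base kernel entirely, which is how MKL generalizes feature selection to kernel selection.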
Author(s): Zien, A. and Ong, CS.
Book Title: ICML 2007
Journal: Proceedings of the 24th International Conference on Machine Learning (ICML 2007)
Pages: 1191-1198
Year: 2007
Month: June
Editors: Ghahramani, Z.
Publisher: ACM Press
BibTeX Type: Conference Paper (inproceedings)
Address: New York, NY, USA
DOI: 10.1145/1273496.1273646
Event Name: 24th International Conference on Machine Learning
Event Place: Corvallis, OR, USA
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik
BibTeX
@inproceedings{4431,
  title = {Multiclass Multiple Kernel Learning},
  journal = {Proceedings of the 24th International Conference on Machine Learning (ICML 2007)},
  booktitle = {ICML 2007},
  abstract = {In many applications it is desirable to learn from several kernels. Multiple kernel learning (MKL) allows the practitioner to optimize over linear combinations of kernels. By enforcing sparse coefficients, it also generalizes feature selection to kernel selection. We propose MKL for joint feature maps. This provides a convenient and principled way to perform MKL with multiclass problems. In addition, we can exploit the joint feature map to learn kernels on output spaces. We show the equivalence of several different primal formulations, including different regularizers. We present several optimization methods and compare a convex quadratically constrained quadratic program (QCQP) and two semi-infinite linear programs (SILPs) on toy data, showing that the SILPs are faster than the QCQP. We then demonstrate the utility of our method by applying the SILP to three real-world datasets.},
  pages = {1191-1198},
  editors = {Ghahramani, Z.},
  publisher = {ACM Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {New York, NY, USA},
  month = jun,
  year = {2007},
  slug = {4431},
  author = {Zien, A. and Ong, CS.},
  month_numeric = {6}
}