An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models
We consider the task of tuning hyperparameters in SVM models by minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be done very efficiently, often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.
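For intuition, here is a minimal sketch of the idea the abstract describes: treat the validation error as a smooth function of the hyperparameters and descend its gradient. The paper derives exact hypergradients for SVM solutions; this sketch instead uses kernel ridge regression (which has a closed-form fit) with central finite differences standing in for the analytic gradient, so every function and variable name below is illustrative rather than taken from the paper.

import numpy as np

def rbf_kernel(A, B, gamma):
    # squared Euclidean distances between all rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def val_error(log_hp, Xtr, ytr, Xva, yva):
    # train kernel ridge regression at the given hyperparameters,
    # then report mean squared error on the held-out validation set
    gamma, lam = np.exp(log_hp)
    K = rbf_kernel(Xtr, Xtr, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(ytr)), ytr)
    pred = rbf_kernel(Xva, Xtr, gamma) @ alpha
    return np.mean((pred - yva) ** 2)

def fd_grad(f, x, eps=1e-5):
    # central finite differences; the paper computes this gradient analytically
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2.0 * eps)
    return g

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=120)
Xtr, ytr, Xva, yva = X[:80], y[:80], X[80:], y[80:]

f = lambda h: val_error(h, Xtr, ytr, Xva, yva)
h = np.log(np.array([1.0, 1.0]))   # start at gamma = lambda = 1
for _ in range(30):                # plain gradient descent on the log scale
    h = h - 0.2 * fd_grad(f, h)
gamma, lam = np.exp(h)
print(f"gamma={gamma:.3f}  lambda={lam:.4f}  val MSE={f(h):.4f}")

Working on the log scale keeps both hyperparameters positive without constraints. Note that the paper's contribution is precisely to avoid the finite-difference step above: it computes the exact gradient of the validation function for kernel models at a fraction of the training cost.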
Author(s): | Keerthi, S. S. and Sindhwani, V. and Chapelle, O. |
Book Title: | Advances in Neural Information Processing Systems 19 |
Journal: | Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference |
Pages: | 673-680 |
Year: | 2007 |
Month: | September |
Editors: | Schölkopf, B., Platt, J., and Hofmann, T. |
Publisher: | MIT Press |
Bibtex Type: | Conference Paper (inproceedings) |
Address: | Cambridge, MA, USA |
Event Name: | Twentieth Annual Conference on Neural Information Processing Systems (NIPS 2006) |
Event Place: | Vancouver, BC, Canada |
ISBN: | 0-262-19568-2 |
Language: | en |
Organization: | Max-Planck-Gesellschaft |
School: | Biologische Kybernetik |
BibTeX
@inproceedings{5371,
  title        = {An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models},
  author       = {Keerthi, S. S. and Sindhwani, V. and Chapelle, O.},
  booktitle    = {Advances in Neural Information Processing Systems 19},
  journal      = {Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference},
  abstract     = {We consider the task of tuning hyperparameters in SVM models by minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be done very efficiently, often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.},
  pages        = {673--680},
  editor       = {Sch{\"o}lkopf, B. and Platt, J. and Hofmann, T.},
  publisher    = {MIT Press},
  organization = {Max-Planck-Gesellschaft},
  school       = {Biologische Kybernetik},
  address      = {Cambridge, MA, USA},
  month        = sep,
  year         = {2007},
  slug         = {5371},
  month_numeric = {9}
}