An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models
We consider the task of tuning hyperparameters in SVM models by minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be done very efficiently, often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.
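For intuition, here is a minimal sketch of the idea the abstract describes: treat the validation error as a smooth function of the hyperparameters and descend its gradient. The paper derives exact hypergradients for SVM solutions; this sketch instead uses kernel ridge regression (which has a closed-form fit) with central finite differences standing in for the analytic gradient, so every function and variable name below is illustrative rather than taken from the paper.

import numpy as np

def rbf_kernel(A, B, gamma):
    # squared Euclidean distances between all rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def val_error(log_hp, Xtr, ytr, Xva, yva):
    # train kernel ridge regression at the given hyperparameters,
    # then report mean squared error on the held-out validation set
    gamma, lam = np.exp(log_hp)
    K = rbf_kernel(Xtr, Xtr, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(ytr)), ytr)
    pred = rbf_kernel(Xva, Xtr, gamma) @ alpha
    return np.mean((pred - yva) ** 2)

def fd_grad(f, x, eps=1e-5):
    # central finite differences; the paper computes this gradient analytically
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2.0 * eps)
    return g

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=120)
Xtr, ytr, Xva, yva = X[:80], y[:80], X[80:], y[80:]

f = lambda h: val_error(h, Xtr, ytr, Xva, yva)
h = np.log(np.array([1.0, 1.0]))   # start at gamma = lambda = 1
for _ in range(30):                # plain gradient descent on the log scale
    h = h - 0.2 * fd_grad(f, h)
gamma, lam = np.exp(h)
print(f"gamma={gamma:.3f}  lambda={lam:.4f}  val MSE={f(h):.4f}")

Working on the log scale keeps both hyperparameters positive without constraints. Note that the paper's contribution is precisely to avoid the finite-difference step above: it computes the exact gradient of the validation function for kernel models at a fraction of the training cost.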
Author(s): | Keerthi, S. S. and Sindhwani, V. and Chapelle, O. |
Book Title: | Advances in Neural Information Processing Systems 19 |
Journal: | Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference |
Pages: | 673-680 |
Year: | 2007 |
Month: | September |
Editors: | Schölkopf, B., Platt, J., and Hofmann, T. |
Publisher: | MIT Press |
Bibtex Type: | Conference Paper (inproceedings) |
Address: | Cambridge, MA, USA |
Event Name: | Twentieth Annual Conference on Neural Information Processing Systems (NIPS 2006) |
Event Place: | Vancouver, BC, Canada |
ISBN: | 0-262-19568-2 |
Language: | en |
Organization: | Max-Planck-Gesellschaft |
School: | Biologische Kybernetik |
BibTeX
@inproceedings{5371,
  title        = {An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models},
  author       = {Keerthi, S. S. and Sindhwani, V. and Chapelle, O.},
  booktitle    = {Advances in Neural Information Processing Systems 19},
  journal      = {Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference},
  abstract     = {We consider the task of tuning hyperparameters in SVM models by minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be done very efficiently, often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.},
  pages        = {673--680},
  editor       = {Sch{\"o}lkopf, B. and Platt, J. and Hofmann, T.},
  publisher    = {MIT Press},
  organization = {Max-Planck-Gesellschaft},
  school       = {Biologische Kybernetik},
  address      = {Cambridge, MA, USA},
  month        = sep,
  year         = {2007},
  slug         = {5371},
  month_numeric = {9}
}