Empirical Inference Conference Paper 2007

An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models

We consider the task of tuning hyperparameters in SVM models by minimizing a smooth validation performance function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is the gradient of the validation function with respect to the hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be done very efficiently, often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.
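The idea in the abstract, treating hyperparameter selection as smooth optimization of a validation objective using its gradient, can be illustrated with a toy sketch. This is not the paper's efficient analytic-gradient method: it substitutes kernel ridge regression for the SVM (so the validation loss is smooth in closed form) and a finite-difference gradient in log-hyperparameter space, purely for illustration. All function names (`rbf_kernel`, `val_loss`, `tune`) are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between row sets A (n,d) and B (m,d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def val_loss(log_gamma, Xtr, ytr, Xva, yva, lam=1e-2):
    """Train kernel ridge regression, return smooth validation MSE."""
    gamma = np.exp(log_gamma)
    K = rbf_kernel(Xtr, Xtr, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(Xtr)), ytr)  # "training"
    pred = rbf_kernel(Xva, Xtr, gamma) @ alpha                # validation predictions
    return np.mean((pred - yva) ** 2)

def tune(Xtr, ytr, Xva, yva, log_gamma=0.0, lr=0.1, steps=30, eps=1e-4):
    """Gradient descent on the validation loss over log(gamma).

    Uses a central finite-difference gradient as a stand-in for the
    paper's exact, much cheaper analytic gradient. Returns the best
    hyperparameter value seen during the descent.
    """
    best_loss = val_loss(log_gamma, Xtr, ytr, Xva, yva)
    best = log_gamma
    for _ in range(steps):
        g = (val_loss(log_gamma + eps, Xtr, ytr, Xva, yva)
             - val_loss(log_gamma - eps, Xtr, ytr, Xva, yva)) / (2 * eps)
        log_gamma -= lr * g  # descend in log-space (keeps gamma positive)
        loss = val_loss(log_gamma, Xtr, ytr, Xva, yva)
        if loss < best_loss:
            best_loss, best = loss, log_gamma
    return best
```

Optimizing in log-space is a common choice for scale hyperparameters like the RBF width; the paper's contribution is making the per-step gradient nearly free relative to training, which the finite-difference version above deliberately does not capture.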

Author(s): Keerthi, S. S. and Sindhwani, V. and Chapelle, O.
Book Title: Advances in Neural Information Processing Systems 19
Journal: Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference
Pages: 673-680
Year: 2007
Month: September
Editors: Schölkopf, B., Platt, J., and Hofmann, T.
Publisher: MIT Press
Bibtex Type: Conference Paper (inproceedings)
Address: Cambridge, MA, USA
Event Name: Twentieth Annual Conference on Neural Information Processing Systems (NIPS 2006)
Event Place: Vancouver, BC, Canada
ISBN: 0-262-19568-2
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik
BibTeX

@inproceedings{5371,
  title = {An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models},
  journal = {Advances in Neural Information Processing Systems 19: Proceedings of the 2006 Conference},
  booktitle = {Advances in Neural Information Processing Systems 19},
  abstract = {We consider the task of tuning hyperparameters in SVM models by minimizing a smooth validation performance function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is the gradient of the validation function with respect to the hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be done very efficiently, often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.},
  pages = {673-680},
  editors = {Sch{\"o}lkopf, B. and Platt, J. and Hofmann, T.},
  publisher = {MIT Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Cambridge, MA, USA},
  month = sep,
  year = {2007},
  slug = {5371},
  author = {Keerthi, S. S. and Sindhwani, V. and Chapelle, O.},
  month_numeric = {9}
}