
Covariate shift and local learning by distribution matching

Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this goal by matching covariate distributions between training and test sets in a high-dimensional feature space (specifically, a reproducing kernel Hilbert space). This approach does not require distribution estimation. Instead, the sample weights are obtained by a simple quadratic programming procedure. We provide a uniform convergence bound on the distance between the reweighted training feature mean and the test feature mean, a transductive bound on the expected loss of an algorithm trained on the reweighted data, and a connection to single-class SVMs. While our method is designed to deal with the case of simple covariate shift (in the sense of Chapter ??), we have also found benefits for sample selection bias on the labels. Our correction procedure yields its greatest and most consistent advantages when the learning algorithm returns a classifier/regressor that is "simpler" than the data might suggest.
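The reweighting scheme described in the abstract is commonly known as kernel mean matching (KMM): the weights minimize the RKHS distance between the weighted training feature mean and the test feature mean, subject to box and normalization constraints, which reduces to a quadratic program. The following is a minimal illustrative sketch, not the authors' code; the Gaussian kernel, the cvxpy solver, the function name kernel_mean_matching, and the defaults for sigma, B, and eps are all assumptions made for demonstration.

    import numpy as np
    import cvxpy as cp

    def kernel_mean_matching(X_tr, X_te, sigma=1.0, B=10.0, eps=None):
        # Hypothetical KMM sketch: solve
        #   min_beta  0.5 * beta' K beta - kappa' beta
        #   s.t.      0 <= beta_i <= B,  |sum(beta) - n_tr| <= eps * n_tr,
        # the expanded form of || (1/n_tr) sum_i beta_i phi(x_i^tr)
        #                        - (1/n_te) sum_j phi(x_j^te) ||^2 in the RKHS.
        # X_tr, X_te are assumed to be 2-D arrays of shape (n, d).
        n_tr, n_te = len(X_tr), len(X_te)
        if eps is None:
            eps = B / np.sqrt(n_tr)  # illustrative slack on the weight mean

        def gauss(A, C):
            # Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))
            sq = ((A[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            return np.exp(-sq / (2.0 * sigma ** 2))

        K = gauss(X_tr, X_tr)                      # n_tr x n_tr Gram matrix
        K = 0.5 * (K + K.T) + 1e-8 * np.eye(n_tr)  # symmetrize + jitter so the QP sees a PSD matrix
        kappa = (n_tr / n_te) * gauss(X_tr, X_te).sum(axis=1)

        beta = cp.Variable(n_tr)
        objective = cp.Minimize(0.5 * cp.quad_form(beta, K) - kappa @ beta)
        constraints = [beta >= 0, beta <= B,
                       cp.abs(cp.sum(beta) - n_tr) <= eps * n_tr]
        cp.Problem(objective, constraints).solve()
        return beta.value  # importance weights for the n_tr training points

The resulting weights would then be passed to the downstream learner as per-sample weights (e.g., the sample_weight argument of most scikit-learn estimators), so that training on the reweighted sample mimics training on data drawn from the test covariate distribution.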

Author(s): Gretton, A., Smola, A. J., Huang, J., Schmittfull, M., Borgwardt, K. M. and Schölkopf, B.
Book Title: Dataset Shift in Machine Learning
Pages: 131-160
Year: 2009
Editors: Quiñonero-Candela, J., Sugiyama, M., Schwaighofer, A. and Lawrence, N. D.
Publisher: MIT Press
Bibtex Type: Book Chapter (inbook)
Address: Cambridge, MA, USA
ISBN: 978-0-262-17005-5
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik
BibTeX

@inbook{5376,
  title = {Covariate shift and local learning by distribution matching},
  booktitle = {Dataset Shift in Machine Learning},
  abstract = {Given sets of observations of training and test data, we consider the problem of re-weighting the training data such that its distribution more closely matches that of the test data. We achieve this goal by matching covariate distributions between training and test sets in a high-dimensional feature space (specifically, a reproducing kernel Hilbert space). This approach does not require distribution estimation. Instead, the sample weights are obtained by a simple quadratic programming procedure. We provide a uniform convergence bound on the distance between the reweighted training feature mean and the test feature mean, a transductive bound on the expected loss of an algorithm trained on the reweighted data, and a connection to single-class SVMs. While our method is designed to deal with the case of simple covariate shift (in the sense of Chapter ??), we have also found benefits for sample selection bias on the labels. Our correction procedure yields its greatest and most consistent advantages when the learning algorithm returns a classifier/regressor that is ``simpler'' than the data might suggest.},
  pages = {131--160},
  editor = {Qui{\~n}onero-Candela, J. and Sugiyama, M. and Schwaighofer, A. and Lawrence, N. D.},
  publisher = {MIT Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Cambridge, MA, USA},
  isbn = {978-0-262-17005-5},
  year = {2009},
  slug = {5376},
  author = {Gretton, A. and Smola, A. J. and Huang, J. and Schmittfull, M. and Borgwardt, K. M. and Sch{\"o}lkopf, B.}
}