Empirical Inference Article 2008

Information Consistency of Nonparametric Gaussian Process Methods

Abstract: Bayesian nonparametric models are widely and successfully used for statistical prediction. While posterior consistency properties are well studied in quite general settings, results have been proved using abstract concepts such as metric entropy, and they come with subtle conditions which are hard to validate and not intuitive when applied to concrete models. Furthermore, convergence rates are difficult to obtain. By focussing on the concept of information consistency for Bayesian Gaussian process (GP) models, consistency results and convergence rates are obtained via a regret bound on cumulative log loss. These results depend strongly on the covariance function of the prior process, thereby giving a novel interpretation to penalization with reproducing kernel Hilbert space norms and to commonly used covariance function classes and their parameters. The proof of the main result employs elementary convexity arguments only. A theorem of Widom is used in order to obtain precise convergence rates for several covariance functions widely used in practice.

Author(s): Seeger, M. W. and Kakade, S. M. and Foster, D. P.
Journal: IEEE Transactions on Information Theory
Volume: 54
Number (issue): 5
Pages: 2376-2382
Year: 2008
Month: May
Bibtex Type: Article (article)
DOI: 10.1109/TIT.2007.915707
Electronic Archiving: grant_archive
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@article{4170,
  title = {Information Consistency of Nonparametric Gaussian Process Methods},
  journal = {IEEE Transactions on Information Theory},
  abstract = {Bayesian nonparametric models are widely and successfully
  used for statistical prediction. While posterior consistency properties are
  well studied in quite general settings, results have been proved using abstract
  concepts such as metric entropy, and they come with subtle conditions
  which are hard to validate and not intuitive when applied to concrete
  models. Furthermore, convergence rates are difficult to obtain.
  By focussing on the concept of information consistency for Bayesian
  Gaussian process (GP) models, consistency results and convergence rates
  are obtained via a regret bound on cumulative log loss. These results
  depend strongly on the covariance function of the prior process, thereby
  giving a novel interpretation to penalization with reproducing kernel
  Hilbert space norms and to commonly used covariance function classes
  and their parameters. The proof of the main result employs elementary
  convexity arguments only. A theorem of Widom is used in order to obtain
  precise convergence rates for several covariance functions widely used in
  practice.},
  volume = {54},
  number = {5},
  pages = {2376--2382},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  month = may,
  year = {2008},
  slug = {4170},
  author = {Seeger, M. W. and Kakade, S. M. and Foster, D. P.},
  month_numeric = {5}
}