Back
Sparse Gaussian Processes: inference, subspace identification and model selection
Gaussian Process (GP) inference is a probabilistic kernel method where the GP is treated as a latent function. The inference is carried out using the Bayesian online learning and its extension to the more general iterative approach which we call TAP/EP learning. Sparsity is introduced in this context to make the TAP/EP method applicable to large datasets. We address the prohibitive scaling of the number of parameters by defining a subset of the training data that is used as the support the GP, thus the number of required parameters is independent of the training set, similar to the case of ``Support--‘‘ or ``Relevance--Vectors‘‘. An advantage of the full probabilistic treatment is that allows the computation of the marginal data likelihood or evidence, leading to hyper-parameter estimation within the GP inference. An EM algorithm to choose the hyper-parameters is proposed. The TAP/EP learning is the E-step and the M-step then updates the hyper-parameters. Due to the sparse E-step the resulting algorithm does not involve manipulation of large matrices. The presented algorithm is applicable to a wide variety of likelihood functions. We present results of applying the algorithm on classification and nonstandard regression problems for artificial and real datasets.
@inproceedings{2610, title = {Sparse Gaussian Processes: inference, subspace identification and model selection}, journal = {Proceedings}, abstract = {Gaussian Process (GP) inference is a probabilistic kernel method where the GP is treated as a latent function. The inference is carried out using the Bayesian online learning and its extension to the more general iterative approach which we call TAP/EP learning. Sparsity is introduced in this context to make the TAP/EP method applicable to large datasets. We address the prohibitive scaling of the number of parameters by defining a subset of the training data that is used as the support the GP, thus the number of required parameters is independent of the training set, similar to the case of ``Support--‘‘ or ``Relevance--Vectors‘‘. An advantage of the full probabilistic treatment is that allows the computation of the marginal data likelihood or evidence, leading to hyper-parameter estimation within the GP inference. An EM algorithm to choose the hyper-parameters is proposed. The TAP/EP learning is the E-step and the M-step then updates the hyper-parameters. Due to the sparse E-step the resulting algorithm does not involve manipulation of large matrices. The presented algorithm is applicable to a wide variety of likelihood functions. We present results of applying the algorithm on classification and nonstandard regression problems for artificial and real datasets.}, pages = {1-6}, editors = {Van der Hof, , Wahlberg}, organization = {Max-Planck-Gesellschaft}, institution = {MPI for Biological Cybernetics, Tuebingen}, school = {Biologische Kybernetik}, address = {The Netherlands}, month = aug, year = {2003}, note = {electronical version; Index ThA02-2}, slug = {2610}, author = {Csato, L. and Opper, M.}, month_numeric = {8} }