Recent approaches to independent component analysis have used kernel independence measures to obtain very good performance in ICA, particularly in areas where classical methods experience difficulty (for instance, sources with near-zero kurtosis). In this chapter, we compare two efficient extensions of these methods for large-scale problems: random subsampling of entries in the Gram matrices used in defining the independence measures, and incomplete Cholesky decomposition of these matrices. We derive closed-form, efficiently computable approximations for the gradients of these measures, and compare their performance on ICA using both artificial and music data. We show that kernel ICA can scale up to much larger problems than yet attempted, and that incomplete Cholesky decomposition performs better than random sampling.

Author(s): Jegelka, S. and Gretton, A.
Book Title: Large Scale Kernel Machines
Pages: 225-250
Year: 2007
Month: September
Series: Neural Information Processing
Editors: Bottou, L., Chapelle, O., DeCoste, D., and Weston, J.
Publisher: MIT Press
BibTeX Type: Book Chapter (inbook)
Address: Cambridge, MA, USA
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@inbook{4192,
  title = {Brisk Kernel ICA},
  booktitle = {Large Scale Kernel Machines},
  abstract = {Recent approaches to independent component analysis have used kernel
  independence measures to obtain very good performance in ICA, particularly
  in areas where classical methods experience difficulty (for instance,
  sources with near-zero kurtosis). In this chapter, we compare two efficient
  extensions of these methods for large-scale problems: random subsampling
  of entries in the Gram matrices used in defining the independence
  measures, and incomplete Cholesky decomposition of these matrices.
  We derive closed-form, efficiently computable approximations for the
  gradients of these measures, and compare their performance on ICA using
  both artificial and music data. We show that kernel ICA can scale up to much larger
  problems than yet attempted, and that incomplete Cholesky decomposition
  performs better than random sampling.},
  pages = {225--250},
  series = {Neural Information Processing},
  editor = {Bottou, L. and Chapelle, O. and DeCoste, D. and Weston, J.},
  publisher = {MIT Press},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  address = {Cambridge, MA, USA},
  month = sep,
  year = {2007},
  slug = {4192},
  author = {Jegelka, S. and Gretton, A.},
  month_numeric = {9}
}