Empirische Inferenz Conference Paper 2007

Manifold Denoising as Preprocessing for Finding Natural Representations of Data

A natural representation of data is given by the parameters that generated the data. If the parameter space is continuous, we can regard it as a manifold. In practice, we usually do not know this manifold; we only have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors, the data are usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising, which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we show that using the method as a preprocessing step can significantly improve the results of a semi-supervised learning algorithm.
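
The abstract only states that the data are denoised by a diffusion process on a graph built from the data. The following is a minimal sketch of one way such a process could look, not the authors' implementation. The function names, the symmetric k-nearest-neighbour graph, the locally scaled Gaussian weights, the random-walk Laplacian, the implicit Euler step, and the parameters `k`, `dt`, and `n_steps` are all assumptions made for illustration.

```python
import numpy as np


def knn_gaussian_weights(X, k):
    """Symmetric k-NN graph with Gaussian weights (assumed construction)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped at zero for stability.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest neighbours
    # Local scale (assumption): squared distance to the k-th neighbour.
    h2 = d2[np.arange(n), idx[:, -1]]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    scale = np.maximum(np.maximum(h2[rows], h2[cols]), 1e-12)
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / scale)
    # Keep an edge if either endpoint selected the other as a neighbour.
    return np.maximum(W, W.T)


def manifold_denoise(X, k=10, dt=0.5, n_steps=3):
    """Diffuse the points along a graph that is rebuilt after every step."""
    X = np.asarray(X, dtype=float).copy()
    n = X.shape[0]
    identity = np.eye(n)
    for _ in range(n_steps):
        W = knn_gaussian_weights(X, k)
        degree = W.sum(axis=1)
        # Random-walk graph Laplacian: L = I - D^{-1} W.
        L = identity - W / degree[:, None]
        # Implicit Euler step of dX/dt = -L X: solve (I + dt * L) X_new = X.
        X = np.linalg.solve(identity + dt * L, X)
    return X
```

In this sketch, calling `manifold_denoise(X)` on an array of shape `(n_samples, n_features)` returns an array of the same shape. Each step replaces every point by a weighted average of the current points, so the result stays inside the coordinate-wise range of the input. Too many steps or too large a `dt` would also shrink the manifold itself, so both would need tuning on real data.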

Author(s): Hein, M. and Maier, M.
Book Title: AAAI-07
Journal: Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)
Pages: 1646-1649
Year: 2007
Month: July
Publisher: AAAI Press
Bibtex Type: Conference Paper (inproceedings)
Address: Menlo Park, CA, USA
Event Name: Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)
Event Place: Vancouver, BC, Canada
Institution: Association for the Advancement of Artificial Intelligence
Language: en
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@inproceedings{4588,
  title = {Manifold Denoising as Preprocessing for Finding Natural Representations of Data},
  journal = {Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)},
  booktitle = {AAAI-07},
  abstract = {A natural representation of data is given by the parameters that generated the data. If the parameter space is continuous, we can regard it as a manifold. In practice, we usually do not know this manifold; we only have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors, the data are usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure.
  This paper reviews a method called Manifold Denoising, which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we show that using the method as a preprocessing step can significantly improve the results of a semi-supervised learning algorithm.},
  pages = {1646--1649},
  publisher = {AAAI Press},
  organization = {Max-Planck-Gesellschaft},
  institution = {Association for the Advancement of Artificial Intelligence},
  school = {Biologische Kybernetik},
  address = {Menlo Park, CA, USA},
  month = jul,
  year = {2007},
  slug = {4588},
  author = {Hein, M. and Maier, M.},
  month_numeric = {7}
}