Manifold Denoising as Preprocessing for Finding Natural Representations of Data

Empirical Inference Conference Paper 2007

A natural representation of data is given by the parameters that generated the data. If the parameter space is continuous, we can regard it as a manifold. In practice, we usually do not know this manifold; we only have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors, this data is usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising, which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we show that using the method as a preprocessing step can significantly improve the results of a semi-supervised learning algorithm.
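To make the idea of "a diffusion process on a graph generated by the data" concrete, here is a minimal sketch in Python. It is not taken from the paper: the symmetric k-NN graph, the Gaussian weights, the bandwidth heuristic, the implicit Euler step, and the names `knn_gaussian_weights`, `denoise`, `k`, `step`, and `n_iter` are all assumptions chosen for illustration. The toy data is a noisy circle in 2-D rather than a high-dimensional feature space.

```python
import numpy as np


def knn_gaussian_weights(X, k):
    """Symmetric k-NN graph with Gaussian edge weights (illustrative choice)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances, clipped at zero for round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    nn = np.argsort(d2, axis=1)[:, :k]  # k nearest neighbors per point
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), k), nn.ravel()] = True
    mask |= mask.T  # connect i and j if either is a k-NN of the other
    # Assumed bandwidth: mean squared distance to the k-th neighbor.
    h2 = np.mean(d2[np.arange(n), nn[:, -1]])
    return np.where(mask, np.exp(-d2 / h2), 0.0)


def denoise(X, k=10, step=0.5, n_iter=5):
    """Diffuse the points along a graph that is rebuilt from the data each step."""
    X = np.asarray(X, dtype=float).copy()
    n = X.shape[0]
    for _ in range(n_iter):
        W = knn_gaussian_weights(X, k)
        deg = W.sum(axis=1)
        laplacian = np.eye(n) - W / deg[:, None]  # random-walk Laplacian I - D^{-1} W
        # Implicit Euler step of the diffusion: (I + step * L) X_new = X_old.
        X = np.linalg.solve(np.eye(n) + step * laplacian, X)
    return X


def radial_spread(X):
    """Standard deviation of the distance to the origin (scatter around a circle)."""
    return float(np.std(np.linalg.norm(X, axis=1)))


# Toy data: a unit circle corrupted by isotropic Gaussian noise.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
noisy = np.column_stack([np.cos(theta), np.sin(theta)])
noisy += 0.1 * rng.standard_normal(noisy.shape)

denoised = denoise(noisy)
spread_before = radial_spread(noisy)
spread_after = radial_spread(denoised)
print(f"radial spread before: {spread_before:.3f}, after: {spread_after:.3f}")
```

Rebuilding the graph in every iteration lets the neighborhoods improve as the points move; the implicit step is used here because it stays stable for any positive `step`. Diffusion also shrinks curved structures slightly, so the number of iterations should be kept small.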

Author(s):	Hein, M. and Maier, M.
Book Title:	AAAI-07
Journal:	Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)
Pages:	1646-1649
Year:	2007
Month:	July
Day:	0
Publisher:	AAAI Press

Bibtex Type:	Conference Paper (inproceedings)

Address:	Menlo Park, CA, USA
Event Name:	Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)
Event Place:	Vancouver, BC, Canada

Digital:	0
Electronic Archiving:	grant_archive
Institution:	Association for the Advancement of Artificial Intelligence
Language:	en
Organization:	Max-Planck-Gesellschaft
School:	Biologische Kybernetik

BibTeX

@inproceedings{4588,
  title = {Manifold Denoising as Preprocessing for Finding Natural Representations of Data},
  journal = {Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)},
  booktitle = {AAAI-07},
  abstract = {A natural representation of data is given by the parameters that generated the data. If the parameter space is continuous, we can regard it as a manifold. In practice, we usually do not know this manifold; we only
  have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not
  change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors, this data is usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure.
  This paper reviews a method called Manifold Denoising, which projects
  the data onto the submanifold using a diffusion process on a graph generated by the data. We demonstrate
  that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we show that using
  the method as a preprocessing step can significantly improve the results of a semi-supervised learning algorithm.},
  pages = {1646--1649},
  publisher = {AAAI Press},
  organization = {Max-Planck-Gesellschaft},
  institution = {Association for the Advancement of Artificial Intelligence},
  school = {Biologische Kybernetik},
  address = {Menlo Park, CA, USA},
  month = jul,
  year = {2007},
  slug = {4588},
  author = {Hein, M. and Maier, M.},
  month_numeric = {7}
}