Manifold Denoising as Preprocessing for Finding Natural Representations of Data
A natural representation of data is given by the parameters which generated the data. If the parameter space is continuous, we can regard it as a manifold. In practice we usually do not know this manifold; we only have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors this data is usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising, which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We will demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we will show that using the method as a preprocessing step can significantly improve the results of a semi-supervised learning algorithm.
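The abstract names the ingredients (a graph generated by the data and a diffusion process on it) but not the details. The following is a minimal sketch of one way such a scheme can look, not the authors' implementation. Everything specific in it is an assumption made for illustration: the symmetric k-nearest-neighbour graph, the Gaussian weights with a local scale, the random-walk graph Laplacian, the implicit Euler time step, and the hypothetical names `denoise_step`, `manifold_denoise`, `k`, `dt`, and `n_steps`.

```python
import numpy as np


def denoise_step(X, k=10, dt=0.5):
    """One implicit-Euler diffusion step on a k-NN graph built from the rows of X.

    Assumed construction (not taken from the record above): Gaussian weights
    on a symmetric k-NN graph and the random-walk Laplacian L = I - D^{-1} W.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped at zero against round-off.
    sq = np.sum(X ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(D2, 0.0)

    # Sort neighbours; force each point to come first in its own row.
    D2_sort = D2.copy()
    np.fill_diagonal(D2_sort, -1.0)
    order = np.argsort(D2_sort, axis=1)
    nn = order[:, 1:k + 1]  # k nearest neighbours, excluding the point itself

    # Local scale: squared distance to the k-th nearest neighbour.
    h2 = D2[np.arange(n), nn[:, -1]]

    # Symmetric k-NN adjacency: i ~ j if either is among the other's neighbours.
    A = np.zeros((n, n), dtype=bool)
    A[np.repeat(np.arange(n), k), nn.ravel()] = True
    A = A | A.T

    # Gaussian weights using the larger of the two local scales.
    scale = np.maximum(h2[:, None], h2[None, :]) + 1e-12
    W = np.where(A, np.exp(-D2 / scale), 0.0)

    # Random-walk graph Laplacian.
    deg = W.sum(axis=1)
    L = np.eye(n) - W / deg[:, None]

    # Implicit Euler step of the diffusion: solve (I + dt * L) X_new = X.
    return np.linalg.solve(np.eye(n) + dt * L, X)


def manifold_denoise(X, k=10, dt=0.5, n_steps=5):
    """Repeat the diffusion step, rebuilding the graph from the current points."""
    X = np.asarray(X, dtype=float).copy()
    for _ in range(n_steps):
        X = denoise_step(X, k=k, dt=dt)
    return X
```

For example, `manifold_denoise(X, k=10, n_steps=5)` on an `(n, d)` array of noisy samples returns an array of the same shape whose rows are weighted averages of the input rows. The number of steps matters in this sketch: too many steps oversmooth and contract the data, so it would have to be chosen with care, for instance by validation on the downstream task.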
Author(s): | Hein, M. and Maier, M. |
Book Title: | AAAI-07 |
Journal: | Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07) |
Pages: | 1646-1649 |
Year: | 2007 |
Month: | July |
Publisher: | AAAI Press |
Bibtex Type: | Conference Paper (inproceedings) |
Address: | Menlo Park, CA, USA |
Event Name: | Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07) |
Event Place: | Vancouver, BC, Canada |
Institution: | Association for the Advancement of Artificial Intelligence |
Language: | en |
Organization: | Max-Planck-Gesellschaft |
School: | Biologische Kybernetik |
BibTeX
@inproceedings{4588,
  title = {Manifold Denoising as Preprocessing for Finding Natural Representations of Data},
  journal = {Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07)},
  booktitle = {AAAI-07},
  abstract = {A natural representation of data is given by the parameters which generated the data. If the parameter space is continuous, we can regard it as a manifold. In practice we usually do not know this manifold; we only have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors this data is usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising, which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We will demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we will show that using the method as a preprocessing step can significantly improve the results of a semi-supervised learning algorithm.},
  pages = {1646-1649},
  publisher = {AAAI Press},
  organization = {Max-Planck-Gesellschaft},
  institution = {Association for the Advancement of Artificial Intelligence},
  school = {Biologische Kybernetik},
  address = {Menlo Park, CA, USA},
  month = jul,
  year = {2007},
  slug = {4588},
  author = {Hein, M. and Maier, M.},
  month_numeric = {7}
}