Empirical Inference Conference Paper 2005

An Analysis of the Anti-Learning Phenomenon for the Class Symmetric Polyhedron

This paper deals with an unusual phenomenon where most machine learning algorithms yield good performance on the training set but systematically worse-than-random performance on the test set. This has so far been observed for some natural data sets and demonstrated for some synthetic data sets when the classification rule is learned from a small set of training samples drawn from a high-dimensional space. The initial analysis presented in this paper shows that anti-learning is a property of data sets and is quite distinct from overfitting of the training data. Moreover, the analysis leads to a specification of some machine learning procedures which can overcome anti-learning and generate machines able to classify training and test data consistently.

Author(s): Kowalczyk, A. and Chapelle, O.
Journal: Algorithmic Learning Theory: 16th International Conference
Pages: 78-92
Year: 2005
Month: October
Day: 0
Bibtex Type: Conference Paper (inproceedings)
Event Name: Algorithmic Learning Theory
Digital: 0
Electronic Archiving: grant_archive
Organization: Max-Planck-Gesellschaft
School: Biologische Kybernetik

BibTeX

@inproceedings{3604,
  title = {An Analysis of the Anti-Learning Phenomenon for the Class Symmetric Polyhedron},
  booktitle = {Algorithmic Learning Theory: 16th International Conference},
  abstract = {This paper deals with an unusual phenomenon where most machine learning algorithms yield good performance on the training set but systematically worse-than-random performance on the test set. This has so far been observed for some natural data sets and demonstrated for some synthetic data sets when the classification rule is learned from a small set of training samples drawn from a high-dimensional space. The initial analysis presented in this paper shows that anti-learning is a property of data sets and is quite distinct from overfitting of the training data. Moreover, the analysis leads to a specification of some machine learning procedures which can overcome anti-learning and generate machines able to classify training and test data consistently.},
  pages = {78-92},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  month = oct,
  year = {2005},
  slug = {3604},
  author = {Kowalczyk, A. and Chapelle, O.},
  month_numeric = {10}
}