A Maximum Entropy Approach to Semi-supervised Learning


Empirical Inference Poster 2010

The maximum entropy (MaxEnt) framework has been studied extensively in supervised learning. Here, the goal is to find a distribution p that maximizes an entropy function while enforcing data constraints, so that the expected values of some (pre-defined) features with respect to p approximately match their empirical counterparts. Using different entropy measures, different model spaces for p, and different approximation criteria for the data constraints yields a family of discriminative supervised learning methods (e.g., logistic regression, conditional random fields, least squares, and boosting). This framework is known as the generalized maximum entropy framework. Semi-supervised learning (SSL) has emerged in the last decade as a promising field that combines unlabeled data with labeled data to increase the accuracy and robustness of inference algorithms. However, most SSL algorithms to date involve trade-offs, e.g., in scalability or in applicability to multi-categorical data. We extend the generalized MaxEnt framework to develop a family of novel SSL algorithms. Extensive empirical evaluation on benchmark data sets that are widely used in the literature demonstrates the validity and competitiveness of the proposed algorithms.
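
As a toy illustration of the basic MaxEnt principle described above (not the authors' semi-supervised method), the following sketch finds the maximum Shannon entropy distribution over a finite set of outcomes subject to a single expectation constraint. It assumes an exact rather than approximate constraint, a single feature f(x) = x, and a hypothetical helper name `maxent_distribution`. Under these assumptions the solution has the Gibbs form p(x) ∝ exp(λx), and λ can be found by bisection because the resulting mean increases monotonically with λ.

```python
import math


def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum-entropy distribution over `values` whose mean equals `target_mean`.

    Illustrative assumption: one feature f(x) = x with an exact constraint,
    so the solution is p(x) proportional to exp(lam * x).
    """

    def gibbs(lam):
        # Subtract the largest exponent before exponentiating for stability.
        m = max(lam * v for v in values)
        weights = [math.exp(lam * v - m) for v in values]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(p * v for p, v in zip(gibbs(lam), values))

    # The mean under gibbs(lam) is increasing in lam, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return gibbs((lo + hi) / 2.0)


# A six-sided die whose average roll is constrained to 4.5 instead of 3.5.
outcomes = [1, 2, 3, 4, 5, 6]
p = maxent_distribution(outcomes, 4.5)
constrained_mean = sum(pi * x for pi, x in zip(p, outcomes))
```

With no constraint beyond normalization, the same principle would return the uniform distribution; the mean constraint tilts probability toward the larger outcomes. The generalized framework in the abstract relaxes the exact match to an approximate one and allows other entropy measures.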

Author(s):	Erkan, AN. and Altun, Y.
Journal:	30th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering (MaxEnt 2010)
Volume:	30
Pages:	80
Year:	2010
Month:	July

Bibtex Type:	Poster (poster)

Digital:	0
Electronic Archiving:	grant_archive
Language:	en
Organization:	Max-Planck-Gesellschaft
School:	Biologische Kybernetik

Links:	PDF

BibTex

@poster{6747,
  title = {A Maximum Entropy Approach to Semi-supervised Learning},
  journal = {30th International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering (MaxEnt 2010)},
  abstract = {Maximum entropy (MaxEnt) framework has been studied extensively in supervised
  learning. Here, the goal is to find a distribution p that maximizes an entropy function
  while enforcing data constraints so that the expected values of some (pre-defined) features
  with respect to p match their empirical counterparts approximately. Using different
  entropy measures, different model spaces for p and different approximation criteria
  for the data constraints yields a family of discriminative supervised learning methods
  (e.g., logistic regression, conditional random fields, least squares and boosting). This
  framework is known as the generalized maximum entropy framework.
  Semi-supervised learning (SSL) has emerged in the last decade as a promising field
  that combines unlabeled data along with labeled data so as to increase the accuracy and
  robustness of inference algorithms. However, most SSL algorithms to date have had
  trade-offs, e.g., in terms of scalability or applicability to multi-categorical data. We
  extend the generalized MaxEnt framework to develop a family of novel SSL algorithms.
  Extensive empirical evaluation on benchmark data sets that are widely used in
  the literature demonstrates the validity and competitiveness of the proposed algorithms.},
  volume = {30},
  pages = {80},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  month = jul,
  year = {2010},
  slug = {6747},
  author = {Erkan, AN. and Altun, Y.},
  month_numeric = {7}
}
