Positional Oligomer Importance Matrices


Empirical Inference Talk 2007

At the heart of many important bioinformatics problems, such as gene finding and function prediction, is the classification of biological sequences, above all DNA and proteins. In many cases, for instance for transcription start sites or splice sites, the most accurate classifiers are obtained by training SVMs with complex sequence kernels. However, a frequently criticized downside of SVMs with complex kernels is that it is very hard for humans to understand the learned decision rules and to derive biological insights from them. To close this gap, we introduce the concept of positional oligomer importance matrices (POIMs) and develop an efficient algorithm for their computation. We demonstrate how they overcome the limitations of sequence logos, and how they can be used to find relevant motifs for different biological phenomena in a straightforward way. Note that the concept of POIMs is not limited to interpreting SVMs, but is applicable to general k-mer based scoring systems.
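
The abstract does not spell out how a POIM is defined, so the following is only an illustrative sketch under stated assumptions. It assumes a scoring system that sums one weight per (k-mer, position) pair, and it assumes the importance of a k-mer z at position j is the expected score of a uniformly random sequence given that z occurs at j, minus the unconditional expected score. The function names, the toy weights, and the brute-force enumeration are all hypothetical; this is not the efficient algorithm mentioned in the talk.

```python
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA sequences


def score(seq, weights, k):
    """Toy k-mer scoring system: sum the weight of each (k-mer, position) pair."""
    return sum(weights.get((seq[i:i + k], i), 0.0)
               for i in range(len(seq) - k + 1))


def poim_brute_force(weights, k, length):
    """Assumed importance: E[score | k-mer z at position j] - E[score].

    Expectations are taken over all sequences of the given length with
    uniform probability. Enumeration is exponential in `length`, so this
    is only usable for tiny examples.
    """
    seqs = ["".join(p) for p in product(ALPHABET, repeat=length)]
    scores = {s: score(s, weights, k) for s in seqs}
    mean = sum(scores.values()) / len(seqs)
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    poim = {}
    for j in range(length - k + 1):
        for z in kmers:
            matching = [scores[s] for s in seqs if s[j:j + k] == z]
            poim[(z, j)] = sum(matching) / len(matching) - mean
    return poim


# Hypothetical weights: "GT" at position 1 is rewarded, "AG" at position 0 penalized.
weights = {("GT", 1): 2.0, ("AG", 0): -1.0}
poim = poim_brute_force(weights, k=2, length=4)
```

Under these assumptions, the importance of "GT" at position 1 comes out slightly lower than its raw weight would suggest, because fixing "GT" there also makes the penalized, overlapping "AG" at position 0 more likely. Accounting for such overlaps between neighbouring k-mers is the kind of effect that raw weights alone do not show.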

Author(s):	Sonnenburg, S. and Zien, A. and Philips, P. and Rätsch, G.
Year:	2007
Month:	December
Day:	0

Bibtex Type:	Talk (talk)

Digital:	0
Electronic Archiving:	grant_archive
Event Name:	NIPS 2007 Workshop on Machine Learning in Computational Biology
Event Place:	Whistler, BC, Canada
Language:	en
Organization:	Max-Planck-Gesellschaft
School:	Biologische Kybernetik

BibTex

@talk{5033,
  title = {Positional Oligomer Importance Matrices},
  abstract = {At the heart of many important bioinformatics problems, such as gene finding and function prediction, is the classification of biological sequences, above all of DNA and proteins. In many cases, the most accurate classifiers are obtained by training SVMs with complex sequence kernels, for instance for transcription starts or splice sites. However, an often criticized downside of SVMs with complex kernels is that it is very hard for humans to understand the learned decision rules and to derive biological insights from them. To close this gap, we introduce the concept of positional oligomer importance matrices (POIMs) and develop an efficient algorithm for their computation. We demonstrate how they overcome the limitations of sequence logos, and how they can be used to find relevant motifs for different biological phenomena in a straight-forward way. Note that the concept of POIMs is not limited to interpreting SVMs, but is applicable to general k-mer based scoring systems.},
  organization = {Max-Planck-Gesellschaft},
  school = {Biologische Kybernetik},
  month = dec,
  year = {2007},
  slug = {5033},
  author = {Sonnenburg, S. and Zien, A. and Philips, P. and R{\"a}tsch, G.},
  month_numeric = {12}
}
