Two-locus association mapping in subquadratic time

Institute Homepage

Institute Homepage DE Sign In

Back

Empirical Inference Conference Paper 2011

Director

Genome-wide association studies (GWAS) have not been able to discover strong associations between many complex human diseases and single genetic loci. Mapping these phenotypes to pairs of genetic loci is hindered by the huge number of candidates leading to enormous computational and statistical problems. In GWAS on single nucleotide polymorphisms (SNPs), one has to consider in the order of 1010 to 1014 pairs, which is infeasible in practice. In this article, we give the first algorithm for 2-locus genome-wide association studies that is subquadratic in the number, n, of SNPs. The running time of our algorithm is data-dependent, but large experiments over real genomic data suggest that it scales empirically as n3/2. As a result, our algorithm can easily cope with n ~ 107, i.e., it can efficiently search all pairs of SNPs in the human genome.

Author(s):	Achlioptas, P. and Schölkopf, B. and Borgwardt, K.
Pages:	726-734
Year:	2011
Month:	August
Day:	0
Editors:	C Apté and J Ghosh and P Smyth
Publisher:	ACM Press

Bibtex Type:	Conference Paper (inproceedings)

Address:	New York, NY, USA
DOI:	10.1145/2020408.2020521
Event Name:	17th ACM SIGKKD Conference on Knowledge Discovery and Data Mining (KDD 2011)
Event Place:	San Diego, CA, USA

Digital:	0
Electronic Archiving:	grant_archive
ISBN:	978-1-4503-0813-7

Links:	Web

BibTex

@inproceedings{Borgwardt2011,
  title = {Two-locus association mapping in subquadratic time},
  abstract = {Genome-wide association studies (GWAS) have not been able to discover strong associations between many complex human diseases and single genetic loci. Mapping these phenotypes to pairs of genetic loci is hindered by the huge number of candidates leading to enormous computational and statistical problems. In GWAS on single nucleotide polymorphisms (SNPs), one has to consider in the order of 1010 to 1014 pairs, which is infeasible in practice. In this article, we give the first algorithm for 2-locus genome-wide association studies that is subquadratic in the number, n, of SNPs. The running time of our algorithm is data-dependent, but large experiments over real genomic data suggest that it scales empirically as n3/2. As a result, our algorithm can easily cope with n ~ 107, i.e., it can efficiently search all pairs of SNPs in the human genome.},
  pages = {726-734},
  editors = {C Apté and J Ghosh and P Smyth},
  publisher = {ACM Press},
  address = {New York, NY, USA},
  month = aug,
  year = {2011},
  slug = {borgwardt2011},
  author = {Achlioptas, P. and Sch{\"o}lkopf, B. and Borgwardt, K.},
  month_numeric = {8}
}

Research

Departments

Research Groups

People

Contact

Our Institute

Our History

Career

Doctoral Programs

Training

Service Units

Central Scientific Facilities

Workshops

Campus Services

Impact

Cooperation

Partners and Initiatives