UPC++ for Bioinformatics: A Case Study Using Genome-Wide Association Studies

Cited by: 0
Authors
Kaessens, Jan C. [1]
Gonzalez-Dominguez, Jorge [2]
Wienbrandt, Lars [1]
Schmidt, Bertil [2]
Affiliations
[1] Christian Albrechts Univ Kiel, Dept Comp Sci, Kiel, Germany
[2] Johannes Gutenberg Univ Mainz, Parallel & Distributed Architectures Grp, Mainz, Germany
Source
2014 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER) | 2014
Funding
Wellcome Trust (UK);
Keywords
PGAS; UPC++; GWAS; Bioinformatics; EPISTATIC INTERACTION DETECTION; GENE-GENE INTERACTIONS; CANCER SUSCEPTIBILITY; SNP INTERACTIONS; PAIR; TOOL;
DOI
Not available
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Modern genotyping technologies can obtain up to a few million genetic markers (such as SNPs) of an individual within a few minutes. Detecting epistasis, such as SNP-SNP interactions, in Genome-Wide Association Studies is an important but time-consuming operation, since statistical computations have to be performed for each pair of measured markers. Therefore, a variety of HPC architectures have been used to accelerate these studies. In this work we present a parallel approach for multi-core clusters, which is implemented with UPC++ and takes advantage of the features available in the Partitioned Global Address Space and Object Oriented Programming models. Our solution is based on a well-known regression model (used by the popular BOOST tool) to test SNP-pair interactions. Experimental results show that UPC++ is suitable for parallelizing data-intensive bioinformatics applications on clusters. For instance, it reduces the time to analyze a real-world dataset with more than 500,000 SNPs and 5,000 individuals from several days on a single core to less than one minute on 512 nodes (12,288 cores) of a Cray XC30 supercomputer.
Pages: 248-256
Page count: 9