Large-Scale Pairwise Sequence Alignments on a Large-Scale GPU Cluster

Cited by: 2
Authors
Savran, Ibrahim [1]
Gao, Yang [1]
Bakos, Jason D. [1]
Affiliations
[1] Univ S Carolina, Columbia, SC 29208 USA
Funding
US National Science Foundation;
Keywords
COMMUNITIES; DIVERSITY; SEARCH; NOISE; RNA;
DOI
10.1109/MDAT.2013.2290116
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
This article presents the design of a graphics processing unit (GPU) kernel for performing pairwise sequence alignments on large-scale short-sequence datasets generated by next-generation sequencers. The kernel performs batch Needleman-Wunsch (NW) global alignments and, for each alignment, returns the alignment score divided by the total alignment length. The kernel scales when used with its MPI-based host software and achieves high-throughput alignment when run on a CPU-GPU cluster. The host software includes a load-balancing technique for datasets whose sequences have nonuniform lengths. GPUs exploit data-level parallelism by interleaving instructions across a large set of active threads, while CPUs exploit instruction-level parallelism by maximizing the number of in-flight instructions from each thread.
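As an illustration of the quantity the abstract describes (a global alignment score divided by the total alignment length), the following is a minimal CPU-side sketch in Python. It is not the authors' GPU kernel: the function name `nw_normalized_score`, the linear gap penalty, the scoring values, and the tie-breaking order are all assumptions chosen for the example.

```python
def nw_normalized_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment of strings a and b.

    Returns (score, alignment_length, score / alignment_length).
    Scoring values and the linear gap model are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # score[i][j]: best score for aligning a[:i] with b[:j]
    # length[i][j]: number of alignment columns on the chosen best path
    score = [[0] * (m + 1) for _ in range(n + 1)]
    length = [[0] * (m + 1) for _ in range(n + 1)]

    # Border cells: a prefix aligned entirely against gaps.
    for i in range(1, n + 1):
        score[i][0] = i * gap
        length[i][0] = i
    for j in range(1, m + 1):
        score[0][j] = j * gap
        length[0][j] = j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            diag = score[i - 1][j - 1] + sub
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            best = max(diag, up, left)
            score[i][j] = best
            # Ties are broken in favor of the diagonal move (an assumption).
            if best == diag:
                length[i][j] = length[i - 1][j - 1] + 1
            elif best == up:
                length[i][j] = length[i - 1][j] + 1
            else:
                length[i][j] = length[i][j - 1] + 1

    total, cols = score[n][m], length[n][m]
    return total, cols, (total / cols if cols else 0.0)
```

Tracking the alignment length alongside the score in the same dynamic-programming pass avoids a separate traceback, which is one plausible way to obtain the normalized score when only that value is needed per pair.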
Pages: 51-61
Page count: 11