Protein alignment algorithms with an efficient backtracking routine on multiple GPUs

Cited by: 49
Authors
Blazewicz, Jacek [1, 2]
Frohmberg, Wojciech [1]
Kierzynka, Michal [1, 4]
Pesch, Erwin [3]
Wojciechowski, Pawel [1]
Affiliations
[1] Poznan Univ Tech, Poznan, Poland
[2] Inst Bioorgan Chem PAS, Poznan, Poland
[3] Univ Siegen, Siegen, Germany
[4] Poznan Supercomp & Networking Ctr, Poznan, Poland
Keywords
SMITH-WATERMAN ALGORITHM; SEQUENCE ALIGNMENT; ACCURATE;
DOI
10.1186/1471-2105-12-181
CLC number
Q5 [Biochemistry];
Subject classification code
071010 ; 081704 ;
Abstract
Background: Pairwise sequence alignment methods are widely used in biological research. The growing number of sequences is seen as one of the main challenges these methods will face in the near future. To meet this challenge, several GPU (Graphics Processing Unit) computing approaches have been proposed recently. These solutions show the great potential of the GPU platform, but in most cases they address only sequence database scanning and compute only the alignment score, omitting the alignment itself. Thus, the need arose to implement the global and semiglobal Needleman-Wunsch algorithms and the Smith-Waterman algorithm together with the backtracking procedure needed to construct the alignment.

Results: In this paper we present a solution that performs the alignment of every given sequence pair, which is a required step for progressive multiple sequence alignment methods, as well as for DNA recognition at the DNA assembly stage. Tests show that the implementation, with performance of up to 6.3 GCUPS (giga cell updates per second) on a single GPU for affine gap penalties, is very efficient in comparison to other CPU- and GPU-based solutions. Moreover, support for multiple GPUs with load balancing makes the application very scalable.

Conclusions: The article shows that the backtracking procedure of sequence alignment algorithms can be designed to fit the GPU architecture. Therefore, our algorithm is able to compute pairwise alignments in addition to scores. This opens a wide range of new possibilities, allowing other methods from molecular biology to take advantage of the new computational architecture. Tests show that the efficiency of the implementation is excellent. Moreover, the speed of our GPU-based algorithms increases almost linearly when more than one graphics card is used.
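The distinction the abstract draws between computing only an alignment score and recovering the alignment itself by backtracking can be illustrated with a minimal CPU sketch. This is not the authors' GPU implementation: it assumes a linear gap penalty (the paper reports results for affine gaps), a simple match/mismatch score instead of a protein substitution matrix, and illustrative parameter values. The names `smith_waterman` and `gcups` are hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Local alignment with a linear gap penalty (assumed scoring values).

    Returns (best_score, aligned_a, aligned_b).
    """
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j] holds the best local score of an alignment ending at a[i-1], b[j-1].
    H = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,      # gap in b
                          H[i][j - 1] + gap)      # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Backtracking: walk from the best cell back to a zero cell,
    # re-deriving which move produced each score.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


def gcups(cells_updated, seconds):
    """Giga cell updates per second: matrix cells filled per second, divided by 1e9."""
    return cells_updated / seconds / 1e9


if __name__ == "__main__":
    print(smith_waterman("GATTACA", "GCATGCA"))
```

A score-only scan needs just the previous row of `H`, whereas the backtracking loop requires the full matrix (or equivalent move information) to be kept, which is the extra cost the abstract refers to.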
Pages: 17