SWIFOLD: Smith-Waterman implementation on FPGA with OpenCL for long DNA sequences

Cited by: 36
Authors
Rucci, Enzo [1]
Garcia, Carlos [2]
Botella, Guillermo [2]
De Giusti, Armando [1]
Naiouf, Marcelo [3]
Prieto-Matias, Manuel [2]
Affiliations
[1] Univ Nacl La Plata, Fac Informat, III LIDI, CONICET, RA-1900 La Plata, Buenos Aires, Argentina
[2] Univ Complutense Madrid, Dept Arquitectura Comp & Automat, E-28040 Madrid, Spain
[3] Univ Nacl La Plata, Fac Informat, III LIDI, RA-1900 La Plata, Buenos Aires, Argentina
Keywords
DNA; Smith-Waterman; OpenCL; High-performance computing; FPGA
DOI
10.1186/s12918-018-0614-6
Chinese Library Classification number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Background: The Smith-Waterman (SW) algorithm is the best choice for searching similar regions between two DNA or protein sequences. However, it may become impractical in some contexts due to its high computational demands. Consequently, the computer science community has focused on the use of modern parallel architectures such as Graphics Processing Units (GPUs), Xeon Phi accelerators and Field Programmable Gate Arrays (FPGAs) to speed up large-scale workloads. Results: This paper presents and evaluates SWIFOLD: a Smith-Waterman parallel Implementation on FPGA with OpenCL for Long DNA sequences. First, we evaluate its performance and resource usage for different kernel configurations. Next, we carry out a performance comparison between our tool and other state-of-the-art implementations considering three different datasets. SWIFOLD offers the best average performance for small and medium test sets, achieving a performance that is independent of input size and sequence similarity. In addition, SWIFOLD provides competitive performance rates in comparison with GPU-based implementations on the latest GPU generation for the large dataset. Conclusions: The results suggest that SWIFOLD can be a serious contender for accelerating the SW alignment of DNA sequences of unrestricted size in an affordable way, reaching 125 GCUPS on average and a peak of almost 270 GCUPS.
Pages: 11