GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping

被引:66
作者
Alser, Mohammed [1 ]
Hassan, Hasan [2 ,3 ]
Xin, Hongyi [4 ]
Ergin, Oguz [2 ]
Mutlu, Onur [3 ]
Alkan, Can [1 ]
机构
[1] Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkey
[2] TOBB Univ Econ & Technol, Ankara, Turkey
[3] Swiss Fed Inst Technol, Dept Comp Sci, CH-8092 Zurich, Switzerland
[4] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
关键词
ACCURATE;
D O I
10.1093/bioinformatics/btx342
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments - called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. Results: We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10.
引用
收藏
页码:3355 / 3363
页数:9
相关论文
共 39 条
[21]  
Li Ming, 2004, J Bioinform Comput Biol, V2, P417, DOI 10.1142/S0219720004000661
[22]  
Lindner M.S, 2016, BIOINFORMATICS
[23]   SOAP3: ultra-fast GPU-based parallel alignment tool for short reads [J].
Liu, Chi-Man ;
Wong, Thomas ;
Wu, Edward ;
Luo, Ruibang ;
Yiu, Siu-Ming ;
Li, Yingrui ;
Wang, Bingqiang ;
Yu, Chang ;
Chu, Xiaowen ;
Zhao, Kaiyong ;
Li, Ruiqiang ;
Lam, Tak-Wah .
BIOINFORMATICS, 2012, 28 (06) :878-879
[24]   SOAP3-dp: Fast, Accurate and Sensitive GPU-Based Short Read Aligner [J].
Luo, Ruibang ;
Wong, Thomas ;
Zhu, Jianqiao ;
Liu, Chi-Man ;
Zhu, Xiaoqian ;
Wu, Edward ;
Lee, Lap-Kei ;
Lin, Haoxiang ;
Zhu, Wenjuan ;
Cheung, David W. ;
Ting, Hing-Fung ;
Yiu, Siu-Ming ;
Peng, Shaoliang ;
Yu, Chang ;
Li, Yingrui ;
Li, Ruiqiang ;
Lam, Tak-Wah .
PLOS ONE, 2013, 8 (05)
[25]  
Marco-Sola S, 2012, NAT METHODS, V9, P1185, DOI [10.1038/NMETH.2221, 10.1038/nmeth.2221]
[26]   A GENERAL METHOD APPLICABLE TO SEARCH FOR SIMILARITIES IN AMINO ACID SEQUENCE OF 2 PROTEINS [J].
NEEDLEMAN, SB ;
WUNSCH, CD .
JOURNAL OF MOLECULAR BIOLOGY, 1970, 48 (03) :443-+
[27]   Hardware Acceleration of Short Read Mapping [J].
Olson, Corey B. ;
Kim, Maria ;
Clauson, Cooper ;
Kogon, Boris ;
Ebeling, Carl ;
Hauck, Scott ;
Ruzzo, Walter L. .
2012 IEEE 20TH ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2012, :161-168
[28]   Efficient q-gram filters for finding all ε-matches over a given length [J].
Rasmussen, KR ;
Stoye, J ;
Myers, EW .
JOURNAL OF COMPUTATIONAL BIOLOGY, 2006, 13 (02) :296-308
[29]   IDENTIFICATION OF COMMON MOLECULAR SUBSEQUENCES [J].
SMITH, TF ;
WATERMAN, MS .
JOURNAL OF MOLECULAR BIOLOGY, 1981, 147 (01) :195-197
[30]   Three Ages of FPGAs: A Retrospective on the First Thirty Years of FPGA Technology [J].
Trimberger, Stephen M. .
PROCEEDINGS OF THE IEEE, 2015, 103 (03) :318-331