GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping

被引:66
作者
Alser, Mohammed [1 ]
Hassan, Hasan [2 ,3 ]
Xin, Hongyi [4 ]
Ergin, Oguz [2 ]
Mutlu, Onur [3 ]
Alkan, Can [1 ]
机构
[1] Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkey
[2] TOBB Univ Econ & Technol, Ankara, Turkey
[3] Swiss Fed Inst Technol, Dept Comp Sci, CH-8092 Zurich, Switzerland
[4] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
关键词
ACCURATE;
D O I
10.1093/bioinformatics/btx342
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments - called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. Results: We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10.
引用
收藏
页码:3355 / 3363
页数:9
相关论文
共 39 条
[1]   Hobbes: optimized gram-based methods for efficient read alignment [J].
Ahmadi, Athena ;
Behm, Alexander ;
Honnalli, Nagesh ;
Li, Chen ;
Weng, Lingjie ;
Xie, Xiaohui .
NUCLEIC ACIDS RESEARCH, 2012, 40 (06) :e41
[2]   Personalized copy number and segmental duplication maps using next-generation sequencing [J].
Alkan, Can ;
Kidd, Jeffrey M. ;
Marques-Bonet, Tomas ;
Aksay, Gozde ;
Antonacci, Francesca ;
Hormozdiari, Fereydoun ;
Kitzman, Jacob O. ;
Baker, Carl ;
Malig, Maika ;
Mutlu, Onur ;
Sahinalp, S. Cenk ;
Gibbs, Richard A. ;
Eichler, Evan E. .
NATURE GENETICS, 2009, 41 (10) :1061-U29
[3]   An integrated map of genetic variation from 1,092 human genomes [J].
Altshuler, David M. ;
Durbin, Richard M. ;
Abecasis, Goncalo R. ;
Bentley, David R. ;
Chakravarti, Aravinda ;
Clark, Andrew G. ;
Donnelly, Peter ;
Eichler, Evan E. ;
Flicek, Paul ;
Gabriel, Stacey B. ;
Gibbs, Richard A. ;
Green, Eric D. ;
Hurles, Matthew E. ;
Knoppers, Bartha M. ;
Korbel, Jan O. ;
Lander, Eric S. ;
Lee, Charles ;
Lehrach, Hans ;
Mardis, Elaine R. ;
Marth, Gabor T. ;
McVean, Gil A. ;
Nickerson, Deborah A. ;
Schmidt, Jeanette P. ;
Sherry, Stephen T. ;
Wang, Jun ;
Wilson, Richard K. ;
Gibbs, Richard A. ;
Dinh, Huyen ;
Kovar, Christie ;
Lee, Sandra ;
Lewis, Lora ;
Muzny, Donna ;
Reid, Jeff ;
Wang, Min ;
Wang, Jun ;
Fang, Xiaodong ;
Guo, Xiaosen ;
Jian, Min ;
Jiang, Hui ;
Jin, Xin ;
Li, Guoqing ;
Li, Jingxiang ;
Li, Yingrui ;
Li, Zhuo ;
Liu, Xiao ;
Lu, Yao ;
Ma, Xuedi ;
Su, Zhe ;
Tai, Shuaishuai ;
Tang, Meifang .
NATURE, 2012, 491 (7422) :56-65
[4]   A Review of Hardware Acceleration for Computational Genomics [J].
Aluru, Srinivas ;
Jammula, Nagakishore .
IEEE DESIGN & TEST, 2014, 31 (01) :19-30
[5]  
[Anonymous], INT C EMB COMP SYST
[6]   Reconfigurable Acceleration of Short Read Mapping [J].
Arram, James ;
Tsoi, Kuen Hung ;
Luk, Wayne ;
Jiang, Peiyong .
2013 IEEE 21ST ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2013, :210-217
[7]  
Canzar S, 2015, P IEEE, P1
[8]   BitMapper: an efficient all-mapper based on bit-vector computing [J].
Cheng, Haoyu ;
Jiang, Huaipan ;
Yang, Jiaoyun ;
Xu, Yun ;
Shang, Yi .
BMC BIOINFORMATICS, 2015, 16
[9]   SHRiMP2: Sensitive yet Practical Short Read Mapping [J].
David, Matei ;
Dzamba, Misko ;
Lister, Dan ;
Ilie, Lucian ;
Brudno, Michael .
BIOINFORMATICS, 2011, 27 (07) :1011-1012
[10]  
Hach F., 2014, NUCLEIC ACIDS RES