GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping

被引:64
作者
Alser, Mohammed [1 ]
Hassan, Hasan [2 ,3 ]
Xin, Hongyi [4 ]
Ergin, Oguz [2 ]
Mutlu, Onur [3 ]
Alkan, Can [1 ]
机构
[1] Bilkent Univ, Dept Comp Engn, TR-06800 Ankara, Turkey
[2] TOBB Univ Econ & Technol, Ankara, Turkey
[3] Swiss Fed Inst Technol, Dept Comp Sci, CH-8092 Zurich, Switzerland
[4] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA
关键词
ACCURATE;
D O I
10.1093/bioinformatics/btx342
中图分类号
Q5 [生物化学];
学科分类号
071010 ; 081704 ;
摘要
Motivation: High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments - called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. Results: We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10.
引用
收藏
页码:3355 / 3363
页数:9
相关论文
共 39 条
  • [1] Hobbes: optimized gram-based methods for efficient read alignment
    Ahmadi, Athena
    Behm, Alexander
    Honnalli, Nagesh
    Li, Chen
    Weng, Lingjie
    Xie, Xiaohui
    [J]. NUCLEIC ACIDS RESEARCH, 2012, 40 (06) : e41
  • [2] Personalized copy number and segmental duplication maps using next-generation sequencing
    Alkan, Can
    Kidd, Jeffrey M.
    Marques-Bonet, Tomas
    Aksay, Gozde
    Antonacci, Francesca
    Hormozdiari, Fereydoun
    Kitzman, Jacob O.
    Baker, Carl
    Malig, Maika
    Mutlu, Onur
    Sahinalp, S. Cenk
    Gibbs, Richard A.
    Eichler, Evan E.
    [J]. NATURE GENETICS, 2009, 41 (10) : 1061 - U29
  • [3] An integrated map of genetic variation from 1,092 human genomes
    Altshuler, David M.
    Durbin, Richard M.
    Abecasis, Goncalo R.
    Bentley, David R.
    Chakravarti, Aravinda
    Clark, Andrew G.
    Donnelly, Peter
    Eichler, Evan E.
    Flicek, Paul
    Gabriel, Stacey B.
    Gibbs, Richard A.
    Green, Eric D.
    Hurles, Matthew E.
    Knoppers, Bartha M.
    Korbel, Jan O.
    Lander, Eric S.
    Lee, Charles
    Lehrach, Hans
    Mardis, Elaine R.
    Marth, Gabor T.
    McVean, Gil A.
    Nickerson, Deborah A.
    Schmidt, Jeanette P.
    Sherry, Stephen T.
    Wang, Jun
    Wilson, Richard K.
    Gibbs, Richard A.
    Dinh, Huyen
    Kovar, Christie
    Lee, Sandra
    Lewis, Lora
    Muzny, Donna
    Reid, Jeff
    Wang, Min
    Wang, Jun
    Fang, Xiaodong
    Guo, Xiaosen
    Jian, Min
    Jiang, Hui
    Jin, Xin
    Li, Guoqing
    Li, Jingxiang
    Li, Yingrui
    Li, Zhuo
    Liu, Xiao
    Lu, Yao
    Ma, Xuedi
    Su, Zhe
    Tai, Shuaishuai
    Tang, Meifang
    [J]. NATURE, 2012, 491 (7422) : 56 - 65
  • [4] A Review of Hardware Acceleration for Computational Genomics
    Aluru, Srinivas
    Jammula, Nagakishore
    [J]. IEEE DESIGN & TEST, 2014, 31 (01) : 19 - 30
  • [5] [Anonymous], INT C EMB COMP SYST
  • [6] Reconfigurable Acceleration of Short Read Mapping
    Arram, James
    Tsoi, Kuen Hung
    Luk, Wayne
    Jiang, Peiyong
    [J]. 2013 IEEE 21ST ANNUAL INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM), 2013, : 210 - 217
  • [7] Canzar S, 2015, P IEEE, P1
  • [8] BitMapper: an efficient all-mapper based on bit-vector computing
    Cheng, Haoyu
    Jiang, Huaipan
    Yang, Jiaoyun
    Xu, Yun
    Shang, Yi
    [J]. BMC BIOINFORMATICS, 2015, 16
  • [9] SHRiMP2: Sensitive yet Practical Short Read Mapping
    David, Matei
    Dzamba, Misko
    Lister, Dan
    Ilie, Lucian
    Brudno, Michael
    [J]. BIOINFORMATICS, 2011, 27 (07) : 1011 - 1012
  • [10] Hach F., 2014, NUCLEIC ACIDS RES