Accelerating Long Read Alignment on Three Processors

Cited by: 23
Authors
Feng, Zonghao [1]
Qiu, Shuang [1]
Wang, Lipeng [1]
Luo, Qiong [1]
Affiliations
[1] HKUST, Kowloon, Hong Kong, Peoples R China
Source
PROCEEDINGS OF THE 48TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP 2019) | 2019
Keywords
sequence alignment; parallel processing; ACCURATE; GENOME;
DOI
10.1145/3337821.3337918
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Sequence alignment is a fundamental task in bioinformatics, because many downstream applications rely on it. The recent emergence of the third-generation sequencing technology requires new sequence alignment algorithms that handle longer read lengths as well as more sequencing errors. Furthermore, the rapidly increasing volume of sequence data calls for efficient analysis solutions. To address this need, we propose to utilize commodity parallel processors to perform the long read alignment. Specifically, we propose manymap, an acceleration of the leading CPU-based long read aligner minimap2 on the CPU, the GPU, and the Intel Xeon Phi processor. We eliminate intra-loop data dependency in the base-level alignment step of the original minimap2 through redesigning memory layouts of dynamic programming (DP) matrices. This change facilitates the effective vectorization of the most time-consuming procedure in alignment. Additionally, we apply architecture-aware optimizations, such as utilizing high bandwidth memory on Xeon Phi and concurrent kernel execution on GPU. We evaluate our manymap in comparison with the extended minimap2 on a Xeon Gold 5115 CPU, a Tesla V100 GPU, and a Xeon Phi 7210 processor. Our results show that manymap outperforms minimap2 by up to 2.3 times on the overall execution time and 4.5 times on the base-level alignment step.
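The abstract's point about removing data dependency inside the DP loop can be illustrated with a generic anti-diagonal layout. The sketch below is not manymap's or minimap2's actual kernel or scoring scheme: it uses plain unit-cost edit distance as a stand-in, and the function name `edit_distance_antidiagonal` is hypothetical. It only shows the general idea that, when cells are stored per anti-diagonal, every cell in the inner loop reads from the two previous diagonals and never from a cell computed in the same loop, which is the property that makes the loop a candidate for vectorization.

```python
def edit_distance_antidiagonal(a: str, b: str) -> int:
    """Unit-cost edit distance computed one anti-diagonal at a time.

    Illustrative stand-in only; not the scoring or layout used by manymap.
    """
    n, m = len(a), len(b)
    inf = n + m + 1  # larger than any reachable distance

    # Each buffer stores one anti-diagonal d = i + j, indexed by row i.
    prev2 = [inf] * (n + 1)  # diagonal d - 2
    prev1 = [inf] * (n + 1)  # diagonal d - 1

    for d in range(n + m + 1):
        cur = [inf] * (n + 1)
        lo = max(0, d - m)   # smallest valid row on this diagonal
        hi = min(n, d)       # largest valid row on this diagonal

        # Every iteration reads only prev1 and prev2, never cur, so the
        # iterations are independent of one another.
        for i in range(lo, hi + 1):
            j = d - i
            if i == 0:
                cur[i] = j                    # first row: j insertions
            elif j == 0:
                cur[i] = i                    # first column: i deletions
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                cur[i] = min(
                    prev1[i - 1] + 1,         # from cell (i-1, j)
                    prev1[i] + 1,             # from cell (i, j-1)
                    prev2[i - 1] + cost,      # from cell (i-1, j-1)
                )
        prev2, prev1 = prev1, cur

    return prev1[n]  # cell (n, m) lies on the last diagonal at row n
```

In a row-by-row layout the cell to the left is produced by the previous iteration of the same inner loop, so that loop carries a dependency; in the diagonal layout above it does not, at the cost of less regular indexing into the input sequences.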
Pages: 10
Related papers
25 in total
[1]  
ALTSCHUL SF, 1986, B MATH BIOL, V48, P603, DOI 10.1016/S0092-8240(86)90010-8
[2]  
[Anonymous], 2013, BMC BIOINFORMATICS
[3]  
[Anonymous], ALIGNING SEQUENCE RE, DOI 10.48550/ARXIV.1303.3997
[4]  
[Anonymous], 1994, TECHNICAL REPORT
[5]  
[Anonymous], 2016, Intel Xeon Phi Processor High Performance Programming, DOI 10.1016/B978-0-12-809194-4.00022-3
[6]   Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory [J].
Chaisson, Mark J. ;
Tesler, Glenn .
BMC BIOINFORMATICS, 2012, 13
[7]   Striped Smith-Waterman speeds database searches six times over other SIMD implementations [J].
Farrar, Michael .
BIOINFORMATICS, 2007, 23 (02) :156-161
[8]   Indexing compressed text [J].
Ferragina, P ;
Manzini, G .
JOURNAL OF THE ACM, 2005, 52 (04) :552-581
[9]   Nanopore sequencing and assembly of a human genome with ultra-long reads [J].
Jain, Miten ;
Koren, Sergey ;
Miga, Karen H. ;
Quick, Josh ;
Rand, Arthur C. ;
Sasani, Thomas A. ;
Tyson, John R. ;
Beggs, Andrew D. ;
Dilthey, Alexander T. ;
Fiddes, Ian T. ;
Malla, Sunir ;
Marriott, Hannah ;
Nieto, Tom ;
O'Grady, Justin ;
Olsen, Hugh E. ;
Pedersen, Brent S. ;
Rhie, Arang ;
Richardson, Hollian ;
Quinlan, Aaron R. ;
Snutch, Terrance P. ;
Tee, Louise ;
Paten, Benedict ;
Phillippy, Adam M. ;
Simpson, Jared T. ;
Loman, Nicholas J. ;
Loose, Matthew .
NATURE BIOTECHNOLOGY, 2018, 36 (04) :338-+
[10]   Minimap2: pairwise alignment for nucleotide sequences [J].
Li, Heng .
BIOINFORMATICS, 2018, 34 (18) :3094-3100