A 135-mW Fully Integrated Data Processor for Next-Generation Sequencing

Cited by: 17
Authors
Wu, Yi-Chung [1]
Chang, Chia-Hua [2]
Hung, Jui-Hung [3]
Yang, Chia-Hsiang [4, 5]
Affiliations
[1] Natl Taiwan Univ, Grad Inst Elect Engn, Taipei 10617, Taiwan
[2] Natl Chiao Tung Univ, Inst Biomed Engn, Hsinchu 300, Taiwan
[3] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu 300, Taiwan
[4] Natl Taiwan Univ, Dept Elect Engn, Taipei 10617, Taiwan
[5] Natl Chiao Tung Univ, Dept Elect Engn, Hsinchu 300, Taiwan
Keywords
Next-generation sequencing (NGS); DNA mapping; sBWT algorithm; FM-index; suffix array sorting; CMOS digital integrated circuits; short read alignment; VLSI; implementation; transform; tool
DOI
10.1109/TBCAS.2017.2760109
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Next-generation sequencing (NGS) enables high-throughput sequencing, in which short DNA fragments are sequenced in a massively parallel fashion. However, DNA mapping, the essential algorithm behind the subsequent NGS data analysis, remains excessively time-consuming. DNA mapping can be partitioned into two parts: suffix array (SA) sorting and backward searching. Dedicated hardware designs for the less complex backward searching have been proposed, but feasible hardware for the most complicated part, SA sorting, has never been explored. Based on the memory-efficient sBWT algorithm, this work presents the first integrated NGS data processor for the entire DNA mapping. The k-ordered Ferragina and Manzini index (FM-index) used in the sBWT algorithm is leveraged to improve storage capacity and reduce hardware complexity. The proposed NGS data processor realizes the sBWT algorithm through bucket sorting, suffix grouping, and suffix sorting circuits. Key design parameters are analyzed to achieve the optimal performance with respect to hardware cost and execution time. Fabricated in 40-nm CMOS, the NGS data processor dissipates 135 mW at 200 MHz from a 0.9-V supply. With 1-GB external memory, the chip can analyze human DNA within 10 min. Compared with the high-end CPU and GPU solutions, this work achieves 43,065x and 8,971x higher energy efficiency, and 3,208x and 402x higher throughput-to-area ratio, respectively.
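
The two-part split of DNA mapping named in the abstract (SA sorting, then backward searching) can be illustrated with a small textbook FM-index. The sketch below is only an illustration under stated assumptions: it uses a plain comparison sort and a full (not k-ordered) index, so it is neither the sBWT algorithm nor the chip's bucket-sorting, suffix-grouping, and suffix-sorting circuits. The function names `build_fm_index` and `backward_search` are hypothetical and do not come from the paper.

```python
def build_fm_index(text):
    """Build a naive FM-index for `text` (assumption: '$' does not occur in it)."""
    text += "$"  # unique sentinel, lexicographically smallest symbol
    # Part 1: suffix array (SA) sorting, here a plain comparison sort.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Burrows-Wheeler transform: the character preceding each sorted suffix.
    bwt = "".join(text[i - 1] for i in sa)
    alphabet = sorted(set(text))
    # c_table[c]: number of characters in text strictly smaller than c.
    c_table, total = {}, 0
    for c in alphabet:
        c_table[c] = total
        total += text.count(c)
    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if ch == c else 0)
    return sa, c_table, occ


def backward_search(pattern, sa, c_table, occ):
    """Part 2: backward searching. Returns sorted start positions of `pattern`."""
    lo, hi = 0, len(sa)  # half-open range of matching rows in the suffix array
    for ch in reversed(pattern):  # consume the read from its last character
        if ch not in c_table:
            return []
        lo = c_table[ch] + occ[ch][lo]
        hi = c_table[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    sa, c_table, occ = build_fm_index("ACGTACGTGA")
    print(backward_search("ACG", sa, c_table, occ))
```

For the toy reference `ACGTACGTGA`, the read `ACG` maps to positions 0 and 4. The sort in `build_fm_index` is the SA sorting step that the abstract identifies as the most complicated part, while each backward-search step needs only two table lookups per character.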
Pages: 1216-1225
Number of pages: 10