Adapting the GACT-X Aligner to Accelerate Minimap2 in an FPGA Cloud Instance

Cited by: 5
Authors
Teng, Carolina [1]
Achjian, Renan Weege [2]
Wang, Jiang Chau [1]
Fonseca, Fernando Josepetti [1]
Affiliations
[1] Univ Sao Paulo, Sch Engn, Dept Elect Syst Engn, BR-05508010 Sao Paulo, Brazil
[2] Univ Sao Paulo, Inst Biomed Sci, Dept Parasitol, BR-05508000 Sao Paulo, Brazil
Source
APPLIED SCIENCES-BASEL | 2023 / Vol. 13 / Issue 07
Keywords
field programmable gate arrays; cloud computing; FPGA cloud; Minimap2; Smith-Waterman-Gotoh; coprocessors; acceleration; genomics; bioinformatics; hybrid systems; ALIGNMENT
DOI
10.3390/app13074385
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
In genomic analysis, long reads are an emerging type of data processed by assembly algorithms to recover the complete genome sample. They are, on average, one or two orders of magnitude longer than short reads from the previous generation, which provides important advantages in information quality. However, longer sequences bring new challenges to computer processing, undermining the performance of assembly algorithms developed for short reads. This issue is amplified by the exponential growth of genetic data generation and by the slowdown of transistor technology progress, illustrated by Moore's Law. Minimap2 is the current state-of-the-art long-read assembler and takes dozens of CPU hours to assemble a human genome with clinical standard coverage. One of its bottlenecks, the alignment stage, has not been successfully accelerated on FPGAs in the literature. GACT-X is an alignment algorithm developed for FPGA implementation, suitable for input sequences of any size. In this work, GACT-X was adapted to work as the aligner of Minimap2, and the two were integrated and implemented on an FPGA cloud platform. Accuracy and speed-up measurements are presented for three different datasets under different combinations of numbers of kernels and threads. The integrated solution's performance limitations due to data transfer are also analyzed and discussed.
Pages: 22