Fast and accurate long-read assembly with wtdbg2

Cited by: 0
Authors
Jue Ruan
Heng Li
Affiliations
[1] Agricultural Genomics Institute
[2] Chinese Academy of Agriculture Sciences
[3] Peng Cheng Laboratory
[4] Department of Data Sciences, Dana-Farber Cancer Institute
[5] Department of Biomedical Informatics, Harvard Medical School
[6] Broad Institute
Source
Nature Methods | 2020 / Volume 17
Keywords
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Existing long-read assemblers require thousands of central processing unit hours to assemble a human genome and are being outpaced by sequencing technologies in terms of both throughput and cost. We developed a long-read assembler wtdbg2 (https://github.com/ruanjue/wtdbg2) that is 2–17 times as fast as published tools while achieving comparable contiguity and accuracy. It paves the way for population-scale long-read assembly in future.
Pages: 155-158
Number of pages: 3