NextPolish: a fast and efficient genome polishing tool for long-read assembly

Cited by: 788
Authors
Hu, Jiang [1]
Fan, Junpeng [1]
Sun, Zongyi [1]
Liu, Shanlin [1]
Affiliations
[1] GrandOm Biosci, Beijing 102200, Peoples R China
Keywords
ALIGNMENT;
DOI
10.1093/bioinformatics/btz891
Chinese Library Classification code
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Motivation: Although long-read sequencing technologies can produce genome assemblies with high contiguity, they suffer from high error rates. We therefore developed NextPolish, a tool that efficiently corrects sequence errors in genomes assembled from long reads. The tool consists of two interlinked modules: one scores and counts k-mers from high-quality short reads, and the other polishes genome assemblies that contain large numbers of base errors. Results: When evaluated for speed and efficiency on a human genome and a plant (Arabidopsis thaliana) genome, NextPolish outperformed Pilon, correcting sequence errors faster and with higher accuracy.
Pages: 2253-2255
Page count: 3