NextPolish: a fast and efficient genome polishing tool for long-read assembly

Cited by: 683
Authors
Hu, Jiang [1]
Fan, Junpeng [1]
Sun, Zongyi [1]
Liu, Shanlin [1]
Affiliations
[1] GrandOm Biosci, Beijing 102200, Peoples R China
Keywords
ALIGNMENT;
DOI
10.1093/bioinformatics/btz891
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Motivation: Although long-read sequencing technologies can produce genomes with long contiguity, they suffer from high error rates. Thus, we developed NextPolish, a tool that efficiently corrects sequence errors in genomes assembled with long reads. This new tool consists of two interlinked modules that are designed to score and count k-mers from high-quality short reads, and to polish genome assemblies containing large numbers of base errors. Results: When evaluated for speed and efficiency using a human genome and a plant (Arabidopsis thaliana) genome, NextPolish outperformed Pilon by correcting sequence errors faster and with higher correction accuracy.
Pages: 2253-2255
Number of pages: 3
Related papers
15 in total
  • [1] Chromosome-scale assemblies of plant genomes using nanopore long reads and optical maps
    Belser, Caroline
    Istace, Benjamin
    Denis, Erwan
    Dubarry, Marion
    Baurens, Franc-Christophe
    Falentin, Cyril
    Genete, Mathieu
    Berrabah, Wahiba
    Chevre, Anne-Marie
    Delourme, Regine
    Deniot, Gwenaelle
    Denoeud, France
    Duffe, Philippe
    Engelen, Stefan
    Lemainque, Arnaud
    Manzanares-Dauleux, Maria
    Martin, Guillaume
    Morice, Jerome
    Noel, Benjamin
    Vekemans, Xavier
    D'Hont, Angelique
    Rousseau-Gueutin, Mathieu
    Barbe, Valerie
    Cruaud, Corinne
    Wincker, Patrick
    Aury, Jean-Marc
    [J]. NATURE PLANTS, 2018, 4 (11) : 879 - +
  • [2] Chin CS, 2016, NAT METHODS, V13, P1050, DOI [10.1038/NMETH.4035, 10.1038/nmeth.4035]
  • [3] Garrison E., 2012, GENOMICS, DOI DOI 10.48550/ARXIV.1207.3907
  • [4] Nanopore sequencing and assembly of a human genome with ultra-long reads
    Jain, Miten
    Koren, Sergey
    Miga, Karen H.
    Quick, Josh
    Rand, Arthur C.
    Sasani, Thomas A.
    Tyson, John R.
    Beggs, Andrew D.
    Dilthey, Alexander T.
    Fiddes, Ian T.
    Malla, Sunir
    Marriott, Hannah
    Nieto, Tom
    O'Grady, Justin
    Olsen, Hugh E.
    Pedersen, Brent S.
    Rhie, Arang
    Richardson, Hollian
    Quinlan, Aaron R.
    Snutch, Terrance P.
    Tee, Louise
    Paten, Benedict
    Phillippy, Adam M.
    Simpson, Jared T.
    Loman, Nicholas J.
    Loose, Matthew
    [J]. NATURE BIOTECHNOLOGY, 2018, 36 (04) : 338 - +
  • [5] Minimap2: pairwise alignment for nucleotide sequences
    Li, Heng
    [J]. BIOINFORMATICS, 2018, 34 (18) : 3094 - 3100
  • [6] Li H, 2009, BIOINFORMATICS, V25, P1094, DOI [10.1093/bioinformatics/btp100, 10.1093/bioinformatics/btp324]
  • [7] Computational methods for optical mapping
    Mendelowitz, Lee
    Pop, Mihai
    [J]. GIGASCIENCE, 2014, 3
  • [8] High contiguity Arabidopsis thaliana genome assembly with a single nanopore flow cell
    Michael, Todd P.
    Jupe, Florian
    Bemm, Felix
    Motley, S. Timothy
    Sandoval, Justin P.
    Lanz, Christa
    Loudet, Olivier
    Weigel, Detlef
    Ecker, Joseph R.
    [J]. NATURE COMMUNICATIONS, 2018, 9
  • [9] HiC-Pro: an optimized and flexible pipeline for Hi-C data processing
    Servant, Nicolas
    Varoquaux, Nelle
    Lajoie, Bryan R.
    Viara, Eric
    Chen, Chong-Jian
    Vert, Jean-Philippe
    Heard, Edith
    Dekker, Job
    Barillot, Emmanuel
    [J]. GENOME BIOLOGY, 2015, 16
  • [10] Long-read sequencing and de novo assembly of a Chinese genome
    Shi, Lingling
    Guo, Yunfei
    Dong, Chengliang
    Huddleston, John
    Yang, Hui
    Han, Xiaolu
    Fu, Aisi
    Li, Quan
    Li, Na
    Gong, Siyi
    Lintner, Katherine E.
    Ding, Qiong
    Wang, Zou
    Hu, Jiang
    Wang, Depeng
    Wang, Feng
    Wang, Lin
    Lyon, Gholson J.
    Guan, Yongtao
    Shen, Yufeng
    Evgrafov, Oleg V.
    Knowles, James A.
    Thibaud-Nissen, Francoise
    Schneider, Valerie
    Yu, Chack-Yung
    Zhou, Libing
    Eichler, Evan E.
    So, Kwok-Fai
    Wang, Kai
    [J]. NATURE COMMUNICATIONS, 2016, 7