Comparison of De Novo Assembly Strategies for Bacterial Genomes

Cited by: 20
Authors
Zhang, Pengfei [1 ,2 ]
Jiang, Dike [1 ,2 ]
Wang, Yin [1 ,2 ]
Yao, Xueping [1 ,2 ]
Luo, Yan [1 ,2 ]
Yang, Zexiao [1 ,2 ]
Affiliations
[1] Sichuan Agr Univ, Key Lab Anim Dis & Human Hlth Sichuan Prov, Chengdu 611130, Peoples R China
[2] Agricultural Univ, Coll Vet Med, Chengdu 611130, Peoples R China
Keywords
long-read sequencing; genome assembly; protein prediction; NANOPORE; ANNOTATION
DOI
10.3390/ijms22147668
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
(1) Background: Short-read sequencing allows rapid and accurate analysis of the whole bacterial genome but does not usually enable complete genome assembly. Long-read sequencing greatly assists with the resolution of complex bacterial genomes, particularly when combined with short-read Illumina data. However, it is not clear how different assembly strategies affect genomic accuracy, completeness, and protein prediction. (2) Methods: We compared different assembly strategies for Haemophilus parasuis, which causes Glässer's disease in swine (characterized by fibrinous polyserositis and arthritis), using Illumina sequencing and long reads from either the Oxford Nanopore Technologies (ONT) or the Pacific Biosciences (PacBio) SMRT sequencing platform. (3) Results: Assembly with either PacBio or ONT reads, followed by polishing with Illumina reads, facilitated high-quality genome reconstruction and was superior to the long-read-only assembly and hybrid-assembly strategies when evaluated in terms of accuracy and completeness. An equally excellent method was correction with Homopolish after the ONT-only assembly, which had the advantage of avoiding hybrid sequencing with Illumina. Furthermore, aligning transcripts to the assembled genomes and their predicted CDSs showed that the sequencing errors of the ONT assembly were mainly indels generated when homopolymer regions were sequenced, which critically affected protein prediction. Polishing can fill indels and correct mistakes. (4) Conclusions: The assembly of bacterial genomes can be directly achieved by using long-read sequencing techniques. To maximize assembly accuracy, it is essential to polish the assembly with homologous sequences of related genomes or sequencing data from short-read technology.
Pages: 16
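
The abstract's point that homopolymer indels critically affect protein prediction can be illustrated with a toy example. The following is a minimal sketch, not the authors' pipeline: the 18-base sequence and the `translate` helper are hypothetical and exist only to show how a single-base deletion in a homopolymer run shifts the reading frame of a predicted CDS.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
# (The bacterial table 11 assigns the same amino acids; it differs
# only in which codons may act as start codons.)
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a CDS in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy CDS containing a six-base homopolymer run (AAAAAA).
reference = "ATG" + "AAAAAA" + "GCTGGTTAA"

# Simulate a typical homopolymer error: one A is dropped from the run.
with_deletion = reference.replace("AAAAAA", "AAAAA", 1)

ref_protein = translate(reference)
err_protein = translate(with_deletion)

print(ref_protein)  # MKKAG
print(err_protein)  # MKKLV (frameshift: downstream residues change, stop codon lost)
```

In this toy case the residues downstream of the deletion change and the original stop codon is no longer read in frame, which is the kind of damage that polishing is meant to repair before gene prediction.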