Comparison of De Novo Assembly Strategies for Bacterial Genomes

Cited by: 20
Authors
Zhang, Pengfei [1 ,2 ]
Jiang, Dike [1 ,2 ]
Wang, Yin [1 ,2 ]
Yao, Xueping [1 ,2 ]
Luo, Yan [1 ,2 ]
Yang, Zexiao [1 ,2 ]
Affiliations
[1] Sichuan Agr Univ, Key Lab Anim Dis & Human Hlth Sichuan Prov, Chengdu 611130, Peoples R China
[2] Agricultural Univ, Coll Vet Med, Chengdu 611130, Peoples R China
Keywords
long-read sequencing; genome assembly; protein prediction; NANOPORE; ANNOTATION;
DOI
10.3390/ijms22147668
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
(1) Background: Short-read sequencing allows rapid and accurate analysis of whole bacterial genomes but does not usually enable complete genome assembly. Long-read sequencing greatly assists with the resolution of complex bacterial genomes, particularly when combined with short-read Illumina data. However, it is not clear how different assembly strategies affect genomic accuracy, completeness, and protein prediction. (2) Methods: We compared different assembly strategies for Haemophilus parasuis, the causative agent of Glässer's disease in swine, which is characterized by fibrinous polyserositis and arthritis. The comparison used Illumina sequencing together with long reads from either the Oxford Nanopore Technologies (ONT) platform or the Pacific Biosciences (PacBio) SMRT platform. (3) Results: Assembly with either PacBio or ONT reads, followed by polishing with Illumina reads, facilitated high-quality genome reconstruction and was superior to the long-read-only assembly and hybrid-assembly strategies in terms of accuracy and completeness. An equally effective approach was correcting the ONT-only assembly with Homopolish, which had the advantage of avoiding additional sequencing with Illumina. Furthermore, aligning transcripts to the assembled genomes and their predicted CDSs showed that the sequencing errors in the ONT assembly were mainly indels generated in homopolymer regions, which critically affected protein prediction. Polishing can fill in these indels and correct the errors. (4) Conclusions: Bacterial genomes can be assembled directly from long-read sequencing data. To maximize assembly accuracy, it is essential to polish the assembly with homologous sequences from related genomes or with short-read sequencing data.
Pages: 16