Comparison of De Novo Assembly Strategies for Bacterial Genomes

Cited by: 20
Authors
Zhang, Pengfei [1, 2]
Jiang, Dike [1, 2]
Wang, Yin [1, 2]
Yao, Xueping [1, 2]
Luo, Yan [1, 2]
Yang, Zexiao [1, 2]
Affiliations
[1] Sichuan Agr Univ, Key Lab Anim Dis & Human Hlth Sichuan Prov, Chengdu 611130, Peoples R China
[2] Agricultural Univ, Coll Vet Med, Chengdu 611130, Peoples R China
Keywords
long-read sequencing; genome assembly; protein prediction; NANOPORE; ANNOTATION;
DOI
10.3390/ijms22147668
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification codes
071010 ; 081704 ;
Abstract
(1) Background: Short-read sequencing allows rapid and accurate analysis of whole bacterial genomes but usually does not yield a complete genome assembly. Long-read sequencing greatly helps resolve complex bacterial genomes, particularly when combined with short-read Illumina data. However, it is unclear how different assembly strategies affect genomic accuracy, completeness, and protein prediction. (2) Methods: We compared different assembly strategies for Haemophilus parasuis, the cause of Glässer's disease in swine (characterized by fibrinous polyserositis and arthritis), using Illumina sequencing together with long reads from either the Oxford Nanopore Technologies (ONT) platform or the Pacific Biosciences (PacBio) SMRT platform. (3) Results: Assembly with either PacBio or ONT reads, followed by polishing with Illumina reads, produced high-quality genome reconstructions and was superior to the long-read-only and hybrid-assembly strategies in terms of accuracy and completeness. Correcting the ONT-only assembly with Homopolish performed equally well and had the advantage of avoiding additional Illumina sequencing. Furthermore, aligning transcripts to the assembled genomes and their predicted CDSs showed that the sequencing errors in the ONT assembly were mainly indels generated in homopolymer regions, which critically affected protein prediction. Polishing can fill these indels and correct other errors. (4) Conclusions: Bacterial genomes can be assembled directly from long-read sequencing data. To maximize assembly accuracy, it is essential to polish the assembly with homologous sequences from related genomes or with short-read sequencing data.
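The abstract's point that homopolymer indels critically affect protein prediction can be illustrated with a toy example. The sketch below is not from the paper: the sequences and the helper names `homopolymer_runs` and `translate` are hypothetical, and the only external fact assumed is the standard genetic code. It shows how a single-base deletion in a run of identical bases shifts the reading frame, so every downstream codon (and the stop codon) is misread.

```python
from itertools import groupby, product

# Standard genetic code, codons enumerated in T, C, A, G order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for each run of one base of at least min_len."""
    runs, pos = [], 0
    for base, group in groupby(seq):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


def translate(seq):
    """Translate from position 0 until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends the protein
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy gene containing a six-base poly-A run (illustration only).
reference = "ATGAAAAAAGGCTGCTAA"
# The same gene with one A missing from the run, mimicking a homopolymer deletion.
with_deletion = "ATGAAAAAGGCTGCTAA"

print(homopolymer_runs(reference))  # location of the poly-A run
print(translate(reference))         # protein from the intact sequence
print(translate(with_deletion))     # frameshifted protein after the deletion
```

In this toy case the two translations agree only up to the homopolymer run and diverge afterwards, and the shifted frame no longer reaches the original stop codon. This is the kind of error that polishing aims to remove before gene prediction.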
Pages: 16