Genome sequencing of bacteria: sequencing, de novo assembly and rapid analysis using open source tools

Cited by: 31
Authors
Kisand, Veljo [1, 2]
Lettieri, Teresa [2]
Affiliations
[1] Univ Tartu, Inst Technol, EE-50411 Tartu, Estonia
[2] Commiss European Communities, Joint Res Ctr, Inst Environm & Sustainabil Rural, Water & Ecosyst Resources Unit, I-21027 Ispra, VA, Italy
Source
BMC Genomics | 2013 | Volume 14
Keywords
Reference mapping; De novo sequencing; De novo assembly; Automated annotation; Marine bacteria; Database; Annotation
DOI
10.1186/1471-2164-14-211
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: De novo genome sequencing of previously uncharacterized microorganisms has the potential to open up new frontiers in microbial genomics by providing insight into both functional capabilities and biodiversity. Until recently, Roche 454 pyrosequencing was the NGS method of choice for de novo assembly because it generates hundreds of thousands of long reads (<450 bp), which are presumed to aid in the analysis of uncharacterized genomes. The array of tools for processing NGS data is increasingly free and open source, and these tools are often adopted for both their high quality and their role in promoting academic freedom.

Results: The error rate of pyrosequencing the Alcanivorax borkumensis genome was such that thousands of insertions and deletions were artificially introduced into the finished genome. Despite high coverage (~30-fold), the data did not allow the reference genome to be fully mapped. Reads from regions with errors had low quality, low coverage, or were missing. The main defects of the reference mapping were the introduction of artificial indels into contigs through lower than 100% consensus, and the disruption of gene calling by artificial stop codons. No assembler was able to perform de novo assembly comparable to reference mapping. Automated annotation tools performed similarly on reference-mapped and de novo draft genomes, and annotated most CDSs in the de novo assembled draft genomes.

Conclusions: Free and open source software (FOSS) tools for assembly and annotation of NGS data are being developed rapidly to provide accurate results with less computational effort. Usability is not a high priority, and these tools currently do not allow the data to be processed without manual intervention. Despite this, genome assemblers now readily assemble medium short reads into long contigs (>97-98% genome coverage). A notable gap in pyrosequencing technology is the quality of base pair calling and conflicting base pairs between single reads at the same nucleotide position. Regardless, using draft whole genomes that are not finished and remain fragmented into tens of contigs allows one to characterize unknown bacteria with modest effort.
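The abstract quotes two different "coverage" figures: sequencing depth (~30-fold) and the share of the genome spanned by contigs (>97-98%). The following is a minimal sketch of how each quantity is commonly computed, not the authors' pipeline. The function names, the half-open interval convention, and all numbers in the usage lines are assumptions chosen for illustration.

```python
def mean_coverage_depth(read_lengths, genome_size):
    """Mean fold coverage: total sequenced bases divided by genome length."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


def genome_coverage_fraction(intervals, genome_size):
    """Fraction of reference positions covered by at least one contig.

    `intervals` holds (start, end) pairs in 0-based, half-open coordinates
    (an assumed convention). Overlapping intervals are merged before
    counting, so no position is counted twice.
    """
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap before this interval: close the previous merged block.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_size


# Toy numbers only (hypothetical, not taken from the paper).
depth = mean_coverage_depth([400] * 300, genome_size=4000)
fraction = genome_coverage_fraction([(0, 50), (40, 90), (95, 100)], genome_size=100)
print(f"depth = {depth:.1f}x, genome coverage = {fraction:.0%}")
```

The two figures are independent: a data set can have high mean depth and still leave reference positions uncovered, which is the situation the Results paragraph describes.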
Pages: 11