Hybracter: enabling scalable, automated, complete and accurate bacterial genome assemblies

Cited by: 18
Authors
Bouras, George [1, 2, 3]
Houtak, Ghais [1, 2, 3]
Wick, Ryan R. [4]
Mallawaarachchi, Vijini [5]
Roach, Michael J. [5, 6, 7]
Papudeshi, Bhavya [5]
Judd, Louise M. [4]
Sheppard, Anna E. [8]
Edwards, Robert A. [5]
Vreugde, Sarah [1, 2, 3]
Affiliations
[1] Univ Adelaide, Fac Hlth & Med Sci, Adelaide Med Sch, Adelaide, Australia
[2] Univ Adelaide, Dept Surg Otolaryngol Head & Neck Surg, Adelaide, SA, Australia
[3] Cent Adelaide Local Hlth Network, Basil Hetzel Inst Translat Hlth Res, Adelaide, SA, Australia
[4] Univ Melbourne, Peter Doherty Inst Infect & Immun, Dept Microbiol & Immunol, Melbourne, Australia
[5] Flinders Univ S Australia, Coll Sci & Engn, Flinders Accelerator Microbiome Explorat, Adelaide, Australia
[6] Univ Adelaide, Adelaide Ctr Epigenet, Adelaide, SA, Australia
[7] Univ Adelaide, South Australian Immunogenom Canc Inst, Adelaide, Australia
[8] Univ Adelaide, Sch Biol Sci, Adelaide, Australia
Funding
Australian Research Council
Keywords
assembly; long-reads; plasmids; DE-BRUIJN GRAPHS; RESOURCE; READS
DOI
10.1099/mgen.0.001244
Chinese Library Classification (CLC) number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Improvements in the accuracy and availability of long-read sequencing mean that complete bacterial genomes are now routinely reconstructed using hybrid (i.e. short- and long-read) assembly approaches. Complete genomes allow a deeper understanding of bacterial evolution and genomic variation beyond single nucleotide variants. They are also crucial for identifying plasmids, which often carry medically significant antimicrobial resistance genes. However, small plasmids are often missed or misassembled by long-read assembly algorithms. Here, we present Hybracter, which allows for the fast, automatic and scalable recovery of near-perfect complete bacterial genomes using a long-read-first assembly approach. Hybracter can be run either as a hybrid assembler or as a long-read-only assembler. We compared Hybracter to existing automated hybrid and long-read-only assembly tools using a diverse panel of samples of varying levels of long-read accuracy with manually curated ground truth reference genomes. We demonstrate that Hybracter as a hybrid assembler is more accurate and faster than the existing gold standard automated hybrid assembler, Unicycler. We also show that Hybracter with long reads only is the most accurate long-read-only assembler and is comparable to hybrid methods in accurately recovering small plasmids.
Pages: 15