Benchmarking hybrid assembly approaches for genomic analyses of bacterial pathogens using Illumina and Oxford Nanopore sequencing

Cited by: 53
Authors
Chen, Zhao [1, 2]
Erickson, David L. [1, 2]
Meng, Jianghong [1]
Affiliations
[1] Univ Maryland, Joint Inst Food Safety & Appl Nutr, Ctr Food Safety & Secur Syst, College Pk, MD 20742 USA
[2] Univ Maryland, Dept Nutr & Food Sci, College Pk, MD 20742 USA
Keywords
Illumina sequencing; Oxford Nanopore sequencing; Hybrid assembly; MaSuRCA; SPAdes; Unicycler; Bacterial pathogen; Genomic analyses; KLEBSIELLA-PNEUMONIAE; IDENTIFICATION; ANNOTATION
DOI
10.1186/s12864-020-07041-8
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: We benchmarked the hybrid assembly approaches of MaSuRCA, SPAdes, and Unicycler for bacterial pathogens using Illumina and Oxford Nanopore sequencing by determining genome completeness and accuracy, antimicrobial resistance (AMR), virulence potential, multilocus sequence typing (MLST), phylogeny, and pan genome. Ten bacterial species (10 strains) were tested with simulated reads of both mediocre and low quality, whereas 11 bacterial species (12 strains) were tested with real reads.
Results: Unicycler performed best at producing contiguous genomes, closely followed by MaSuRCA, while all SPAdes assemblies were incomplete. MaSuRCA was less tolerant of low-quality long reads than SPAdes and Unicycler. The hybrid assemblies of five antimicrobial-resistant strains with simulated reads gave AMR genotypes consistent with the reference genomes. The MaSuRCA assembly of Staphylococcus aureus with real reads contained msr(A) and tet(K), while the reference genome and the SPAdes and Unicycler assemblies harbored blaZ. The AMR genotypes of the reference genomes and hybrid assemblies were consistent for the other five antimicrobial-resistant strains with real reads. The numbers of virulence genes in all hybrid assemblies were similar to those of the reference genomes, irrespective of simulated or real reads. One exception was Pseudomonas aeruginosa: the reference genome and the hybrid assemblies with mediocre-quality long reads carried 241 virulence genes, whereas 184 virulence genes were identified in the hybrid assemblies with low-quality long reads. The MaSuRCA assemblies of Escherichia coli O157:H7 and Salmonella Typhimurium with mediocre-quality long reads contained 126 and 118 virulence genes, respectively, while 110 and 107 virulence genes were detected in their MaSuRCA assemblies with low-quality long reads, respectively. All approaches performed well in our MLST and phylogenetic analyses. The pan genomes of the hybrid assemblies of S. Typhimurium with mediocre-quality long reads were similar to that of the reference genome, while SPAdes and Unicycler were more tolerant of low-quality long reads than MaSuRCA in the pan-genome analysis. All approaches functioned well in the pan-genome analysis of Campylobacter jejuni with real reads.
Conclusions: Our research demonstrates the hybrid assembly pipeline of Unicycler as a superior approach for genomic analyses of bacterial pathogens using Illumina and Oxford Nanopore sequencing.
Pages: 21