UltraStrain: An NGS-Based Ultra Sensitive Strain Typing Method for Salmonella enterica

Times cited: 4
Authors
Yang, Wenxian [1]
Huang, Lihong [2]
Shi, Chong [2]
Wang, Liansheng [2]
Yu, Rongshan [1,2]
Affiliations
[1] Xiamen Univ, Aginome XMU Joint Lab, Xiamen, Peoples R China
[2] Xiamen Univ, Sch Informat Sci & Engn, Xiamen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
metagenomes; next-generation sequencing (NGS); whole genome sequencing (WGS); Salmonella enterica; strain typing; GENE
DOI
10.3389/fgene.2019.00276
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
In the last few years, advances in next-generation sequencing (NGS) technology for whole genome sequencing (WGS) of foodborne pathogens have drastically improved food pathogen outbreak surveillance. WGS of foodborne pathogens enables identification of pathogens from food or environmental samples, including difficult-to-detect pathogens in culture-negative infections. Compared to traditional low-resolution methods such as pulsed-field gel electrophoresis (PFGE), WGS can differentiate even closely related strains of the same species, thus enabling rapid identification of the food sources associated with pathogen outbreak events so that a mitigation plan can be put in place quickly. In this paper, we present UltraStrain, a fast and ultra-sensitive pathogen detection and strain typing method for Salmonella enterica (S. enterica) based on WGS data analysis. In the proposed method, a noise filtering step is first performed, in which the raw sequencing data are mapped to a synthetic species-specific reference genome generated from S. enterica specific marker sequences, to avoid potential interference from closely related species in low-spike samples. After that, a statistical learning based method is used to identify the candidate strains, from a database of known S. enterica strains, that best explain the retained S. enterica specific reads. Finally, a refinement step is performed by mapping all the reads (before filtering) onto the identified top candidate strains and recalculating the probability of presence for each candidate strain. Experimental results using both synthetic and real sequencing data show that the proposed method is able to identify the correct S. enterica strains from low-spike samples, and outperforms several existing strain-typing methods in terms of sensitivity and accuracy.
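The candidate-identification step described in the abstract can be illustrated with a small sketch. The abstract does not name the statistical model, so the code below substitutes a generic expectation-maximization (EM) estimate of strain mixture proportions from read-to-strain alignment hits; this is an assumption made for illustration, not the authors' algorithm. The function name `estimate_strain_proportions` and the input layout (one set of strain indices per retained read) are hypothetical.

```python
def estimate_strain_proportions(hits, n_strains, n_iter=200, tol=1e-9):
    """Estimate strain proportions with a simple EM mixture model.

    hits[r] is the set of strain indices that read r aligns to.
    Illustrative assumption only: the abstract does not specify
    the statistical learning method actually used.
    """
    # Reads with no hit carry no information about strain identity.
    reads = [sorted(h) for h in hits if h]
    if not reads:
        return [0.0] * n_strains

    # Start from a uniform prior over all strains in the database.
    pi = [1.0 / n_strains] * n_strains
    for _ in range(n_iter):
        counts = [0.0] * n_strains
        for h in reads:
            total = sum(pi[s] for s in h)
            if total == 0.0:
                continue  # guard against numerical underflow
            # E-step: split the read among compatible strains by weight.
            for s in h:
                counts[s] += pi[s] / total
        # M-step: proportions are the normalized expected read counts.
        new_pi = [c / len(reads) for c in counts]
        delta = max(abs(a - b) for a, b in zip(pi, new_pi))
        pi = new_pi
        if delta < tol:
            break
    return pi


# Toy input: strain 0 has unique reads, strain 1 only shares reads
# with strain 0, and strain 2 is supported by a single unique read.
toy_hits = [{0}] * 6 + [{0, 1}] * 3 + [{2}]
proportions = estimate_strain_proportions(toy_hits, n_strains=3)
ranking = sorted(range(3), key=lambda s: proportions[s], reverse=True)
```

In this toy input the shared reads are progressively reassigned to strain 0, because only strain 0 is also supported by unique reads. A refinement pass in the spirit of the abstract would then re-run the same estimate on all unfiltered reads mapped to the top-ranked candidates.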
Pages: 11