imGLAD: accurate detection and quantification of target organisms in metagenomes

Cited by: 25
Authors
Castro, Juan C. [1, 2]
Rodriguez-R, Luis M. [1, 3]
Harvey, William T. [2]
Weigand, Michael R. [3, 4]
Hatt, Janet K. [3]
Carter, Michelle Q. [5]
Konstantinidis, Konstantinos T. [1, 2, 3]
Affiliations
[1] Georgia Inst Technol, Ctr Bioinformat & Computat Genom, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Sch Biol Sci, Atlanta, GA 30332 USA
[3] Georgia Inst Technol, Sch Civil & Environm Engn, Atlanta, GA 30332 USA
[4] Ctr Dis Control & Prevent, Div Bacterial Dis, Atlanta, GA USA
[5] USDA ARS, Western Reg Res Ctr, Produce Safety & Microbiol, USDA, 800 Buchanan St, Albany, CA 94710 USA
Funding
US National Science Foundation;
Keywords
Genomes; Metagenomics; Limit of detection; GENERATION; ALIGNMENT; DATABASE; PROTEIN; GENOMES; BLAST;
DOI
10.7717/peerj.5882
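Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code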
07 ; 0710 ; 09 ;
Abstract
Accurate detection of target microbial species in metagenomic datasets from environmental samples remains limited because the limit of detection of current methods is typically inaccessible and the frequency of false-positives, resulting from inadequate identification of regions of the genome that are either too highly conserved to be diagnostic (e.g., rRNA genes) or prone to frequent horizontal genetic exchange (e.g., mobile elements), remains unknown. To overcome these limitations, we introduce imGLAD, which aims to detect (target) genomic sequences in metagenomic datasets. imGLAD achieves high accuracy because it uses the sequence-discrete population concept for discriminating between metagenomic reads originating from the target organism compared to reads from co-occurring close relatives, masks regions of the genome that are not informative using the MyTaxa engine, and models both the sequencing breadth and depth to determine relative abundance and limit of detection. We validated imGLAD by analyzing metagenomic datasets derived from spinach leaves inoculated with the enteric pathogen Escherichia coli O157:H7 and showed that its limit of detection can be comparable to that of PCR-based approaches for these samples (~1 cell/gram).
Pages: 23
Related papers
27 in total
[1]   Sigma: Strain-level inference of genomes from metagenomic analysis for biosurveillance [J].
Ahn, Tae-Hyuk ;
Chai, Juanjuan ;
Pan, Chongle .
BIOINFORMATICS, 2015, 31 (02) :170-177
[2]   Gapped BLAST and PSI-BLAST: a new generation of protein database search programs [J].
Altschul, SF ;
Madden, TL ;
Schaffer, AA ;
Zhang, JH ;
Zhang, Z ;
Miller, W ;
Lipman, DJ .
NUCLEIC ACIDS RESEARCH, 1997, 25 (17) :3389-3402
[3]   [Anonymous], 2006, A GUIDE TO NUMPY
[4]   [Anonymous], 2001, SciPy: Open source scientific tools for Python
[5]   Detection of Bacillus anthracis DNA in Complex Soil and Air Samples Using Next-Generation Sequencing [J].
Be, Nicholas A. ;
Thissen, James B. ;
Gardner, Shea N. ;
McLoughlin, Kevin S. ;
Fofanov, Viacheslav Y. ;
Koshinsky, Heather ;
Ellingson, Sally R. ;
Brettin, Thomas S. ;
Jackson, Paul J. ;
Jaing, Crystal J. .
PLOS ONE, 2013, 8 (09)
[6]   Bacterial species may exist, metagenomics reveal [J].
Caro-Quintero, Alejandro ;
Konstantinidis, Konstantinos T. .
ENVIRONMENTAL MICROBIOLOGY, 2012, 14 (02) :347-355
[7]   Distinct Acid Resistance and Survival Fitness Displayed by Curli Variants of Enterohemorrhagic Escherichia coli O157:H7 [J].
Carter, Michelle Q. ;
Brandl, Maria T. ;
Louie, Jacqueline W. ;
Kyle, Jennifer L. ;
Carychao, Diana K. ;
Cooley, Michael B. ;
Parker, Craig T. ;
Bates, Anne H. ;
Mandrell, Robert E. .
APPLIED AND ENVIRONMENTAL MICROBIOLOGY, 2011, 77 (11) :3685-3695
[8]   Accurate read-based metagenome characterization using a hierarchical suite of unique signatures [J].
Freitas, Tracey Allen K. ;
Li, Po-E ;
Scholz, Matthew B. ;
Chain, Patrick S. G. .
NUCLEIC ACIDS RESEARCH, 2015, 43 (10)
[9]   DNA-DNA hybridization values and their relationship to whole-genome sequence similarities [J].
Goris, Johan ;
Konstantinidis, Konstantinos T. ;
Klappenbach, Joel A. ;
Coenye, Tom ;
Vandamme, Peter ;
Tiedje, James M. .
INTERNATIONAL JOURNAL OF SYSTEMATIC AND EVOLUTIONARY MICROBIOLOGY, 2007, 57 :81-91
[10]   PathoScope 2.0: a complete computational framework for strain identification in environmental or clinical sequencing samples [J].
Hong, Changjin ;
Manimaran, Solaiappan ;
Shen, Ying ;
Perez-Rogers, Joseph F. ;
Byrd, Allyson L. ;
Castro-Nallar, Eduardo ;
Crandall, Keith A. ;
Johnson, William Evan .
MICROBIOME, 2014, 2