IMSA: Integrated Metagenomic Sequence Analysis for Identification of Exogenous Reads in a Host Genomic Background

Cited by: 14
Authors
Dimon, Michelle T. [1]
Wood, Henry M. [2]
Rabbitts, Pamela H. [2]
Arron, Sarah T. [1]
Affiliations
[1] Univ Calif San Francisco, Dept Dermatol, San Francisco, CA 94143 USA
[2] St James Univ Hosp, Leeds Inst Mol Med, Leeds, W Yorkshire, England
Funding
National Institutes of Health (USA)
Keywords
MICROBES; VIRUSES
DOI
10.1371/journal.pone.0064546
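Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract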
Metagenomics, the study of microbial genomes within diverse environments, is a rapidly developing field. The identification of microbial sequences within a host organism enables the study of human intestinal, respiratory, and skin microbiota, and has allowed the identification of novel viruses in diseases such as Merkel cell carcinoma. There are few publicly available tools for metagenomic high-throughput sequence analysis. We present Integrated Metagenomic Sequence Analysis (IMSA), a flexible, fast, and robust computational analysis pipeline that is available for public use. IMSA takes input sequence from high-throughput datasets and uses a user-defined host database to filter out host sequence. IMSA then aligns the filtered reads to a user-defined universal database to characterize exogenous reads within the host background. IMSA assigns a score to each node of the taxonomy based on read frequency, and can output this as a taxonomy report suitable for cluster analysis or as a taxonomy map (TaxMap). IMSA also outputs the specific sequence reads assigned to a taxon of interest for downstream analysis. We demonstrate the use of IMSA to detect pathogens and normal flora within sequence data from a primary human cervical cancer carrying HPV16, a primary human cutaneous squamous cell carcinoma carrying HPV16, the CaSki cell line carrying HPV16, and the HeLa cell line carrying HPV18.
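The filter-then-score idea described in the abstract can be illustrated with a minimal Python sketch. This is not IMSA's code or interface: the function names (`filter_host`, `lineage`, `score_taxonomy`), the toy taxonomy, and the rule that a read with several hit taxa splits one unit of score equally among them are all assumptions made for illustration. The abstract states only that scores are based on read frequency.

```python
from collections import defaultdict


def filter_host(reads, host_read_ids):
    """Drop reads whose IDs were aligned to the host database.

    `reads` maps read ID -> sequence; `host_read_ids` is the set of IDs
    that an aligner (not shown here) matched to the host. Hypothetical helper.
    """
    return {rid: seq for rid, seq in reads.items() if rid not in host_read_ids}


def lineage(taxon, parent):
    """Yield a taxon followed by its ancestors, using a child -> parent map."""
    seen = set()
    while taxon is not None and taxon not in seen:
        seen.add(taxon)  # guard against cycles in a malformed map
        yield taxon
        taxon = parent.get(taxon)


def score_taxonomy(read_hits, parent):
    """Score every taxonomy node from per-read hit lists.

    Assumption (not taken from the abstract): each read contributes a total
    of 1.0, split equally among its distinct hit taxa, and each share is
    also credited to every ancestor of that taxon.
    """
    scores = defaultdict(float)
    for taxa in read_hits.values():
        distinct = set(taxa)
        if not distinct:
            continue  # read had no hit in the universal database
        share = 1.0 / len(distinct)
        for taxon in distinct:
            for node in lineage(taxon, parent):
                scores[node] += share
    return dict(scores)


# Toy data: a four-read sample and a tiny hand-written taxonomy.
reads = {"r0": "ACGT", "r1": "TTGA", "r2": "GGCA", "r3": "CCAT", "r4": "ATAT"}
exogenous = filter_host(reads, host_read_ids={"r0"})

parent = {
    "HPV16": "Papillomaviridae",
    "HPV18": "Papillomaviridae",
    "Papillomaviridae": "Viruses",
    "Viruses": None,
    "E. coli": "Bacteria",
    "Bacteria": None,
}
read_hits = {"r1": ["HPV16"], "r2": ["HPV16", "HPV18"], "r3": ["E. coli"], "r4": []}
scores = score_taxonomy(read_hits, parent)
```

In this toy run, read `r2` is ambiguous between HPV16 and HPV18, so each type receives half a read while their shared family still receives the full read. A per-node table like `scores` is the kind of input a taxonomy report or map could be built from.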
Pages: 7
Related papers
24 entries in total
[1]   BASIC LOCAL ALIGNMENT SEARCH TOOL [J].
ALTSCHUL, SF ;
GISH, W ;
MILLER, W ;
MYERS, EW ;
LIPMAN, DJ .
JOURNAL OF MOLECULAR BIOLOGY, 1990, 215 (03) :403-410
[2]  
[Anonymous], J BIOTECHNOL
[3]  
[Anonymous], BIOINFORMATICS
[4]  
[Anonymous], FEMS MICROBIOL LETT
[5]  
Arron S.T., 2011, J Invest Dermatol
[6]  
Benson DA, 2013, NUCLEIC ACIDS RES, V41, pD36, DOI 10.1093/nar/gks1195
[7]   Evaluation of High-Throughput Sequencing for Identifying Known and Unknown Viruses in Biological Samples [J].
Cheval, Justine ;
Sauvage, Virginie ;
Frangeul, Lionel ;
Dacheux, Laurent ;
Guigon, Ghislaine ;
Dumey, Nicolas ;
Pariente, Kevin ;
Rousseaux, Claudine ;
Dorange, Fabien ;
Berthet, Nicolas ;
Brisse, Sylvain ;
Moszer, Ivan ;
Bourhy, Herve ;
Manuguerra, Claude Jean ;
Lecuit, Marc ;
Burguiere, Ana ;
Caro, Valerie ;
Eloit, Marc .
JOURNAL OF CLINICAL MICROBIOLOGY, 2011, 49 (09) :3268-3275
[8]   Next-Generation Sequencing for Simultaneous Determination of Human Papillomavirus Load, Subtype, and Associated Genomic Copy Number Changes in Tumors [J].
Conway, Caroline ;
Chalkley, Rebecca ;
High, Alec ;
Maclennan, Kenneth ;
Berri, Stefano ;
Chengot, Preetha ;
Alsop, Melissa ;
Egan, Philip ;
Morgan, Joanne ;
Taylor, Graham R. ;
Chester, John ;
Sen, Mehmet ;
Rabbitts, Pamela ;
Wood, Henry M. .
JOURNAL OF MOLECULAR DIAGNOSTICS, 2012, 14 (02) :104-111
[9]   Cluster analysis and display of genome-wide expression patterns [J].
Eisen, MB ;
Spellman, PT ;
Brown, PO ;
Botstein, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 1998, 95 (25) :14863-14868
[10]  
Feng HC, 2008, SCIENCE, V319, P1096, DOI 10.1126/science.1152586