Mining RNA-Seq Data for Infections and Contaminations

Cited by: 10
Authors
Bonfert, Thomas [1]
Csaba, Gergely [1]
Zimmer, Ralf [1]
Friedel, Caroline C. [1]
Affiliations
[1] Univ Munich, Inst Informat, D-80539 Munich, Germany
Keywords
PHYLOGENETIC CLASSIFICATION; ACCURATE; SEQUENCES; ALIGNMENT
DOI
10.1371/journal.pone.0073071
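Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract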
RNA sequencing (RNA-seq) provides novel opportunities for transcriptomic studies at nucleotide resolution, including transcriptomics of viruses or microbes infecting a cell. However, standard approaches for mapping the resulting sequencing reads generally ignore sources of expression other than the host cell and are poorly equipped to address the problems arising from redundancies and gaps among sequenced microbe and virus genomes. We show that screening of sequencing reads for contaminations and infections can be performed easily using ContextMap, our recently developed mapping software. Based on mapping-derived statistics, mapping confidence, similarities and misidentifications (e.g., due to missing genome sequences) of species/strains can be assessed. Performance of our approach is evaluated on three real-life sequencing data sets and compared to state-of-the-art metagenomics tools. In particular, ContextMap vastly outperformed GASiC and GRAMMy in terms of runtime. In contrast to MEGAN4, it was capable of providing individual read mappings to species and resolving non-unique mappings, thus allowing the identification of misalignments caused by sequence similarities between genomes and missing genome sequences. Our study illustrates the importance and potential of routinely mining RNA-seq experiments for infections or contaminations by microbes and viruses. By using ContextMap, gene expression of infecting agents can be analyzed and novel insights into infection processes and tumorigenesis can be obtained.
Pages: 12