TSSAR: TSS annotation regime for dRNA-seq data

Cited by: 39
Authors
Amman, Fabian [1 ,2 ,3 ]
Wolfinger, Michael T. [3 ,4 ,5 ]
Lorenz, Ronny [3 ]
Hofacker, Ivo L. [3 ,6 ,10 ]
Stadler, Peter F. [1 ,2 ,3 ,6 ,7 ,8 ,9 ]
Findeiss, Sven [3 ,10 ]
Affiliations
[1] Univ Leipzig, Dept Comp Sci, Bioinformat Grp, D-04107 Leipzig, Germany
[2] Univ Leipzig, Interdisciplinary Ctr Bioinformat, D-04107 Leipzig, Germany
[3] Univ Vienna, Inst Theoret Chem, A-1090 Vienna, Austria
[4] Med Univ Vienna, Univ Vienna, Ctr Integrat Bioinformat Vienna CIBIV, Max F Perutz Labs, A-1030 Vienna, Austria
[5] Univ Vienna, Dept Biochem & Mol Cell Biol, Max F Perutz Labs, A-1030 Vienna, Austria
[6] Univ Copenhagen, Ctr RNA Technol & Hlth, Frederiksberg C, Denmark
[7] Max Planck Inst Math Sci, D-04103 Leipzig, Germany
[8] Fraunhofer Inst Cell Therapy & Immunol, D-04103 Leipzig, Germany
[9] Santa Fe Inst, Santa Fe, NM 87501 USA
[10] Univ Vienna, Fac Comp Sci, Res Grp Bioinformat & Computat Biol, A-1090 Vienna, Austria
Source
BMC BIOINFORMATICS | 2014 / Vol. 15
Funding
Austrian Science Fund;
Keywords
Differential RNA sequencing; dRNA-seq; TSS; Transcription start site annotation; Transcriptome; RESTful Web service; Next generation sequencing; PRIMARY TRANSCRIPTOME; START SITES; RNA; ARCHITECTURE; DYNAMICS;
DOI
10.1186/1471-2105-15-89
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: Differential RNA sequencing (dRNA-seq) is a high-throughput screening technique designed to examine the architecture of bacterial operons in general and the precise position of transcription start sites (TSS) in particular. Hitherto, dRNA-seq data were analyzed by visualizing the sequencing reads mapped to the reference genome and manually annotating reliable positions. This is very labor intensive and, due to the subjectivity, biased.
Results: Here, we present TSSAR, a tool for automated de novo TSS annotation from dRNA-seq data that respects the statistics of dRNA-seq libraries. TSSAR uses the premise that the number of sequencing reads starting at a certain genomic position within a transcriptionally active region follows a Poisson distribution with a parameter that depends on the local strength of expression. The difference of two dRNA-seq library counts thus follows a Skellam distribution. This provides a statistical basis to identify significantly enriched primary transcripts. We assessed the performance by analyzing a publicly available dRNA-seq data set using TSSAR and two simple approaches that utilize user-defined score cutoffs. We evaluated the power of reproducing the manual TSS annotation. Furthermore, the same data set was used to reproduce 74 experimentally validated TSS in H. pylori from reliable techniques such as RACE or primer extension. Both analyses showed that TSSAR outperforms the static cutoff-dependent approaches.
Conclusions: Having an automated and efficient tool for analyzing dRNA-seq data facilitates the use of the dRNA-seq technique and promotes its application to more sophisticated analysis. For instance, monitoring the plasticity and dynamics of the transcriptome architecture triggered by different stimuli and growth conditions becomes possible. The main asset of a novel tool for dRNA-seq analysis that reaches out to a broad user community is usability. As such, we provide TSSAR both as an intuitive RESTful Web service (http://rna.tbi.univie.ac.at/TSSAR), together with a set of postprocessing and analysis tools, and as a stand-alone version for use in high-throughput dRNA-seq data analysis pipelines.
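The statistical premise in the abstract can be illustrated with a short sketch. It is not the TSSAR implementation. It only shows how a one-sided p-value for the difference of two independent Poisson counts (a Skellam-distributed quantity) could be computed with the Python standard library. The function names `poisson_pmf` and `skellam_sf`, the truncation bound, and the example rates are assumptions made for illustration; how TSSAR estimates the local Poisson parameters is not described in this record.

```python
import math


def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu), evaluated in log space for stability."""
    if k < 0:
        return 0.0
    if mu == 0:
        return 1.0 if k == 0 else 0.0
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def skellam_sf(d, mu1, mu2):
    """P(X1 - X2 >= d) for independent X1 ~ Poisson(mu1), X2 ~ Poisson(mu2).

    The infinite sums are truncated at n, chosen far into both Poisson
    tails (an illustrative heuristic, not a tuned bound).
    """
    n = int(max(mu1, mu2) + 12 * math.sqrt(max(mu1, mu2, 1.0)) + 30)

    # tail1[k] = P(X1 >= k) for 0 <= k <= n; mass beyond n is treated as 0.
    tail1 = [0.0] * (n + 2)
    for k in range(n, -1, -1):
        tail1[k] = tail1[k + 1] + poisson_pmf(k, mu1)

    total = 0.0
    for j in range(n + 1):
        k = d + j  # X1 must reach at least d + j when X2 = j
        if k <= 0:
            p_ge = 1.0
        elif k > n:
            p_ge = 0.0
        else:
            p_ge = tail1[k]
        total += poisson_pmf(j, mu2) * p_ge
    return min(total, 1.0)


if __name__ == "__main__":
    # Hypothetical numbers: read starts at one position in two libraries,
    # with assumed local background rates for each library.
    enriched_count, control_count = 40, 8
    mu_enriched, mu_control = 10.0, 9.0
    p_value = skellam_sf(enriched_count - control_count, mu_enriched, mu_control)
    print(f"one-sided p-value: {p_value:.3g}")
```

A small p-value here would indicate that the observed difference in read starts is larger than expected under the assumed background rates, which is the kind of evidence the abstract refers to as "significantly enriched".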
Pages: 11