Computational analysis of bacterial RNA-Seq data

Cited by: 449
Authors
McClure, Ryan [1, 2]
Balasubramanian, Divya [3 ]
Sun, Yan [3 ]
Bobrovskyy, Maksym [3 ]
Sumby, Paul [4 ]
Genco, Caroline A. [1, 2]
Vanderpool, Carin K. [3 ]
Tjaden, Brian [5 ]
Affiliations
[1] Boston Univ, Sch Med, Dept Microbiol, Boston, MA 02118 USA
[2] Boston Univ, Sch Med, Infect Dis Sect, Dept Med, Boston, MA 02118 USA
[3] Univ Illinois, Dept Microbiol, Urbana, IL 61801 USA
[4] Methodist Hosp, Res Inst, Dept Pathol, Ctr Mol & Translat Human Infect Dis Res, Houston, TX 77030 USA
[5] Wellesley Coll, Dept Comp Sci, Wellesley, MA 02481 USA
Funding
National Science Foundation (USA); National Institutes of Health (USA)
Keywords
GENE-EXPRESSION; DIFFERENTIAL EXPRESSION; OPERON PREDICTION; READ ALIGNMENT; DISCOVERY; SEQUENCES; TRANSCRIPTOMES; COMPLEXITY; ULTRAFAST; PATHOGEN;
DOI
10.1093/nar/gkt444
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Recent advances in high-throughput RNA sequencing (RNA-seq) have enabled tremendous leaps forward in our understanding of bacterial transcriptomes. However, computational methods for analysis of bacterial transcriptome data have not kept pace with the large and growing data sets generated by RNA-seq technology. Here, we present new algorithms, specific to bacterial gene structures and transcriptomes, for analysis of RNA-seq data. The algorithms are implemented in an open source software system called Rockhopper that supports various stages of bacterial RNA-seq data analysis, including aligning sequencing reads to a genome, constructing transcriptome maps, quantifying transcript abundance, testing for differential gene expression, determining operon structures and visualizing results. We demonstrate the performance of Rockhopper using 2.1 billion sequenced reads from 75 RNA-seq experiments conducted with Escherichia coli, Neisseria gonorrhoeae, Salmonella enterica, Streptococcus pyogenes and Xenorhabdus nematophila. We find that the transcriptome maps generated by our algorithms are highly accurate when compared with focused experimental data from E. coli and N. gonorrhoeae, and we validate our system's ability to identify novel small RNAs, operons and transcription start sites. Our results suggest that Rockhopper can be used for efficient and accurate analysis of bacterial RNA-seq data, and that it can aid with elucidation of bacterial transcriptomes.
Pages: 16