NPEBseq: nonparametric empirical Bayesian-based procedure for differential expression analysis of RNA-seq data

Cited by: 17
Authors
Bi, Yingtao [1]
Davuluri, Ramana V. [1]
Affiliations
[1] Wistar Inst Anat & Biol, Ctr Syst & Computat Biol, Mol & Cellular Oncogenesis Program, Philadelphia, PA 19104 USA
Funding
National Institutes of Health (USA)
Keywords
ALTERNATIVE TRANSCRIPTION; TECHNICAL VARIABILITY; GENE; NORMALIZATION; BIOCONDUCTOR; REPRODUCIBILITY; INFERENCE; TESTS; MODEL;
DOI
10.1186/1471-2105-14-262
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: RNA-seq, a massively parallel sequencing-based transcriptome profiling method, provides digital data in the form of aligned sequence read counts. Comparative analysis of these data requires appropriate statistical methods to estimate the differential expression of transcript variants across different cell/tissue types and disease conditions.
Results: We developed a novel nonparametric empirical Bayesian-based approach (NPEBseq) to model RNA-seq data. The prior distribution of the Bayesian model is estimated empirically from the data without any parametric assumption, and hence the method is "nonparametric" in nature. Based on this model, we proposed a method for detecting differentially expressed genes across different conditions. We also extended this method to detect differential usage of exons from RNA-seq data. Evaluation of NPEBseq on both simulated and publicly available RNA-seq datasets, and comparison with three popular methods, showed improved results for experiments with or without biological replicates.
Conclusions: NPEBseq can successfully detect differential expression between conditions, not only at the gene level but also at the exon level, from RNA-seq datasets. In addition, NPEBseq performs significantly better than current methods and can be applied to genome-wide RNA-seq datasets. Sample datasets and the R package are available at http://bioinformatics.wistar.upenn.edu/NPEBseq.
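The abstract does not spell out the model, so the following is only a generic illustration of what "estimating the prior from the data without a parametric assumption" can look like, not the NPEBseq algorithm itself. It assumes Poisson-distributed counts and a discrete prior on a fixed grid of rates whose weights are fitted by EM; the function names, the grid, and the toy counts are all hypothetical.

```python
import math

def poisson_pmf(k, lam):
    # Poisson probability of count k at rate lam, computed in log space.
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def estimate_prior(counts, grid, n_iter=200):
    # EM for the weights of a discrete prior supported on `grid`.
    # No parametric family (e.g. gamma) is imposed on the prior.
    weights = [1.0 / len(grid)] * len(grid)
    lik = [[poisson_pmf(k, lam) for lam in grid] for k in counts]
    for _ in range(n_iter):
        new = [0.0] * len(grid)
        for row in lik:
            joint = [w * l for w, l in zip(weights, row)]
            total = sum(joint)
            for j, v in enumerate(joint):
                new[j] += v / total  # posterior responsibility of grid point j
        weights = [v / len(counts) for v in new]
    return weights

def posterior_mean(k, grid, weights):
    # Posterior mean rate for one observed count under the fitted prior.
    joint = [w * poisson_pmf(k, lam) for w, lam in zip(weights, grid)]
    total = sum(joint)
    return sum(lam * v for lam, v in zip(grid, joint)) / total

# Toy counts (invented for illustration): a low- and a high-expression group.
counts = [0, 1, 2, 3, 2, 1, 50, 55, 60, 52]
grid = [0.5 * i for i in range(1, 161)]  # candidate rates 0.5 .. 80.0
weights = estimate_prior(counts, grid)
shrunk = [posterior_mean(k, grid, weights) for k in counts]
```

Each entry of `shrunk` is a rate estimate pulled toward values that the whole dataset supports, which is the general idea behind empirical Bayes shrinkage. A differential expression test would additionally need a comparison of posteriors between conditions, which this sketch does not attempt.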
Pages: 12