Differential expression analysis for paired RNA-seq data

Cited by: 28
Authors
Chung, Lisa M. [1]
Ferguson, John P. [2]
Zheng, Wei [3]
Qian, Feng [4]
Bruno, Vincent [5]
Montgomery, Ruth R. [4]
Zhao, Hongyu [1]
Affiliations
[1] Yale Univ, Dept Biostat, Sch Publ Hlth, New Haven, CT 06520 USA
[2] George Washington Univ, Dept Stat, Washington, DC 20052 USA
[3] Novartis Inst BioMed Res, Cambridge, MA USA
[4] Yale Univ, Sch Med, Rheumatol Sect, New Haven, CT USA
[5] Univ Maryland, Sch Med, Dept Microbiol & Immunol, Baltimore, MD 21201 USA
Keywords
WEST-NILE-VIRUS; BAYESIAN MIXTURE MODEL; GENE-EXPRESSION; SERIAL ANALYSIS; TRANSCRIPT PROFILES; MULTIPLE GROUPS; SAGE; NORMALIZATION; VARIABILITY; MICROARRAYS;
DOI
10.1186/1471-2105-14-110
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
Background: RNA-Seq technology measures transcript abundance by generating sequence reads and counting their frequencies across different biological conditions. To identify differentially expressed genes between two conditions, it is important to consider the experimental design as well as the distributional properties of the data. In many RNA-Seq studies, the expression data are obtained as multiple pairs, e.g., pre- vs. post-treatment samples from the same individual. We seek to incorporate this paired structure into the analysis. Results: We present a Bayesian hierarchical mixture model for RNA-Seq data that separately accounts for the variability within and between individuals in a paired data structure. The method assumes a Poisson distribution for the data, mixed with a gamma distribution to account for variability between pairs. The effect of differential expression is modeled by a two-component mixture model. The performance of this approach is examined using simulated and real data. Conclusions: In this setting, our proposed model provides higher sensitivity than existing methods for detecting differential expression. Application to real RNA-Seq data demonstrates the usefulness of this method for detecting expression changes in genes with low average expression levels or shorter transcript lengths.
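The data-generating idea in the abstract (Poisson counts, a gamma distribution for between-pair variability, and a two-component mixture for the differential expression effect) can be illustrated with a small simulation. This is only a sketch under assumptions that are not taken from the paper: the function names, the gamma shape and scale, the normal distribution for the non-null log fold change, and the mixing proportion `prob_de` are all hypothetical choices made for illustration, and the code simulates data rather than fitting the authors' Bayesian model.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) count: Knuth's method, normal approximation for large lam."""
    if lam > 50:
        # Approximation for large rates; avoids underflow in exp(-lam).
        return max(0, round(rng.gauss(lam, math.sqrt(lam))))
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_paired_counts(n_genes, n_pairs, prob_de=0.1, seed=0):
    """Simulate paired counts from a Poisson-gamma model with a two-component effect.

    All parameter values here are illustrative assumptions, not the paper's settings.
    """
    rng = random.Random(seed)
    genes = []
    for _ in range(n_genes):
        # Two-component mixture: a gene is either null (log fold change 0)
        # or differentially expressed (log fold change drawn from a normal).
        is_de = rng.random() < prob_de
        log_fc = rng.gauss(0.0, 1.0) if is_de else 0.0
        pairs = []
        for _ in range(n_pairs):
            # Gamma-distributed baseline rate: one draw per pair, shared by
            # both samples of that pair (between-pair variability).
            baseline = rng.gammavariate(2.0, 10.0)
            # Poisson sampling within the pair (within-pair variability).
            pre = poisson(rng, baseline)
            post = poisson(rng, baseline * math.exp(log_fc))
            pairs.append((pre, post))
        genes.append({"is_de": is_de, "log_fc": log_fc, "pairs": pairs})
    return genes


if __name__ == "__main__":
    for gene in simulate_paired_counts(n_genes=5, n_pairs=3, prob_de=0.4, seed=1):
        print(gene["is_de"], round(gene["log_fc"], 2), gene["pairs"])
```

Because the baseline rate is drawn once per pair and shared by the pre- and post-treatment samples, the two counts in a pair are correlated; a method that ignores the pairing would treat that shared variation as noise.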
Pages: 14