A Bayesian Model for Cross-Study Differential Gene Expression

Cited by: 25
Authors
Scharpf, Robert B.
Tjelmeland, Hakon [1]
Parmigiani, Giovanni [2, 3]
Nobel, Andrew B. [4]
Affiliations
[1] Norwegian Univ Sci & Technol, Dept Math Sci, NO-7491 Trondheim, Norway
[2] Johns Hopkins Univ, Johns Hopkins Bloomberg Sch Publ Hlth, Dept Biostat, Baltimore, MD 21205 USA
[3] Johns Hopkins Univ, Sidney Kimmel Comprehens Canc Ctr, Baltimore, MD 21205 USA
[4] Univ N Carolina, Dept Stat, Chapel Hill, NC 27599 USA
Funding
National Science Foundation (USA);
Keywords
Bayesian hierarchical model; Bayesian meta-analysis; Differential expression; Gene expression; Multiple studies; MICROARRAY DATA; MOLECULAR CLASSIFICATION; MIXTURE MODEL; METAANALYSIS; PROFILES; ADENOCARCINOMA; NORMALIZATION; COMPUTATION; VALIDATION; CARCINOMAS;
DOI
10.1198/jasa.2009.ap07611
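Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification code
020208; 070103; 0714;
Abstract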
In this article we define a hierarchical Bayesian model for microarray expression data collected from several studies and use it to identify genes that show differential expression between two conditions. Key features include shrinkage across both genes and studies, and flexible modeling that allows for interactions between platforms and the estimated effect, as well as concordant and discordant differential expression across studies. We evaluate the performance of our model in a comprehensive fashion, using both artificial data and a "split-study" validation approach that provides an agnostic assessment of the model's behavior under both the null hypothesis and a realistic alternative. The simulation results from the artificial data demonstrate the advantages of the Bayesian model. Furthermore, the simulations provide guidelines for when the Bayesian model is most likely to be useful. Most notably, in small studies the Bayesian model generally outperforms other methods when evaluated based on several performance measures across a range of simulation parameters, with the differences diminishing for larger sample sizes in the individual studies. The split-study validation illustrates appropriate shrinkage of the Bayesian model in the absence of platform, sample, and annotation differences that otherwise complicate experimental data analyses. Finally, we fit our model to four breast cancer studies using different technologies (cDNA and Affymetrix) to estimate differential expression in estrogen receptor-positive tumors versus estrogen receptor-negative tumors. Software and data for reproducing our analysis are available publicly.
Pages: 1295-1310
Page count: 16