Robust Modeling of Differential Gene Expression Data Using Normal/Independent Distributions: A Bayesian Approach

Cited by: 4
Authors
Ganjali, Mojtaba [1, 2]
Baghfalaki, Taban [1, 3]
Berridge, Damon [4]
Affiliations
[1] Inst Res Fundamental Sci IPM, Sch Biol Sci, Tehran, Iran
[2] Shahid Beheshti Univ, Fac Math Sci, Dept Stat, Tehran, Iran
[3] Tarbiat Modares Univ, Fac Math Sci, Dept Stat, Tehran, Iran
[4] Swansea Univ, Coll Med, Farr Inst CIPHER, Swansea, W Glam, Wales
Source
PLOS ONE | 2015 / Vol. 10 / Issue 04
Funding
US National Science Foundation;
Keywords
INFERENCE; IDENTIFICATION; MICROARRAYS;
DOI
10.1371/journal.pone.0123791
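Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;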
Abstract
In this paper, the problem of identifying differentially expressed genes under different conditions using gene expression microarray data, in the presence of outliers, is discussed. For this purpose, the robust modeling of gene expression data using some powerful distributions known as normal/independent distributions is considered. These distributions include the Student's t and normal distributions which have been used previously, but also include extensions such as the slash, the contaminated normal and the Laplace distributions. The purpose of this paper is to identify differentially expressed genes by considering these distributional assumptions instead of the normal distribution. A Bayesian approach using the Markov Chain Monte Carlo method is adopted for parameter estimation. Two publicly available gene expression data sets are analyzed using the proposed approach. The use of the robust models for detecting differentially expressed genes is investigated. This investigation shows that the choice of model for differentiating gene expression data is very important. This is due to the small number of replicates for each gene and the existence of outlying data. Comparison of the performance of these models is made using different statistical criteria and the ROC curve. The method is illustrated using some simulation studies. We demonstrate the flexibility of these robust models in identifying differentially expressed genes.
Pages: 19