Estimation of false discovery rates in multiple testing: Application to gene microarray data

Times cited: 111
Authors
Tsai, CA [1]
Hsueh, HM
Chen, JJ
Affiliations
[1] US FDA, Natl Ctr Toxicol Res, Div Biometry & Risk Assessment, Jefferson, AR 72079 USA
[2] Natl Chengchi Univ, Dept Stat, Taipei 11623, Taiwan
Keywords
Bayesian Type I error; comparison-wise error rate (CWE); false discovery rate (FDR); number of rejections; number of true null hypotheses; q-value
DOI
10.1111/j.0006-341X.2003.00123.x
Chinese Library Classification
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Testing for significance with gene expression data from DNA microarray experiments involves simultaneous comparisons of hundreds or thousands of genes. If R denotes the number of rejections (genes declared significant) and V denotes the number of false rejections, then V/R, if R > 0, is the proportion of falsely rejected hypotheses. This paper proposes a model for the distribution of the number of rejections and the conditional distribution of V given R, V | R. Under the independence assumption, the distribution of R is a convolution of two binomials and V | R follows a noncentral hypergeometric distribution. Under an equicorrelated model, the distributions are more complex and are also derived. Five false discovery rate probability error measures are considered: FDR = E(V/R), pFDR = E(V/R | R > 0) (positive FDR), cFDR = E(V/R | R = r) (conditional FDR), mFDR = E(V)/E(R) (marginal FDR), and eFDR = E(V)/r (empirical FDR). The pFDR, cFDR, and mFDR are shown to be equivalent under the Bayesian framework, in which the number of true null hypotheses is modeled as a random variable. We present a parametric and a bootstrap procedure to estimate the FDRs. Monte Carlo simulations were conducted to evaluate the performance of these two methods. The bootstrap procedure appears to perform reasonably well, even when the alternative hypotheses are correlated (rho = 0.25). An example from a toxicogenomic microarray experiment is presented for illustration.
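The definitions of FDR, pFDR, and mFDR in the abstract can be illustrated with a small Monte Carlo sketch. This is not the paper's parametric or bootstrap estimator. It only assumes the independence setting described above, with a fixed number of true nulls, so that V and the number of true rejections are independent binomials and R is their sum. The function name and every parameter value (m, m0, alpha, power, n_sim, seed) are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_fdr_measures(m=1000, m0=900, alpha=0.05, power=0.6,
                          n_sim=20000, seed=0):
    """Monte Carlo illustration of FDR, pFDR and mFDR under independence.

    Assumptions (not taken from the paper): m0 is fixed, each true null is
    rejected with probability alpha, and each false null with probability
    power, independently of all other tests.
    """
    rng = np.random.default_rng(seed)
    m1 = m - m0                                   # number of false nulls
    v = rng.binomial(m0, alpha, size=n_sim)       # false rejections V
    s = rng.binomial(m1, power, size=n_sim)       # true rejections
    r = v + s                                     # R: sum of two binomials

    positive = r > 0
    ratio = np.zeros(n_sim)                       # V/R is taken as 0 when R = 0
    ratio[positive] = v[positive] / r[positive]

    return {
        "FDR": ratio.mean(),                      # E(V/R)
        "pFDR": ratio[positive].mean(),           # E(V/R | R > 0)
        "mFDR": v.mean() / r.mean(),              # E(V)/E(R)
    }


if __name__ == "__main__":
    for name, value in simulate_fdr_measures().items():
        print(f"{name}: {value:.4f}")
```

With many tests, R = 0 almost never occurs and the three quantities are nearly identical. With few tests (for example m=5, m0=4), R = 0 is common and FDR falls below pFDR, because FDR equals pFDR multiplied by the probability that R > 0.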
Pages: 1071-1081
Number of pages: 11