The optimal discovery procedure: a new approach to simultaneous significance testing

Cited by: 125
Authors
Storey, John D. [1]
Affiliation
[1] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Funding
National Institutes of Health (USA)
Keywords
classification; false discovery rate; multiple-hypothesis testing; optimal discovery procedure; q-value; single-thresholding procedure;
DOI
10.1111/j.1467-9868.2007.005592.x
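Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract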
The Neyman-Pearson lemma provides a simple procedure for optimally testing a single hypothesis when the null and alternative distributions are known. This result has played a major role in the development of significance testing strategies that are used in practice. Most of the work extending single-testing strategies to multiple tests has focused on formulating and estimating new types of significance measures, such as the false discovery rate. These methods tend to be based on p-values that are calculated from each test individually, ignoring information from the other tests. I show here that one can improve the overall performance of multiple significance tests by borrowing information across all the tests when assessing the relative significance of each one, rather than calculating p-values for each test individually. The 'optimal discovery procedure' is introduced, which shows how to maximize the number of expected true positive results for each fixed number of expected false positive results. The optimality that is achieved by this procedure is shown to be closely related to optimality in terms of the false discovery rate. The optimal discovery procedure motivates a new approach to testing multiple hypotheses, especially when the tests are related. As a simple example, a new simultaneous procedure for testing several normal means is defined; this is surprisingly demonstrated to outperform the optimal single-test procedure, showing that a method which is optimal for single tests may no longer be optimal for multiple tests. Connections to other concepts in statistics are discussed, including Stein's paradox, shrinkage estimation and the Bayesian approach to hypothesis testing.
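The abstract's central claim, that borrowing information across tests can beat the best single-test rule, can be illustrated with a small simulation. The sketch below is not the paper's estimated procedure. It assumes an oracle setting in which the true alternative means and the number of true nulls are known, with one N(mu, 1) observation per test and null mean 0. Under those assumptions it ranks tests by the sum of the alternative densities divided by the sum of the null densities, and compares that ranking with the two-sided |x| rule at the same expected number of false positives. The function names, the means, the test counts and the target of five expected false positives are all illustrative choices, not values from the paper.

```python
import math
import random


def normal_pdf(x, mean=0.0):
    """Density of N(mean, 1) evaluated at x."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def odp_statistic(x, alt_means, n_null):
    """Oracle statistic: summed alternative densities over summed null densities.

    Assumes the true alternative means are known, which is never the case in
    practice; a usable procedure would have to estimate these quantities.
    """
    numerator = sum(normal_pdf(x, mu) for mu in alt_means)
    denominator = n_null * normal_pdf(x, 0.0)
    return numerator / denominator


def expected_true_positives(stat, alt_means, n_null, target_efp,
                            n_draws=20_000, seed=0):
    """Monte Carlo estimate of expected true positives at a fixed expected
    number of false positives (target_efp) for the scoring rule `stat`."""
    rng = random.Random(seed)
    z = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    # Pick the threshold so that about target_efp / n_null of null scores exceed it.
    null_scores = sorted(stat(v) for v in z)
    k = min(int(n_draws * (1.0 - target_efp / n_null)), n_draws - 1)
    threshold = null_scores[k]
    # Reuse the same draws, shifted by each alternative mean, to estimate power.
    return sum(
        sum(stat(v + mu) > threshold for v in z) / n_draws
        for mu in alt_means
    )


# Hypothetical configuration: 90 true nulls, 8 alternatives at +2, 2 at -1.
alt_means = [2.0] * 8 + [-1.0] * 2
n_null = 90

etp_odp = expected_true_positives(
    lambda x: odp_statistic(x, alt_means, n_null), alt_means, n_null, 5.0
)
etp_abs = expected_true_positives(abs, alt_means, n_null, 5.0)

print(f"oracle ODP-style rule: {etp_odp:.2f} expected true positives")
print(f"two-sided |x| rule:    {etp_abs:.2f} expected true positives")
```

Because most of the alternatives in this configuration lie on the positive side, the pooled statistic places most of its rejection region there. That asymmetry is where the gain over the symmetric two-sided rule comes from in this sketch.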
Pages: 347-368
Number of pages: 22