POWER-ENHANCED MULTIPLE DECISION FUNCTIONS CONTROLLING FAMILY-WISE ERROR AND FALSE DISCOVERY RATES

Cited by: 29
Authors
Pena, Edsel A. [1 ]
Habiger, Joshua D. [2 ]
Wu, Wensong [1 ]
Affiliations
[1] Univ S Carolina, Dept Stat, Columbia, SC 29208 USA
[2] Oklahoma State Univ, Dept Stat, Stillwater, OK 74078 USA
Keywords
Benjamini-Hochberg procedure; Bonferroni procedure; decision process; false discovery rate (FDR); family wise error rate (FWER); Lagrangian optimization; Neyman Pearson most powerful test; microarray analysis; reverse martingale; missed discovery rate (MDR); multiple decision function and process; multiple hypotheses testing; optional sampling theorem; power function; randomized p-values; generalized multiple decision p-values; ROC function; Sidak procedure; EMPIRICAL BAYES; P-VALUES; SIZE; NULL; PROPORTION; OPTIMALITY; TESTS;
DOI
10.1214/10-AOS844
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208 ; 070103 ; 0714 ;
Abstract
Improved procedures, in terms of smaller missed discovery rates (MDR), for performing multiple hypotheses testing with weak and strong control of the family-wise error rate (FWER) or the false discovery rate (FDR) are developed and studied. The improvement over existing procedures such as the Sidak procedure for FWER control and the Benjamini-Hochberg (BH) procedure for FDR control is achieved by exploiting possible differences in the powers of the individual tests. Results signal the need to take into account the powers of the individual tests and to have multiple hypotheses decision functions which are not limited to simply using the individual p-values, as is the case, for example, with the Sidak, Bonferroni, or BH procedures. They also enhance understanding of the role of the powers of individual tests, or more precisely the receiver operating characteristic (ROC) functions of decision processes, in the search for better multiple hypotheses testing procedures. A decision-theoretic framework is utilized, and through auxiliary randomizers the procedures could be used with discrete or mixed-type data or with rank-based nonparametric tests. This is in contrast to existing p-value based procedures whose theoretical validity is contingent on each of these p-value statistics being stochastically equal to or greater than a standard uniform variable under the null hypothesis. Proposed procedures are relevant in the analysis of high-dimensional "large M, small n" data sets arising in the natural, physical, medical, economic and social sciences, whose generation and creation is accelerated by advances in high-throughput technology, notably, but not limited to, microarray technology.
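For orientation, the three baseline procedures named in the abstract (Bonferroni, Sidak, and Benjamini-Hochberg) can be sketched as follows. This is a minimal illustration of the standard textbook forms, which use only the individual p-values. It is not the power-enhanced procedure proposed in the paper, and it assumes each p-value is stochastically no smaller than a standard uniform variable under its null hypothesis. The function names and the example p-values are hypothetical.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H_i when p_i <= alpha / M (FWER control)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def sidak_reject(pvalues, alpha=0.05):
    """Reject H_i when p_i <= 1 - (1 - alpha)^(1/M) (FWER control, independent tests)."""
    m = len(pvalues)
    threshold = 1.0 - (1.0 - alpha) ** (1.0 / m)
    return [p <= threshold for p in pvalues]


def bh_reject(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up rule (FDR control).

    Find the largest rank k with p_(k) <= k * alpha / M and reject
    the hypotheses with the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            k = rank  # keep the largest rank that passes
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


# Illustrative p-values only; not data from the paper.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
n_bonferroni = sum(bonferroni_reject(example_pvalues))
n_sidak = sum(sidak_reject(example_pvalues))
n_bh = sum(bh_reject(example_pvalues))
```

All three rules apply thresholds that depend only on the number of tests and the rank of each p-value, so they cannot exploit differences in the powers of the individual tests. That limitation is what the abstract says the proposed procedures address.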
Pages: 556-583
Page count: 28