A Factor Model Approach to Multiple Testing Under Dependence

Cited by: 97
Authors
Friguet, Chloe [1]
Kloareg, Maela [1]
Causeur, David [1]
Affiliations
[1] Agrocampus Appl Math Dept, F-35042 Rennes, France
Keywords
Factor analysis; False discovery rate; Multiple-hypothesis testing; Nondiscovery rate; Optimal discovery procedure; Gene-expression profiles; False discoveries; Number
DOI
10.1198/jasa.2009.tm08332
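Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract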
The impact of dependence between individual test statistics is currently among the most discussed topics in the literature on multiple testing of high-dimensional data, especially since Benjamini and Hochberg (1995) introduced the false discovery rate (FDR). Many papers have first focused on the impact of dependence on the control of the FDR. Some more recent works have investigated approaches that account for common information shared by all the variables to stabilize the distribution of the error rates. Similarly, we propose to model this sharing of information by a factor analysis structure for the conditional variance of the test statistics. It is shown that the variance of the number of false discoveries increases along with the fraction of common variance. Test statistics for general linear contrasts are deduced, taking advantage of the common factor structure to reduce the variance of the error rates. A conditional FDR estimate is proposed and the overall performance of the multiple testing procedure is shown to be markedly improved, regarding the nondiscovery rate, with respect to classical procedures. The present methodology is also assessed by comparison with leading multiple testing methods.
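The abstract's claim that the variance of the number of false discoveries grows with the fraction of common variance can be illustrated with a small simulation. The sketch below is not the authors' procedure. It assumes a hypothetical one-factor model in which every null test statistic is z_i = sqrt(rho) * w + sqrt(1 - rho) * e_i, with w and e_i independent standard normal, so that rho plays the role of the fraction of common variance. The function names, the values of rho, and the settings (m = 200 tests, 500 replications, two-sided level 0.05) are all illustrative assumptions.

```python
import random
import statistics
from statistics import NormalDist


def false_discovery_counts(m, rho, alpha, n_rep, seed):
    """Simulate the number of false discoveries when all m nulls are true.

    Assumed one-factor model (illustrative, not the paper's exact setup):
    z_i = sqrt(rho) * w + sqrt(1 - rho) * e_i, with w and e_i standard normal,
    so each z_i is N(0, 1) and rho is the fraction of common variance.
    """
    rng = random.Random(seed)
    # Two-sided rejection cutoff applied to each test separately.
    threshold = NormalDist().inv_cdf(1 - alpha / 2)
    a, b = rho ** 0.5, (1 - rho) ** 0.5
    counts = []
    for _ in range(n_rep):
        w = rng.gauss(0.0, 1.0)  # common factor shared by all m statistics
        v = sum(
            1
            for _ in range(m)
            if abs(a * w + b * rng.gauss(0.0, 1.0)) > threshold
        )
        counts.append(v)
    return counts


def summarize(m=200, alpha=0.05, n_rep=500, seed=0):
    """Return {rho: (mean, variance)} of the false discovery count."""
    results = {}
    for rho in (0.0, 0.3, 0.6):
        counts = false_discovery_counts(m, rho, alpha, n_rep, seed)
        results[rho] = (statistics.mean(counts), statistics.variance(counts))
    return results


if __name__ == "__main__":
    for rho, (mean_v, var_v) in summarize().items():
        print(f"rho={rho:.1f}  mean={mean_v:.2f}  variance={var_v:.2f}")
```

Under this assumed model the expected count stays near m * alpha for every rho, because each statistic remains marginally standard normal, while the spread of the count widens as rho grows. This is the instability of the error rates that the abstract attributes to shared information.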
Pages: 1406-1415
Page count: 10
Related papers
29 in total
[1]  
[Anonymous], 1993, Continuous Univariate Distributions, DOI 10.1016/0167-9473(96)90015-8
[2]  
Benjamini Y, 2001, ANN STAT, V29, P1165
[3]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[4]   Power of double-sampling tests for general linear hypotheses [J].
Causeur, David ;
Husson, Francois .
STATISTICS, 2008, 42 (02) :115-125
[5]   Correlation and large-scale simultaneous significance testing [J].
Efron, Bradley .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2007, 102 (477) :93-103
[6]   CONTROL OF THE MEAN NUMBER OF FALSE DISCOVERIES, BONFERRONI AND STABILITY OF MULTIPLE TESTING [J].
Gordon, Alexander ;
Glazko, Galina ;
Qiu, Xing ;
Yakovlev, Andrei .
ANNALS OF APPLIED STATISTICS, 2007, 1 (01) :179-190
[7]   Gene-expression profiles in hereditary breast cancer. [J].
Hedenfalk, I ;
Duggan, D ;
Chen, YD ;
Radmacher, M ;
Bittner, M ;
Simon, R ;
Meltzer, P ;
Gusterson, B ;
Esteller, M ;
Kallioniemi, OP ;
Wilfond, B ;
Borg, Å ;
Trent, J ;
Raffeld, M ;
Yakhini, Z ;
Ben-Dor, A ;
Dougherty, E ;
Kononen, J ;
Bubendorf, L ;
Fehrle, W ;
Pittaluga, S ;
Gruvberger, S ;
Loman, N ;
Johannsson, O ;
Olsson, H ;
Sauter, G .
NEW ENGLAND JOURNAL OF MEDICINE, 2001, 344 (08) :539-548
[8]  
Hsu J., 1992, J GRAPHICAL COMPUTAT, V1, P151
[9]   On parametric empirical Bayes methods for comparing multiple groups using replicated gene expression profiles [J].
Kendziorski, CM ;
Newton, MA ;
Lan, H ;
Gould, MN .
STATISTICS IN MEDICINE, 2003, 22 (24) :3899-3914
[10]   Effects of dependence in high-dimensional multiple testing problems [J].
Kim, Kyung In ;
van de Wiel, Mark A. .
BMC BIOINFORMATICS, 2008, 9 (1)