Variance of the number of false discoveries

Cited by: 93
Authors
Owen, AB [1]
Affiliation
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Keywords
false discovery rate; microarrays; multiple comparisons; single-nucleotide polymorphisms; step-down;
DOI
10.1111/j.1467-9868.2005.00509.x
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes
020208 ; 070103 ; 0714 ;
Abstract
In high throughput genomic work, a very large number d of hypotheses are tested based on n ≪ d data samples. The large number of tests necessitates an adjustment for false discoveries in which a true null hypothesis was rejected. The expected number of false discoveries is easy to obtain. Dependences between the hypothesis tests greatly affect the variance of the number of false discoveries. Assuming that the tests are independent gives an inadequate variance formula. The paper presents a variance formula that takes account of the correlations between test statistics. That formula involves O(d^2) correlations, and so a naive implementation has cost O(nd^2). A method based on sampling pairs of tests allows the variance to be approximated at a cost that is independent of d.
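The pair-sampling idea in the abstract can be illustrated with a small sketch. This is not the paper's formula or algorithm; it only uses the generic identity Var(sum_i R_i) = sum_i Var(R_i) + sum_{i != j} Cov(R_i, R_j) for 0/1 rejection indicators R_i, and replaces the O(d^2) double sum with an average over m randomly sampled pairs. The function name pair_sampled_variance, the simulated array R, and the one-factor dependence model in the demo are all assumptions made for illustration. Unlike the method described in the abstract, this sketch still touches every test once for the diagonal term, so its cost is not independent of d.

```python
import numpy as np


def pair_sampled_variance(R, m, rng):
    """Estimate the variance of the row sums of R from m sampled pairs.

    R is a (B, d) array of 0/1 indicators (hypothetical input):
    R[b, i] = 1 if test i rejects in replicate b.
    """
    B, d = R.shape
    p = R.mean(axis=0)                # per-test rejection frequencies
    diag = np.sum(p * (1.0 - p))      # sum_i Var(R_i), cost O(d)

    # Sample m ordered pairs (i, j) with i != j, uniformly at random.
    i = rng.integers(0, d, size=m)
    j = (i + rng.integers(1, d, size=m)) % d

    # Empirical covariance of each sampled pair of indicators.
    cov_ij = (R[:, i] * R[:, j]).mean(axis=0) - p[i] * p[j]

    # Scale the average covariance up to all d(d-1) ordered pairs.
    return diag + d * (d - 1) * cov_ij.mean()


if __name__ == "__main__":
    # Demo on simulated, positively dependent test statistics
    # (one shared factor; an assumption, not the paper's setting).
    rng = np.random.default_rng(0)
    B, d, rho = 2000, 500, 0.3
    shared = rng.standard_normal((B, 1))
    noise = rng.standard_normal((B, d))
    z = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * noise
    R = (z > 1.645).astype(float)

    print("variance of rejection counts:", R.sum(axis=1).var())
    print("pair-sampled estimate:       ", pair_sampled_variance(R, 5000, rng))
    print("independence formula:        ", np.sum(R.mean(0) * (1 - R.mean(0))))
```

With positively dependent tests, the independence formula (the diagonal term alone) would be expected to understate the variance, which is the point the abstract makes.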
Pages: 411-426
Page count: 16