Control procedures and estimators of the false discovery rate and their application in low-dimensional settings: an empirical investigation

Cited by: 26
Authors
Brinster, Regina [1, 2, 3]
Koettgen, Anna [4, 5]
Tayo, Bamidele O. [6]
Schumacher, Martin [2, 3]
Sekula, Peggy [4, 5]
Affiliations
[1] Heidelberg Univ, Inst Med Biometry & Informat, Neuenheimer Feld 130-3, D-69120 Heidelberg, Germany
[2] Univ Freiburg, Inst Med Biometry & Stat, Fac Med, Stefan Meier Str 26, D-79104 Freiburg, Germany
[3] Univ Freiburg, Med Ctr, Stefan Meier Str 26, D-79104 Freiburg, Germany
[4] Univ Freiburg, Fac Med, Inst Genet Epidemiol, Hugstetter Str 49, D-79106 Freiburg, Germany
[5] Univ Freiburg, Med Ctr, Hugstetter Str 49, D-79106 Freiburg, Germany
[6] Loyola Univ Chicago, Dept Publ Hlth Sci, Stritch Sch Med, Maywood, IL USA
Keywords
False discovery rate; Simulation study; Low-dimensional setting; Q-value method; Rejective multiple test; Association; Tests
DOI
10.1186/s12859-018-2081-x
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Background: When many (up to millions of) statistical tests are conducted in discovery set analyses such as genome-wide association studies (GWAS), approaches controlling the family-wise error rate (FWER) or the false discovery rate (FDR) are required to reduce the number of false positive decisions. Some methods were developed specifically for high-dimensional settings and partially rely on estimating the proportion of true null hypotheses. However, these approaches are also applied in low-dimensional settings such as replication set analyses, which may be restricted to a small number of specific hypotheses. The aim of this study was to compare different approaches in low-dimensional settings using (a) real data from the CKDGen Consortium and (b) a simulation study. Results: In both the application and the simulation, FWER approaches were less powerful than FDR control methods, regardless of whether a large or a small number of hypotheses was tested. The q-value method was the most powerful. However, the specificity of this method, that is, its ability to retain true null hypotheses, was markedly reduced when the number of tested hypotheses was small. In this low-dimensional situation, estimation of the proportion of true null hypotheses was biased. Conclusions: The results highlight the importance of a sizeable data set for reliable estimation of the proportion of true null hypotheses. Consequently, methods relying on this estimation should only be applied in high-dimensional settings. Furthermore, if the focus lies on testing a small number of hypotheses, as in replication settings, FWER methods rather than FDR methods should be preferred to maintain high specificity.
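The contrast the abstract draws between FWER control, FDR control, and methods that estimate the proportion of true null hypotheses can be illustrated with a minimal Python sketch. It is not the authors' code: it uses the textbook Bonferroni correction, the Benjamini-Hochberg step-up procedure, and a Storey-type estimator of the null proportion at a single fixed tuning value `lam = 0.5`, which is an assumption (the q-value method evaluated in the study may choose this value differently). The eight p-values are hypothetical and were chosen only to mimic a small replication set.

```python
def bonferroni(pvals, alpha=0.05):
    """FWER control: reject H_i if p_i <= alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """FDR control (step-up): find the largest rank k with
    p_(k) <= k * alpha / m and reject the k smallest p-values."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


def storey_pi0(pvals, lam=0.5):
    """Storey-type estimate of the proportion of true nulls,
    using one fixed lambda (an assumption of this sketch)."""
    m = len(pvals)
    return min(1.0, sum(p > lam for p in pvals) / ((1.0 - lam) * m))


# Hypothetical p-values for a small set of m = 8 hypotheses.
pvals = [0.001, 0.008, 0.012, 0.03, 0.04, 0.2, 0.5, 0.9]

n_bonf = sum(bonferroni(pvals))        # number of Bonferroni rejections
n_bh = sum(benjamini_hochberg(pvals))  # number of BH rejections
pi0_hat = storey_pi0(pvals)            # rests on a single p-value above lam

print(n_bonf, n_bh, pi0_hat)
```

With so few hypotheses, the estimate of the null proportion depends on how many p-values happen to exceed `lam` (here only one), which is consistent with the abstract's warning that this estimation is unreliable in low-dimensional settings.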
Pages: 10