Controlling the Rate of GWAS False Discoveries

Cited by: 79
Authors
Brzyski, Damian [1, 2]
Peterson, Christine B. [3]
Sobczyk, Piotr [4]
Candes, Emmanuel J. [5]
Bogdan, Malgorzata [7]
Sabatti, Chiara [6]
Affiliations
[1] Jagiellonian Univ, Inst Math, PL-30348 Krakow, Poland
[2] Indiana Univ, Dept Epidemiol & Biostat, Bloomington, IN 47405 USA
[3] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
[4] Wroclaw Univ Sci & Technol, Fac Pure & Appl Math, PL-50370 Wroclaw, Poland
[5] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
[6] Stanford Univ, Dept Biomed Data Sci, Hlth Res & Policy Redwood Bldg, Stanford, CA 94305 USA
[7] Univ Wroclaw, Inst Math, PL-50384 Wroclaw, Poland
Funding
U.S. National Institutes of Health
Keywords
association studies; multiple penalized regression; linkage disequilibrium; FDR; GENOME-WIDE ASSOCIATION; VARIABLE SELECTION; HERITABILITY; VARIANTS; LINKAGE; TRAITS; COMMON; LOCI; TOOL;
DOI
10.1534/genetics.116.193987
Chinese Library Classification (CLC) number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated to multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection both with theoretical results and simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure on standard GWAS format data, and illustrate its performance on lipid traits in the North Finland Birth Cohort 66 cohort study.
Pages: 61-75
Page count: 15
Related papers
49 in total
[1]   Adapting to unknown sparsity by controlling the false discovery rate [J].
Abramovich, Felix ;
Benjamini, Yoav ;
Donoho, David L. ;
Johnstone, Iain M. .
ANNALS OF STATISTICS, 2006, 34 (02) :584-653
[2]   Stability Selection for Genome-Wide Association [J].
Alexander, David H. ;
Lange, Kenneth .
GENETIC EPIDEMIOLOGY, 2011, 35 (07) :722-728
[3]  
[Anonymous], 2016, ARXIV PREPRINT ARXIV
[4]   The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans [J].
Ardlie, Kristin G. ;
DeLuca, David S. ;
Segre, Ayellet V. ;
Sullivan, Timothy J. ;
Young, Taylor R. ;
Gelfand, Ellen T. ;
Trowbridge, Casandra A. ;
Maller, Julian B. ;
Tukiainen, Taru ;
Lek, Monkol ;
Ward, Lucas D. ;
Kheradpour, Pouya ;
Iriarte, Benjamin ;
Meng, Yan ;
Palmer, Cameron D. ;
Esko, Tonu ;
Winckler, Wendy ;
Hirschhorn, Joel N. ;
Kellis, Manolis ;
MacArthur, Daniel G. ;
Getz, Gad ;
Shabalin, Andrey A. ;
Li, Gen ;
Zhou, Yi-Hui ;
Nobel, Andrew B. ;
Rusyn, Ivan ;
Wright, Fred A. ;
Lappalainen, Tuuli ;
Ferreira, Pedro G. ;
Ongen, Halit ;
Rivas, Manuel A. ;
Battle, Alexis ;
Mostafavi, Sara ;
Monlong, Jean ;
Sammeth, Michael ;
Mele, Marta ;
Reverter, Ferran ;
Goldmann, Jakob M. ;
Koller, Daphne ;
Guigo, Roderic ;
McCarthy, Mark I. ;
Dermitzakis, Emmanouil T. ;
Gamazon, Eric R. ;
Im, Hae Kyung ;
Konkashbaev, Anuar ;
Nicolae, Dan L. ;
Cox, Nancy J. ;
Flutre, Timothee ;
Wen, Xiaoquan ;
Stephens, Matthew .
SCIENCE, 2015, 348 (6235) :648-660
[5]   False discovery rate-adjusted multiple confidence intervals for selected parameters [J].
Benjamini, Y ;
Yekutieli, D .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2005, 100 (469) :71-81
[6]  
Benjamini Y, 2001, ANN STAT, V29, P1165
[7]   Quantitative trait loci analysis using the false discovery rate [J].
Benjamini, Y ;
Yekutieli, D .
GENETICS, 2005, 171 (02) :783-789
[8]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[9]   False discovery rates for spatial signals [J].
Benjamini, Yoav ;
Heller, Ruth .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2007, 102 (480) :1272-1281
[10]   Selective inference on multiple families of hypotheses [J].
Benjamini, Yoav ;
Bogomolov, Marina .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2014, 76 (01) :297-318