AdaPT: an interactive procedure for multiple testing with side information

Cited by: 91
Authors
Lei, Lihua [1]
Fithian, William [1]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
Keywords
Adaptive inference; False discovery rate; Martingales; Multiple testing; p-value weighting; Selective inference; FALSE DISCOVERY RATE; INCREASES DETECTION POWER; MIXTURE MODEL; MICROARRAY; INFERENCE;
DOI
10.1111/rssb.12274
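Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract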
We consider the problem of multiple-hypothesis testing with generic side information: for each hypothesis H_i we observe both a p-value p_i and some predictor x_i encoding contextual information about the hypothesis. For large-scale problems, adaptively focusing power on the more promising hypotheses (those more likely to yield discoveries) can lead to much more powerful multiple-testing procedures. We propose a general iterative framework for this problem, the adaptive p-value thresholding procedure, which we call AdaPT, which adaptively estimates a Bayes optimal p-value rejection threshold and controls the false discovery rate in finite samples. At each iteration of the procedure, the analyst proposes a rejection threshold and observes partially censored p-values, estimates the false discovery proportion below the threshold and proposes another threshold, until the estimated false discovery proportion is below α. Our procedure is adaptive in an unusually strong sense, permitting the analyst to use any statistical or machine learning method she chooses to estimate the optimal threshold, and to switch between different models at each iteration as information accrues. We demonstrate the favourable performance of AdaPT by comparing it with state-of-the-art methods in five real applications and two simulation studies.
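The iteration described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not spell out the formulas, so the sketch assumes the estimator (1 + A_t) / max(R_t, 1), where R_t counts p-values at or below the threshold and A_t counts p-values at or above one minus the threshold, and it assumes that censored p-values are revealed only as min(p_i, 1 - p_i). The function names and the `start` and `shrink` parameters are hypothetical, and the multiplicative shrink is only a placeholder for the model-based threshold update that would use x_i and the censored p-values.

```python
import numpy as np


def estimated_fdp(pvals, thresholds):
    """Assumed FDP estimate: (1 + A_t) / max(R_t, 1)."""
    rejected = pvals <= thresholds          # candidate rejections, R_t
    mirrored = pvals >= 1.0 - thresholds    # mirror region, A_t
    return (1 + mirrored.sum()) / max(rejected.sum(), 1)


def masked_pvalues(pvals, thresholds):
    """What the analyst is assumed to see: p-values in the rejection or
    mirror region are revealed only as min(p, 1 - p)."""
    hidden = (pvals <= thresholds) | (pvals >= 1.0 - thresholds)
    return np.where(hidden, np.minimum(pvals, 1.0 - pvals), pvals)


def adapt_sketch(pvals, alpha=0.1, start=0.45, shrink=0.9, max_iter=500):
    """Shrink a threshold until the estimated FDP is at most alpha.

    Hypothetical simplification: one constant threshold for every
    hypothesis, shrunk by a fixed factor. A real analysis would refit a
    model on (x_i, masked p-values) to propose each new threshold.
    """
    pvals = np.asarray(pvals, dtype=float)
    thresholds = np.full(pvals.shape, start)
    for _ in range(max_iter):
        if estimated_fdp(pvals, thresholds) <= alpha:
            return np.flatnonzero(pvals <= thresholds)
        # Placeholder update: thresholds may only decrease.
        thresholds = thresholds * shrink
    return np.array([], dtype=int)  # no threshold met the target
```

Calling `adapt_sketch(pvals, alpha=0.1)` returns the indices of the rejected hypotheses, or an empty array when no threshold along the shrinking path brings the estimate down to `alpha`.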
Pages: 649-679
Page count: 31