Generalized α-investing: definitions, optimality results and application to public databases

Cited by: 45
Authors
Aharoni, Ehud [1]
Rosset, Saharon [2]
Affiliations
[1] IBM Res Lab Haifa, IL-31905 Haifa, Israel
[2] Tel Aviv Univ, IL-69978 Tel Aviv, Israel
Keywords
alpha-investing; alpha-spending; False discovery rate; Familywise error rate; Multiple comparisons; Association
DOI
10.1111/rssb.12048
Chinese Library Classification
O21 [Probability and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
The increasing prevalence and utility of large public databases necessitate appropriate methods for controlling false discovery. Motivated by this challenge, we discuss the generic problem of testing a possibly infinite stream of null hypotheses. In this context, Foster and Stine suggested a novel method named alpha-investing for controlling a false discovery measure known as mFDR. We develop a more general procedure for controlling mFDR, of which alpha-investing is a special case. We show that, in common practical situations, the general procedure can be optimized to produce an expected reward optimal version, which is more powerful than alpha-investing. We then present the concept of quality preserving databases, originally introduced by Aharoni and co-workers, which formalizes efficient public database management to save costs and control false discovery simultaneously. We show how one variant of generalized alpha-investing can be used to control mFDR in a quality preserving database and can lead to a significant reduction in costs compared with the naive approaches for controlling the familywise error rate implemented by Aharoni and co-workers.
Pages: 771-794
Page count: 24
Related papers
18 in total
[1]   The Quality Preserving Database: A Computational Framework for Encouraging Collaboration, Enhancing Power and Controlling False Discovery [J].
Aharoni, Ehud ;
Neuvirth, Hani ;
Rosset, Saharon .
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2011, 8 (05) :1431-1437
[2]   The influenza virus resource at the national center for biotechnology information [J].
Bao, Yiming ;
Bolotov, Pavel ;
Dernovoy, Dmitry ;
Kiryutin, Boris ;
Zaslavsky, Leonid ;
Tatusova, Tatiana ;
Ostell, Jim ;
Lipman, David .
JOURNAL OF VIROLOGY, 2008, 82 (02) :596-601
[3]   Benjamini Y, 2001, ANN STAT, V29, P1165
[4]   CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING [J].
BENJAMINI, Y ;
HOCHBERG, Y .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) :289-300
[5]   Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls [J].
Burton, Paul R. ;
Clayton, David G. ;
Cardon, Lon R. ;
Craddock, Nick ;
Deloukas, Panos ;
Duncanson, Audrey ;
Kwiatkowski, Dominic P. ;
McCarthy, Mark I. ;
Ouwehand, Willem H. ;
Samani, Nilesh J. ;
Todd, John A. ;
Donnelly, Peter ;
Barrett, Jeffrey C. ;
Davison, Dan ;
Easton, Doug ;
Evans, David ;
Leung, Hin-Tak ;
Marchini, Jonathan L. ;
Morris, Andrew P. ;
Spencer, Chris C. A. ;
Tobin, Martin D. ;
Attwood, Antony P. ;
Boorman, James P. ;
Cant, Barbara ;
Everson, Ursula ;
Hussey, Judith M. ;
Jolley, Jennifer D. ;
Knight, Alexandra S. ;
Koch, Kerstin ;
Meech, Elizabeth ;
Nutland, Sarah ;
Prowse, Christopher V. ;
Stevens, Helen E. ;
Taylor, Niall C. ;
Walters, Graham R. ;
Walker, Neil M. ;
Watkins, Nicholas A. ;
Winzer, Thilo ;
Jones, Richard W. ;
McArdle, Wendy L. ;
Ring, Susan M. ;
Strachan, David P. ;
Pembrey, Marcus ;
Breen, Gerome ;
St Clair, David ;
Caesar, Sian ;
Gordon-Smith, Katherine ;
Jones, Lisa ;
Fraser, Christine ;
Green, Elain K. .
NATURE, 2007, 447 (7145) :661-678
[6]   Genome-wide association study of CNVs in 16,000 cases of eight common diseases and 3,000 shared controls [J].
Craddock, Nick ;
Hurles, Matthew E. ;
Cardin, Niall ;
Pearson, Richard D. ;
Plagnol, Vincent ;
Robson, Samuel ;
Vukcevic, Damjan ;
Barnes, Chris ;
Conrad, Donald F. ;
Giannoulatou, Eleni ;
Holmes, Chris ;
Marchini, Jonathan L. ;
Stirrups, Kathy ;
Tobin, Martin D. ;
Wain, Louise V. ;
Yau, Chris ;
Aerts, Jan ;
Ahmad, Tariq ;
Andrews, T. Daniel ;
Arbury, Hazel ;
Attwood, Anthony ;
Auton, Adam ;
Ball, Stephen G. ;
Balmforth, Anthony J. ;
Barrett, Jeffrey C. ;
Barroso, Ines ;
Barton, Anne ;
Bennett, Amanda J. ;
Bhaskar, Sanjeev ;
Blaszczyk, Katarzyna ;
Bowes, John ;
Brand, Oliver J. ;
Braund, Peter S. ;
Bredin, Francesca ;
Breen, Gerome ;
Brown, Morris J. ;
Bruce, Ian N. ;
Bull, Jaswinder ;
Burren, Oliver S. ;
Burton, John ;
Byrnes, Jake ;
Caesar, Sian ;
Clee, Chris M. ;
Coffey, Alison J. ;
Connell, John M. C. ;
Cooper, Jason D. ;
Dominiczak, Anna F. ;
Downes, Kate ;
Drummond, Hazel E. ;
Dudakia, Darshna .
NATURE, 2010, 464 (7289) :713-U86
[7]   α-investing: a procedure for sequential control of expected false discoveries [J].
Foster, Dean P. ;
Stine, Robert A. .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2008, 70 :429-444
[8]   CURRENT CONCEPTS Genomewide Association Studies and Human Disease [J].
Hardy, John ;
Singleton, Andrew .
NEW ENGLAND JOURNAL OF MEDICINE, 2009, 360 (17) :1759-1768
[9]   Genome-wide association studies for common diseases and complex traits [J].
Hirschhorn, JN ;
Daly, MJ .
NATURE REVIEWS GENETICS, 2005, 6 (02) :95-108
[10]   Why most published research findings are false [J].
Ioannidis, JPA .
PLOS MEDICINE, 2005, 2 (08) :696-701