The Quality Preserving Database: A Computational Framework for Encouraging Collaboration, Enhancing Power and Controlling False Discovery

Cited by: 5
Authors
Aharoni, Ehud [1 ]
Neuvirth, Hani [1 ]
Rosset, Saharon [2 ]
Affiliations
[1] IBM Res Lab Haifa, Machine Learning & Data Min Grp, IL-31905 Haifa, Israel
[2] Tel Aviv Univ, Sch Math Sci, IL-69978 Tel Aviv, Israel
Keywords
Family-wise error rate; multiple comparisons; Bonferroni method
DOI
10.1109/TCBB.2010.105
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The common scenario in computational biology in which a community of researchers conduct multiple statistical tests on one shared database gives rise to the multiple hypothesis testing problem. Conventional procedures for solving this problem control the probability of false discovery by sacrificing some of the power of the tests. We suggest a scheme for controlling false discovery without any power loss by adding new samples for each use of the database and charging the user with the expenses. The crux of the scheme is a carefully crafted pricing system that fairly prices different user requests based on their demands while keeping the probability of false discovery bounded. We demonstrate this idea in the context of HIV treatment research, where multiple researchers conduct tests on a repository of HIV samples.
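The trade-off the abstract describes, between a stricter significance level and the number of samples needed to keep power, can be illustrated with a minimal sketch. This is not the paper's pricing system. It assumes a Bonferroni correction, a one-sided z-test with unit variance, and the standard normal-approximation sample-size formula; the function names `bonferroni_level` and `samples_needed` and the parameter values are hypothetical, chosen only for illustration.

```python
from math import ceil
from statistics import NormalDist


def bonferroni_level(family_alpha: float, num_tests: int) -> float:
    """Per-test significance level that bounds the family-wise error rate."""
    return family_alpha / num_tests


def samples_needed(alpha: float, power: float, effect_size: float) -> int:
    """Samples a one-sided z-test (unit variance) needs to reach `power`
    at level `alpha` for a standardized effect of `effect_size`.

    Assumption: the textbook formula n = ((z_{1-alpha} + z_{power}) / d)^2.
    """
    z = NormalDist()  # standard normal distribution
    total = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    return ceil((total / effect_size) ** 2)


if __name__ == "__main__":
    # Illustrative values only: 5% family-wise level, 80% power, effect 0.5.
    for num_tests in (1, 10, 100):
        level = bonferroni_level(0.05, num_tests)
        n = samples_needed(level, power=0.8, effect_size=0.5)
        print(f"tests={num_tests:4d}  per-test level={level:.5f}  samples={n}")
```

Under these assumptions, the required sample count grows as more tests share the same database, which is the cost that a scheme of this kind would pass on to the user instead of accepting lower power.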
Pages: 1431-1437
Page count: 7