Rediscovery rate estimation for assessing the validation of significant findings in high-throughput studies

Cited by: 25
Authors
Ganna, Andrea [1, 2]
Lee, Donghwan [3 ]
Ingelsson, Erik [4 ]
Pawitan, Yudi [1 ]
Affiliations
[1] Karolinska Inst, Dept Med Epidemiol & Biostat, Stockholm, Sweden
[2] Uppsala Univ, Mol Epidemiol & Sci Life Lab, Dept Med Sci, Uppsala, Sweden
[3] Ewha Womans Univ, Dept Stat, Seoul, South Korea
[4] Uppsala Univ, Dept Med Sci, Mol Epidemiol & Sci Life Lab, Mol Epidemiol, Uppsala, Sweden
Funding
Swedish Research Council
Keywords
statistical validation; rediscovery rate; false discovery rate; multiple testing; metabolomics; GENOME-WIDE ASSOCIATION; EFFECT SIZES; REPLICATION; BIOMARKERS;
DOI
10.1093/bib/bbu033
Chinese Library Classification number
Q5 [Biochemistry]
Subject classification codes
071010 ; 081704 ;
Abstract
It is common and recommended practice in biomedical research to validate experimental or observational findings in a population different from the one in which the findings were initially assessed. This practice increases the generalizability of the results and decreases the likelihood of reporting false-positive findings. Validation becomes critical in high-throughput experiments, where the large number of tests increases the chance of observing false-positive results. In this article, we review common approaches for determining statistical thresholds for validation and describe the factors influencing the proportion of significant findings from a 'training' sample that are replicated in a 'validation' sample. We refer to this proportion as the rediscovery rate (RDR). In high-throughput studies, the RDR is a function of the false-positive rate and the power in both the training and validation samples. We illustrate the application of the RDR using simulated data and real data examples from metabolomics experiments. We further describe an online tool to calculate the RDR using t-statistics. We foresee two main applications. First, if the validation sample has not yet been collected, the RDR can be used to decide the optimal combination of the proportion of findings taken to validation and the size of the validation study. Second, if a validation study has already been done, the RDR estimated from the training data can be compared with the observed RDR from the validation data, so that the success of the validation study can be assessed.
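To make the abstract's statement that the RDR depends on the false-positive rate and power in both samples more concrete, the following is a minimal sketch under a simplified two-group model. It is not taken from the article and is not the authors' online tool. The function name `rediscovery_rate`, the parameter `pi0` (the assumed proportion of true null features), and the assumption that training and validation results are independent given a feature's true status are all illustrative assumptions.

```python
def rediscovery_rate(pi0, alpha_train, power_train, alpha_valid, power_valid):
    """Proportion of training-significant features expected to be significant
    again in validation, under an assumed two-group model.

    pi0          -- assumed proportion of true null features (hypothetical input)
    alpha_train  -- false-positive rate in the training sample
    power_train  -- power in the training sample
    alpha_valid  -- false-positive rate in the validation sample
    power_valid  -- power in the validation sample

    Assumes the two samples are independent given each feature's true status.
    """
    # Expected fraction of all features that are significant in training.
    sig_train = pi0 * alpha_train + (1 - pi0) * power_train
    if sig_train == 0:
        raise ValueError("no features are expected to be significant in training")
    # Expected fraction that are significant in both training and validation.
    sig_both = (pi0 * alpha_train * alpha_valid
                + (1 - pi0) * power_train * power_valid)
    return sig_both / sig_train


# Illustrative values only: 90% nulls, 5% false-positive rate, 80% power.
rdr = rediscovery_rate(pi0=0.9, alpha_train=0.05, power_train=0.8,
                       alpha_valid=0.05, power_valid=0.8)
print(f"{rdr:.2f}")
```

Under these assumptions the two limiting cases behave as expected: when no feature is null (`pi0 = 0`) the value equals the validation power, and when every feature is null (`pi0 = 1`) it equals the validation false-positive rate.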
Pages: 563-575
Number of pages: 13