Costs and Benefits of Popular P-Value Correction Methods in Three Models of Quantitative Omic Experiments

Cited by: 18
Authors
Shuken, Steven R. [3 ,4 ,5 ]
McNerney, M. Windy [1 ,2 ]
Affiliations
[1] Vet Affairs Palo Alto Hlth Care Syst, Mental Illness Res Educ & Clin Ctr MIRECC, Palo Alto, CA 94304 USA
[2] Stanford Univ, Dept Psychiat & Behav Sci, Sch Med, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Chem, Stanford, CA 94305 USA
[4] Stanford Univ, Sch Med, Dept Neurol & Neurol Sci, Stanford, CA 94305 USA
[5] Stanford Univ, Wu Tsai Neurosci Inst, Stanford, CA 94305 USA
Keywords
Compendex;
DOI
10.1021/acs.analchem.2c03719
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification code
070302 ; 081704 ;
Abstract
The multiple hypothesis testing problem is inherent in large-scale quantitative "omic" experiments such as mass spectrometry-based proteomics. Yet, tools for comparing the costs and benefits of different p-value correction methods under different experimental conditions are lacking. We performed thousands of simulations of omic experiments under a range of experimental conditions and applied correction using the Benjamini-Hochberg (BH), Bonferroni, and permutation-based false discovery proportion (FDP) estimation methods. The tremendous false discovery rate (FDR) benefit of correction was confirmed in a range of different contexts. No correction method can guarantee a low FDP in a single experiment, but the probability of a high FDP is small when a high number and proportion of corrected p-values are significant. On average, correction decreased sensitivity, but the sensitivity costs of BH and permutation were generally modest compared to the FDR benefits. In a given experiment, observed sensitivity was always maintained or decreased by BH and Bonferroni, whereas it was often increased by permutation. Overall, permutation had better FDR and sensitivity than BH. We show how increasing sample size, decreasing variability, or increasing effect size can enable the detection of all true changes while still correcting p-values, and we present basic guidelines for omic experimental design. Analysis of an experimental proteomic data set with defined changes corroborated these trends. We developed an R Shiny web application for further exploration and visualization of these models, which we call the Simulator of P-value Multiple Hypothesis Correction (SIMPLYCORRECT), and a high-performance R package, permFDP, for easy use of the permutation-based FDP estimation method.
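As a rough illustration of the two closed-form corrections named in the abstract, the Python sketch below computes Bonferroni- and Benjamini-Hochberg-adjusted p-values in their standard textbook form. It is not the authors' code (their tools, SIMPLYCORRECT and permFDP, are written in R), and the function names `bonferroni` and `benjamini_hochberg` and the example p-values are hypothetical. The permutation-based FDP estimation method is not sketched here, because the abstract does not describe its procedure.

```python
def bonferroni(pvals):
    """Bonferroni adjustment: multiply each p-value by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjustment (standard step-up form).

    The p-value with ascending rank k is scaled by m / k, then a running
    minimum taken from the largest p-value downward keeps the adjusted
    values monotone in the original p-values.
    """
    m = len(pvals)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values for illustration only.
    raw = [0.01, 0.04, 0.03, 0.005]
    alpha = 0.05
    for name, adj in (("Bonferroni", bonferroni(raw)),
                      ("BH", benjamini_hochberg(raw))):
        n_sig = sum(p <= alpha for p in adj)
        print(name, [round(p, 4) for p in adj], "significant:", n_sig)
```

On these four made-up values, BH leaves every test at or below 0.05 while Bonferroni pushes two of them above it, which is one small instance of the sensitivity cost the abstract weighs against the FDR benefit.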
Pages: 2732-2740
Number of pages: 9