Use of the p-values as a size-dependent function to address practical differences when analyzing large datasets

Cited by: 49
Authors
Gomez-de-Mariscal, Estibaliz [1, 2]
Guerrero, Vanesa [3]
Sneider, Alexandra [4]
Jayatilaka, Hasini [5]
Phillip, Jude M. [6]
Wirtz, Denis [4, 7]
Munoz-Barrutia, Arrate [1, 2]
Affiliations
[1] Univ Carlos III Madrid, Bioengn & Aerosp Engn Dept, Leganes 28911, Spain
[2] Inst Invest Sanitaria Gregorio Maranon, Madrid 28007, Spain
[3] Univ Carlos III Madrid, Stat Dept, Getafe 28903, Spain
[4] Johns Hopkins Univ, Inst Nanobiotechnol, Dept Chem & Biomol Engn, Baltimore, MD 21218 USA
[5] AtlasXomics Inc, New Haven, CT 06511 USA
[6] Johns Hopkins Univ, Dept Biomed Engn, Baltimore, MD 21218 USA
[7] Johns Hopkins Univ, Sch Med, Dept Oncol, Baltimore, MD 21205 USA
Funding
US National Institutes of Health; US National Science Foundation
Keywords
RANDOM-VARIABLES;
DOI
10.1038/s41598-021-00199-5
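Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract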
Biomedical research has come to rely on p-values as a deterministic measure for data-driven decision-making. In the widely used null hypothesis significance testing framework for identifying statistically significant differences among groups of observations, a single p-value is computed from sample data. It is then routinely compared with a threshold, commonly set to 0.05, to assess the evidence against the null hypothesis, that is, the hypothesis of no significant differences among groups. Because the estimated p-value tends to decrease as the sample size increases, applying this methodology to datasets with large sample sizes tends to result in the rejection of the null hypothesis, which makes the test uninformative in this specific situation. We propose a new approach to detect differences based on the dependence of the p-value on the sample size. We introduce new descriptive parameters that overcome the effect of the sample size on the interpretation of the p-value for datasets with large sample sizes, reducing the uncertainty in the decision about the existence of biological differences between the compared experiments. The methodology enables the graphical and quantitative characterization of the differences between the compared experiments, guiding researchers in the decision process. An in-depth study of the methodology is carried out on simulated and experimental data. Code is available at https://github.com/BIIG-UC3M/pMoSS.
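The size dependence described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' pMoSS code: the function names, the 0.1 standard-deviation shift, the sample sizes, and the use of a large-sample z-test are all assumptions chosen for illustration. It draws two normally distributed groups with a small true difference in means and reports the average p-value at several sample sizes.

```python
import math
import random
import statistics


def sample_variance(xs, mean):
    """Unbiased sample variance of xs around a precomputed mean."""
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def two_sample_p_value(a, b):
    """Two-sided p-value from a large-sample z-test on the difference of means.

    Assumption: the normal approximation stands in for whichever test a
    real analysis would use; it is only approximate for small groups.
    """
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    se = math.sqrt(sample_variance(a, mean_a) / len(a)
                   + sample_variance(b, mean_b) / len(b))
    z = (mean_a - mean_b) / se
    return math.erfc(abs(z) / math.sqrt(2))


def mean_p_value(n, shift, repeats=100, seed=0):
    """Average p-value over repeated draws of two groups of size n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(repeats):
        a = [rng.gauss(0.0, 1.0) for _ in range(n)]
        b = [rng.gauss(shift, 1.0) for _ in range(n)]
        total += two_sample_p_value(a, b)
    return total / repeats


# Hypothetical settings: a small true shift of 0.1 standard deviations.
sizes = [10, 50, 200, 1000]
curve = {n: mean_p_value(n, shift=0.1) for n in sizes}
for n, p in curve.items():
    print(f"n={n:5d}  mean p-value={p:.3f}")
```

With a fixed small shift, the average p-value falls as n grows, so a 0.05 threshold is eventually crossed regardless of whether the difference matters in practice. Treating the p-value as a function of n, rather than as a single number, is the starting point the abstract describes.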
Pages: 13