Statistical Significance Versus Clinical Importance of Observed Effect Sizes: What Do P Values and Confidence Intervals Really Represent?

Cited by: 132
Authors
Schober, Patrick [1]
Bossers, Sebastiaan M. [1]
Schwarte, Lothar A. [1]
Affiliations
[1] Vrije Univ Amsterdam, Med Ctr, Dept Anesthesiol, De Boelelaan 1117, NL-1081 HV Amsterdam, Netherlands
Keywords
MAGIC MIRROR; WALL-WHICH; GUIDE; DESIGN; POWER;
DOI
10.1213/ANE.0000000000002798
Chinese Library Classification number
R614 [Anesthesiology];
Discipline classification code
100217;
Abstract
Effect size measures are used to quantify treatment effects or associations between variables. Such measures, of which >70 have been described in the literature, include unstandardized and standardized differences in means, risk differences, risk ratios, odds ratios, or correlations. While null hypothesis significance testing is the predominant approach to statistical inference on effect sizes, results of such tests are often misinterpreted, provide no information on the magnitude of the estimate, and tell us nothing about the clinical importance of an effect. Hence, researchers should not merely focus on statistical significance but should also report the observed effect size. However, all samples are to some degree affected by randomness, such that there is a certain uncertainty about how well the observed effect size represents the actual magnitude and direction of the effect in the population. Therefore, point estimates of effect sizes should be accompanied by the entire range of plausible values to quantify this uncertainty. This facilitates assessment of how large or small the observed effect could actually be in the population of interest, and hence how clinically important it could be. This tutorial reviews different effect size measures and describes how confidence intervals can be used to address not only the statistical significance but also the clinical significance of the observed effect or association. Moreover, we discuss what P values actually represent, and how they provide supplemental information about the significant versus nonsignificant dichotomy. This tutorial intentionally focuses on an intuitive explanation of concepts and interpretation of results, rather than on the underlying mathematical theory or concepts.
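The idea in the abstract, reading a confidence interval against both zero and a clinically relevant threshold, can be illustrated with a minimal Python sketch. It is not taken from the tutorial itself. The function names, the sample values, and the threshold `mcid` (a minimal clinically important difference) are hypothetical. The interval uses a normal approximation for simplicity; for small samples a t-based interval would be the more appropriate choice.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_difference_ci(a, b, level=0.95):
    """Unstandardized difference in means (a - b) with a normal-approximation CI.

    Assumption: a z-based interval is used for brevity; with small samples
    a t-based interval would be wider and more appropriate.
    """
    diff = mean(a) - mean(b)
    # Standard error of the difference for two independent samples.
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


def interpret(lower, upper, mcid):
    """Classify an interval against zero and a hypothetical threshold mcid."""
    if lower <= 0 <= upper:
        return "not statistically significant"
    # Both bounds share a sign here, so min(|.|) is the bound nearest zero.
    if min(abs(lower), abs(upper)) >= mcid:
        return "significant; whole interval clinically important"
    if max(abs(lower), abs(upper)) < mcid:
        return "significant; clinically unimportant"
    return "significant; clinical importance uncertain"


# Hypothetical measurements, invented purely for illustration.
treatment = [36.9, 37.1, 37.0, 37.2, 36.8, 37.0]
control = [36.5, 36.7, 36.6, 36.8, 36.4, 36.6]

diff, lower, upper = mean_difference_ci(treatment, control)
print(f"difference = {diff:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print(interpret(lower, upper, mcid=0.5))
```

The same interval can lead to different clinical conclusions depending on the threshold chosen, which is why the threshold should be justified on clinical grounds before the data are examined.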
Pages: 1068-1072
Page count: 5
Related papers
29 in total
  • [1] Altman D., 1999, Practical Statistics for Medical Research, P152
  • [2] Why we need confidence intervals
    Altman, DG
    [J]. WORLD JOURNAL OF SURGERY, 2005, 29 (05) : 554 - 556
  • [3] ALTMAN DG, 1985, J ROY STAT SOC D-STA, V34, P125
  • [4] Guide for calculating and interpreting effect sizes and confidence intervals in intellectual and developmental disability research studies
    Dunst, Carl J.
    Hamby, Deborah W.
    [J]. JOURNAL OF INTELLECTUAL & DEVELOPMENTAL DISABILITY, 2012, 37 (02) : 89 - 99
  • [5] Ellis PD, 2010, ESSENTIAL GUIDE TO EFFECT SIZES: STATISTICAL POWER, META-ANALYSIS AND THE INTERPRETATION OF RESEARCH RESULTS, P3
  • [6] Local Insufflation of Warm Humidified CO2 Increases Open Wound and Core Temperature During Open Colon Surgery: A Randomized Clinical Trial
    Frey, Joana M.
    Janson, Martin
    Svanfeldt, Monika
    Svenarud, Peter K.
    van der Linden, Jan A.
    [J]. ANESTHESIA AND ANALGESIA, 2012, 115 (05) : 1204 - 1211
  • [7] Fritz CO, 2012, J EXP PSYCHOL GEN, V141, P2, DOI 10.1037/a0024338
  • [8] A dirty dozen: Twelve P-value misconceptions
    Goodman, Steven
    [J]. SEMINARS IN HEMATOLOGY, 2008, 45 (03) : 135 - 140
  • [9] Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations
    Greenland, Sander
    Senn, Stephen J.
    Rothman, Kenneth J.
    Carlin, John B.
    Poole, Charles
    Goodman, Steven N.
    Altman, Douglas G.
    [J]. EUROPEAN JOURNAL OF EPIDEMIOLOGY, 2016, 31 (04) : 337 - 350
  • [10] Understanding the effect size and its measures
    Ialongo, Cristiano
    [J]. BIOCHEMIA MEDICA, 2016, 26 (02) : 150 - 163