Beyond Power Calculations: Assessing Type S (Sign) and Type M (Magnitude) Errors

Cited by: 785
Authors
Gelman, Andrew [1, 2]
Carlin, John [3, 4, 5]
Affiliations
[1] Columbia Univ, Dept Stat, New York, NY 10027 USA
[2] Columbia Univ, Dept Polit Sci, New York, NY 10027 USA
[3] Murdoch Childrens Res Inst, Clin Epidemiol & Biostat Unit, Parkville, Vic, Australia
[4] Univ Melbourne, Dept Paediat, Melbourne, Vic 3010, Australia
[5] Univ Melbourne, Sch Populat & Global Hlth, Melbourne, Vic 3010, Australia
Funding
U.S. National Science Foundation
Keywords
design calculation; exaggeration ratio; power analysis; replication crisis; statistical significance; Type M error; Type S error; LIFE EXPECTANCY; SIZE;
DOI
10.1177/1745691614551642
Chinese Library Classification
B84 [Psychology]
Subject classification code
04; 0402
Abstract
Statistical power analysis provides the conventional approach to assess error rates when designing a research study. However, power analysis is flawed in that a narrow emphasis on statistical significance is placed as the primary focus of study design. In noisy, small-sample settings, statistically significant results can often be misleading. To help researchers address this problem in the context of their own studies, we recommend design calculations in which (a) the probability of an estimate being in the wrong direction (Type S [sign] error) and (b) the factor by which the magnitude of an effect might be overestimated (Type M [magnitude] error or exaggeration ratio) are estimated. We illustrate with examples from recent published research and discuss the largest challenge in a design calculation: coming up with reasonable estimates of plausible effect sizes based on external information.
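The design calculation described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code. It assumes the estimate is normally distributed around a hypothesized true effect with a known standard error, and it uses closed-form normal expressions. The function name `design_analysis` and its parameters (`true_effect`, `std_error`, `alpha`) are hypothetical names chosen for this example.

```python
from statistics import NormalDist


def design_analysis(true_effect, std_error, alpha=0.05):
    """Power, Type S error rate, and exaggeration ratio (Type M).

    Assumption (not from the source record): the estimate is distributed
    Normal(true_effect, std_error**2) and is declared significant when
    |estimate| > z * std_error for a two-sided test at level alpha.
    """
    if true_effect <= 0 or std_error <= 0:
        raise ValueError("true_effect and std_error must be positive")

    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    d = true_effect / std_error            # effect in standard-error units

    # Probability of a significant estimate with the correct (positive) sign.
    p_correct_sign = 1 - std_normal.cdf(z - d)
    # Probability of a significant estimate with the wrong (negative) sign.
    p_wrong_sign = std_normal.cdf(-z - d)

    power = p_correct_sign + p_wrong_sign
    type_s = p_wrong_sign / power

    # E[|estimate|, restricted to significant results], from the mean of a
    # truncated normal distribution, then divided by the power to condition.
    abs_sum = (
        true_effect * (p_correct_sign - p_wrong_sign)
        + std_error * (std_normal.pdf(z - d) + std_normal.pdf(z + d))
    )
    exaggeration = (abs_sum / power) / true_effect

    return {"power": power, "type_s": type_s, "exaggeration": exaggeration}


if __name__ == "__main__":
    # Illustrative inputs only: a small hypothesized effect measured noisily.
    result = design_analysis(true_effect=0.1, std_error=3.28)
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

In a low-power design such as the illustrative call above, power stays close to `alpha`, the Type S rate approaches one half, and the exaggeration ratio is large. As the hypothesized effect grows relative to the standard error, the Type S rate falls toward zero and the exaggeration ratio falls toward one.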
Pages: 641-651
Page count: 11
Related papers
31 in total
  • [1] [Anonymous], 2012, GALL POLL
  • [2] Distinguishing true from false positives in genomic studies: p values
    Broer, Linda
    Lill, Christina M.
    Schuur, Maaike
    Amin, Najaf
    Roehr, Johannes T.
    Bertram, Lars
    Ioannidis, John P. A.
    van Duijn, Cornelia M.
    [J]. EUROPEAN JOURNAL OF EPIDEMIOLOGY, 2013, 28 (02) : 131 - 138
  • [3] Power failure: why small sample size undermines the reliability of neuroscience
    Button, Katherine S.
    Ioannidis, John P. A.
    Mokrysz, Claire
    Nosek, Brian A.
    Flint, Jonathan
    Robinson, Emma S. J.
    Munafo, Marcus R.
    [J]. NATURE REVIEWS NEUROSCIENCE, 2013, 14 (05) : 365 - 376
  • [4] Evidence on the impact of sustained exposure to air pollution on life expectancy from China's Huai River policy
    Chen, Yuyu
    Ebenstein, Avraham
    Greenstone, Michael
    Li, Hongbin
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2013, 110 (32) : 12936 - 12941
  • [5] Cohen J., 1988, Statistical power analysis for the behavioral sciences, 2nd ed.
  • [6] The Fluctuating Female Vote: Politics, Religion, and the Ovulatory Cycle
    Durante, Kristina M.
    Rae, Ashley
    Griskevicius, Vladas
    [J]. PSYCHOLOGICAL SCIENCE, 2013, 24 (06) : 1007 - 1016
  • [7] REEXAMINING THE MINIMAL EFFECTS MODEL IN RECENT PRESIDENTIAL CAMPAIGNS
    FINKEL, SE
    [J]. JOURNAL OF POLITICS, 1993, 55 (01) : 1 - 21
  • [8] Froehlich G W, 1999, Eff Clin Pract, V2, P234
  • [9] An Examination of Stereotype Threat Effects on Girls' Mathematics Performance
    Ganley, Colleen M.
    Mingle, Leigh A.
    Ryan, Allison M.
    Ryan, Katherine
    Vasilyeva, Marina
    Perry, Michelle
    [J]. DEVELOPMENTAL PSYCHOLOGY, 2013, 49 (10) : 1886 - 1897
  • [10] Type S error rates for classical and Bayesian single and multiple comparison procedures
    Gelman, A
    Tuerlinckx, FA
    [J]. COMPUTATIONAL STATISTICS, 2000, 15 (03) : 373 - 390