How prevalent is overfitting of regression models? A survey of recent articles in three psychology journals

Cited by: 5
Authors
Dalicandro, Lauren [1]
Harder, Jane A. [1]
Mazmanian, Dwight [1]
Weaver, Bruce [1, 2]
Affiliations
[1] Lakehead Univ, Thunder Bay, ON, Canada
[2] Northern Ontario Sch Med, Sudbury, ON, Canada
Source
QUANTITATIVE METHODS FOR PSYCHOLOGY | 2021 / Vol. 17 / No. 01
Keywords
overfitting; replication crisis; regression; statistical best practices
DOI
10.20982/tqmp.17.1.p001
Chinese Library Classification
C [Social Sciences, General]
Subject classification code
03; 0303
Abstract
Since 2011, there has been much discussion and concern about a "replication crisis" in psychology. An inability to reproduce findings in new samples can undermine even basic tenets of psychology. Much attention has been paid to the following practices, which Bishop (2019) described as "the four horsemen of the reproducibility apocalypse": publication bias, low statistical power, p-hacking (Simmons et al., 2011), and HARKing (i.e., hypothesizing after the results are known; Kerr, 1998). Another practice that has received less attention is overfitting of regression models. Babyak (2004) described overfitting as "capitalizing on the idiosyncratic characteristics of the sample at hand", and argued that it results in findings that "don't really exist in the population and hence will not replicate." The following common data-analytic practices increase the likelihood of model overfitting: having too few observations (or events) per explanatory variable (OPV/EPV); automated algorithmic selection of variables; univariable pretesting of candidate predictor variables; categorization of quantitative variables; and sequential testing of multiple confounders. We reviewed 170 recent articles from three major psychology journals and found that 96 of them included at least one of the types of regression models Babyak (2004) discussed. We reviewed those 96 articles more fully and found that they reported 286 regression models. Regarding OPV/EPV, Babyak recommended 10-15 as the minimum number needed to reduce the likelihood of overfitting. When we used 10 OPV/EPV as the cut-off, 97 of the 286 models (33.9%) used at least one practice that leads to overfitting; when we used 15 OPV/EPV as the cut-off, that number rose to 109 models (38.1%). The most frequently occurring practice that yields overfitted models was univariable pretesting of candidate predictor variables: It was found in 61 of the 286 models (21.3%).
These findings suggest that overfitting of regression models remains a problem in psychology research, and that we must increase our efforts to educate researchers and students about this important issue.
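The OPV/EPV criterion and the percentages quoted in the abstract can be illustrated with a minimal sketch. The function names, the example sample size (120 observations, 10 predictors), and the choice to treat the cut-off as a strict lower bound are assumptions made for illustration; they are not taken from the article. For models with a binary outcome, the count passed in would be the number of events rather than the number of observations.

```python
def units_per_variable(n_units: int, n_predictors: int) -> float:
    """Observations (or events) per explanatory variable.

    n_units: number of observations, or number of events for a
             binary-outcome model (hypothetical parameter name).
    n_predictors: number of explanatory variables in the model.
    """
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return n_units / n_predictors


def below_cutoff(n_units: int, n_predictors: int, cutoff: float = 10) -> bool:
    """True when the ratio falls below the chosen cut-off (assumed strict)."""
    return units_per_variable(n_units, n_predictors) < cutoff


def percent(count: int, total: int) -> float:
    """Share of `count` in `total`, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


# Hypothetical model: 120 observations, 10 predictors, so the ratio is 12.
ratio = units_per_variable(120, 10)
flag_at_10 = below_cutoff(120, 10, cutoff=10)  # 12 is not below 10
flag_at_15 = below_cutoff(120, 10, cutoff=15)  # 12 is below 15

# Percentages reported in the abstract, recomputed from the stated counts.
share_cutoff_10 = percent(97, 286)
share_cutoff_15 = percent(109, 286)
share_pretesting = percent(61, 286)
```

The hypothetical model passes the 10-per-variable cut-off but not the 15-per-variable one, which mirrors why the abstract's count of flagged models rises when the stricter cut-off is applied.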
Pages: 1 / 6
Page count: 6
Related papers
14 in total
[1]  
[Anonymous], 2014, BIOSTATISTICS BARE E
[2]  
[Anonymous], 2015, SCIENCE, DOI 10.1126/science.aac4716
[3]   What you see may not be what you get: A brief, nontechnical introduction to overfitting in regression-type models [J].
Babyak, MA.
PSYCHOSOMATIC MEDICINE, 2004, 66 (03) :411-421
[4]   Feeling the Future: Experimental Evidence for Anomalous Retroactive Influences on Cognition and Affect [J].
Bem, Daryl J.
JOURNAL OF PERSONALITY AND SOCIAL PSYCHOLOGY, 2011, 100 (03) :407-425
[5]   Rein in the four horsemen of irreproducibility [J].
Bishop, Dorothy.
NATURE, 2019, 568 (7753) :435-435
[6]   Multivariable Models in Biobehavioral Research [J].
Freedland, Kenneth E.;
Reese, Rebecca L.;
Steinmeyer, Brian C.
PSYCHOSOMATIC MEDICINE, 2009, 71 (02) :205-216
[7]   The Statistical Crisis in Science [J].
Gelman, Andrew;
Loken, Eric.
AMERICAN SCIENTIST, 2014, 102 (06) :460-465
[8]  
Kerr N L, 1998, Pers Soc Psychol Rev, V2, P196, DOI 10.1207/s15327957pspr0203_4
[9]   Performing high-powered studies efficiently with sequential analyses [J].
Lakens, Daniel.
EUROPEAN JOURNAL OF SOCIAL PSYCHOLOGY, 2014, 44 (07) :701-710
[10]   False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant [J].
Simmons, Joseph P.;
Nelson, Leif D.;
Simonsohn, Uri.
PSYCHOLOGICAL SCIENCE, 2011, 22 (11) :1359-1366