Outlier Exclusion Procedures Must Be Blind to the Researcher's Hypothesis

Cited by: 24
Authors
Andre, Quentin [1]
Affiliation
[1] Univ Colorado, Leeds Sch Business, 995 Regent Dr, Boulder, CO 80302 USA
Keywords
outliers; false-positive; methodology; statistical analysis; boxplot
DOI
10.1037/xge0001069
Chinese Library Classification (CLC) number
B84 [Psychology]
Discipline classification code
04; 0402
Abstract
When researchers choose to identify and exclude outliers from their data, should they do so across all the data, or within experimental conditions? A survey of recent papers published in the Journal of Experimental Psychology: General shows that both methods are widely used, and common data visualization techniques suggest that outliers should be excluded at the condition level. However, I highlight in the present paper that removing outliers by condition runs against the logic of hypothesis testing, and that this practice leads to unacceptable increases in false-positive rates. I demonstrate that this conclusion holds true across a variety of statistical tests, exclusion criteria and cutoffs, sample sizes, and data types, and show in simulated experiments and in a reanalysis of existing data that by-condition exclusions can result in false-positive rates as high as 43%. I finally demonstrate that by-condition exclusions are a specific case of a more general issue: Any outlier exclusion procedure that is not blind to the hypothesis that researchers want to test may result in inflated Type I errors. I conclude by offering best practices and recommendations for excluding outliers.
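The contrast between the two exclusion strategies can be illustrated with a small simulation. The sketch below is not the paper's code: it assumes two conditions drawn from the same standard normal population (so every significant result is a false positive), a hypothetical cutoff of 2 standard deviations, and a normal approximation to the Welch t-test. All function names and parameter values are illustrative choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def mean_sd(xs):
    """Return the sample mean and the (n - 1) standard deviation of xs."""
    m = fmean(xs)
    return m, sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


def trim(xs, center, sd, cutoff):
    """Keep only the values within `cutoff` standard deviations of `center`."""
    return [x for x in xs if abs(x - center) <= cutoff * sd]


def p_value(a, b):
    """Two-sided p-value for a difference in means.

    Uses the Welch statistic with a normal approximation to the
    t distribution (an assumption that is reasonable for large samples).
    """
    ma, sa = mean_sd(a)
    mb, sb = mean_sd(b)
    z = (ma - mb) / sqrt(sa ** 2 / len(a) + sb ** 2 / len(b))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rates(n_sims=3000, n=100, cutoff=2.0, alpha=0.05, seed=1):
    """Estimate false-positive rates under three exclusion strategies."""
    rng = random.Random(seed)
    hits = {"none": 0, "across": 0, "by_condition": 0}
    for _ in range(n_sims):
        # Both conditions share one population, so the null hypothesis is true.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        # Across the data: one mean and SD computed on the pooled sample,
        # without looking at condition labels.
        m, s = mean_sd(a + b)
        samples = {
            "none": (a, b),
            "across": (trim(a, m, s, cutoff), trim(b, m, s, cutoff)),
            # By condition: each group is trimmed around its own mean and SD.
            "by_condition": (
                trim(a, *mean_sd(a), cutoff),
                trim(b, *mean_sd(b), cutoff),
            ),
        }
        for name, (x, y) in samples.items():
            hits[name] += p_value(x, y) < alpha
    return {name: count / n_sims for name, count in hits.items()}


rates = false_positive_rates()
for name, rate in rates.items():
    print(f"{name}: {rate:.3f}")
```

Under these assumptions, the by-condition rate should come out above the across-the-data rate, which should stay near the nominal 5%. The size of the gap depends on the cutoff, the sample size, and the data distribution, so this sketch is not expected to reproduce the 43% figure reported in the abstract.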
Pages: 213-223
Page count: 11