A Comparative Study of Outlier Detection Procedures in Multiple Linear Regression

Cited by: 0
Authors
Ampanthong, Pimpan [1]
Suwattee, Prachoom [1]
Affiliations
[1] Natl Inst Dev Adm, Sch Appl Stat, Bangkok, Thailand
Source
IMECS 2009: INTERNATIONAL MULTI-CONFERENCE OF ENGINEERS AND COMPUTER SCIENTISTS, VOLS I AND II | 2009
Keywords
Multiple linear regression; Outliers; Outlier detection; Residuals; Multivariate location; High-breakdown; Robust
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Outlier detection methods in multiple linear regression are reviewed. Eight statistics for outlier detection are investigated and compared. Monte Carlo simulation shows that the Mahalanobis distance (MD_i) identifies the presence of outliers more often than the other statistics for small, medium, and large sample sizes with different percentages of outliers in the regressors, and in both the regressors and the dependent variable. The next best statistics for detection are the hat matrix diagonal (h_ii), Cook's squared distance (CD_i), and the DFFITS_i distance. For outliers in the dependent variable, Cook's squared distance (CD_i) and the PRESS residual (r_(i)) perform better than the others.
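To make the statistics named in the abstract concrete, the following is a minimal sketch of how five of them might be computed for an ordinary least squares fit. It uses common textbook definitions, which are assumed here and may differ in detail from those used in the paper; the function name `outlier_diagnostics` and the simulated data are hypothetical, and the remaining statistics of the eight studied are not covered because the abstract does not name them.

```python
import numpy as np

def outlier_diagnostics(X, y):
    """Per-observation diagnostics for an OLS fit of y on X (intercept added).

    Textbook definitions are assumed; the paper may define them differently.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    Xd = np.column_stack([np.ones(n), X])      # design matrix with intercept
    p = k + 1                                  # number of fitted coefficients

    XtX_inv = np.linalg.inv(Xd.T @ Xd)
    h = np.einsum("ij,jk,ik->i", Xd, XtX_inv, Xd)   # hat matrix diagonal h_ii
    beta = XtX_inv @ Xd.T @ y
    e = y - Xd @ beta                          # ordinary residuals
    s2 = e @ e / (n - p)                       # residual variance estimate

    r = e / np.sqrt(s2 * (1 - h))              # internally studentized residuals
    cook = r**2 * h / (p * (1 - h))            # Cook's squared distance CD_i
    press = e / (1 - h)                        # PRESS (deleted) residuals

    # Residual variance with observation i deleted, then DFFITS_i
    s2_del = ((n - p) * s2 - e**2 / (1 - h)) / (n - p - 1)
    t = e / np.sqrt(s2_del * (1 - h))          # externally studentized residuals
    dffits = t * np.sqrt(h / (1 - h))

    # Mahalanobis distance MD_i of each regressor row from the sample mean
    centered = X - X.mean(axis=0)
    S_inv = np.linalg.inv(np.atleast_2d(np.cov(X, rowvar=False)))
    md = np.sqrt(np.einsum("ij,jk,ik->i", centered, S_inv, centered))

    return {"md": md, "h": h, "cook": cook, "dffits": dffits, "press": press}

# Simulated data (illustrative only): observation 0 is planted as an
# outlier in both the regressors and the dependent variable.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=30)
X[0] = [8.0, 8.0]
y[0] += 5.0

diag = outlier_diagnostics(X, y)
print("largest MD_i at index", int(np.argmax(diag["md"])))
```

With an intercept in the model, the leverage and the Mahalanobis distance are linked by h_ii = 1/n + MD_i^2 / (n - 1), so the two statistics rank regressor outliers identically; they differ only in the cutoff applied to them.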
Pages: 704-709
Number of pages: 6