Multivariate Outlier Detection With High-Breakdown Estimators

Cited by: 85
Authors
Cerioli, Andrea [1]
Affiliations
[1] Univ Parma, Dipartimento Econ, Sez Stat & Informat, I-43100 Parma, Italy
Keywords
Minimum covariance determinant estimator; Multiple outliers; Reweighting; Robust distance; Size and power; ASYMPTOTICS; LOCATION
DOI
10.1198/jasa.2009.tm09147
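Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract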
In this paper we develop multivariate outlier tests based on the high-breakdown Minimum Covariance Determinant estimator. The rules that we propose have good performance under the null hypothesis of no outliers in the data and also appreciable power properties for the purpose of individual outlier detection. This achievement is made possible by two orders of improvement over the currently available methodology. First, we suggest an approximation to the exact distribution of robust distances from which cut-off values can be obtained even in small samples. Our thresholds are accurate, simple to implement, and result in more powerful outlier identification rules than those obtained by calibrating the asymptotic distribution of distances. The second power improvement comes from the addition of a new iteration step after one-step reweighting of the estimator. The proposed methodology is motivated by asymptotic distributional results. Its finite sample performance is evaluated through simulations and compared to that of available multivariate outlier tests.
Pages: 147-156
Number of pages: 10