Robust covariance matrix estimation and identification of unusual data points: New tools

Cited by: 3
Authors
Garciga, Christian [1]
Verbrugge, Randal [1, 2]
Affiliations
[1] Fed Reserve Bank Cleveland, 1455 E 6th St, Cleveland, OH 44114 USA
[2] NBER, CRIW, 1455 E 6th St, Cleveland, OH 44114 USA
Keywords
Outlier identification; Fragility; Robust estimation; detMCD; RMVN; VARIANCE-ESTIMATION NNVE; FAST-FOOD INDUSTRY; OUTLIER DETECTION; MINIMUM-WAGES; NEW-JERSEY; REGRESSION; EMPLOYMENT; PENNSYLVANIA; EFFICIENCY; ALGORITHM;
DOI
10.1016/j.rie.2021.03.001
Chinese Library Classification number
F [Economics];
Discipline classification code
02;
Abstract
Most consistent estimators are prone to total breakdown in the presence of a handful of unusual data points (UDPs). This compromises inference. Robust estimation is a (seldom-used) solution, but the methods commonly used in applied research have severe drawbacks. In this paper, building upon methods that are relatively unknown outside of the robust statistics literature, we provide an enhanced tool for robust estimates of mean and covariance, useful both for robust estimation and for detection of unusual data points. It is relatively fast and useful for large data sets. We also provide a new robust cluster method, which is an input to our broader method but is also useful for standalone UDP detection or cluster analysis. We provide a comparative study of numerous methods that is not available in the current literature. Testing indicates that our method performs on par with, and often better than, two of the best currently available methods. We also demonstrate that the issues we discuss are not merely hypothetical, by applying our tools to real-world data and by re-examining two prominent economic studies. Our methods reveal that their central results are driven by a set of unusual points. © 2021 University of Venice. Published by Elsevier Ltd. All rights reserved.
Pages: 176-202
Number of pages: 27