Error rates for multivariate outlier detection

Cited by: 36
Authors
Cerioli, Andrea [1]
Farcomeni, Alessio [2]
Affiliations
[1] Univ Parma, I-43100 Parma, Italy
[2] Sapienza Univ Rome, I-00186 Rome, Italy
Keywords
False discovery rate; False discovery exceedance; Multiple outliers; Reweighted MCD; Masking and swamping; Identification; Tests
DOI
10.1016/j.csda.2010.05.021
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
Multivariate outlier identification requires reliable cut-off points for the robust distances that measure the discrepancy from the fit provided by high-breakdown estimators of location and scatter. Multiplicity issues affect the choice of appropriate cut-off points. The paper describes how a careful choice of the error rate controlled during outlier detection can yield a good compromise between high power and low swamping when alternatives to the Family-Wise Error Rate (FWER) are considered. Multivariate outlier detection rules based on the False Discovery Rate (FDR) and False Discovery Exceedance (FDX) criteria are proposed. The properties of these rules are evaluated through simulation, and the rules are then applied to real data examples. The conclusion is that the proposed approach provides a sensible strategy in many situations of practical interest. © 2010 Elsevier B.V. All rights reserved.
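To make the contrast between error rates concrete, the following minimal sketch compares a Bonferroni cut-off (which controls the Family Wise Error Rate) with the Benjamini-Hochberg step-up rule (which controls the False Discovery Rate under independence) on a set of per-observation p-values. It is an illustration only, not the rules proposed in the paper: the p-values are hypothetical, the function names are made up for this example, and the step of turning robust distances into p-values is assumed to have been done already.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Flag p-values at or below alpha / n (controls the FWER)."""
    n = len(pvalues)
    return [p <= alpha / n for p in pvalues]


def benjamini_hochberg_reject(pvalues, alpha=0.05):
    """Flag p-values with the Benjamini-Hochberg step-up rule (controls the FDR)."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) such that p_(k) <= k * alpha / n.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / n:
            k_max = rank
    # Reject the k_max smallest p-values, even if some fail their own threshold.
    reject = [False] * n
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Hypothetical p-values, one per observation (assumed, not from the paper).
pvalues = [0.001, 0.008, 0.012, 0.018, 0.30, 0.45, 0.52, 0.61, 0.78, 0.95]
fwer_flags = bonferroni_reject(pvalues)
fdr_flags = benjamini_hochberg_reject(pvalues)
print(sum(fwer_flags), sum(fdr_flags))  # prints "1 4"
```

On these made-up values the Bonferroni rule flags one observation while the False Discovery Rate rule flags four, which illustrates the gain in power that motivates looking beyond the Family Wise Error Rate; the price is a higher tolerance for swamping.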
Pages: 544-553
Page count: 10