A general decision theory for Huber's ε-contamination model

Cited by: 50
Authors
Chen, Mengjie [1]
Gao, Chao [2]
Ren, Zhao [3]
Affiliations
[1] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[2] Univ Chicago, Dept Stat, Chicago, IL 60637 USA
[3] Univ Pittsburgh, Dept Stat, Pittsburgh, PA 15260 USA
Keywords
Robust statistics; robust testing; minimax rate; density estimation; sparse linear regression; trace regression; ASYMPTOTIC EQUIVALENCE; MINIMAX RATES; WHITE-NOISE; CONVERGENCE; REGRESSION; RECOVERY; TESTS
DOI
10.1214/16-EJS1216
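Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract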
Today's data pose unprecedented challenges to statisticians. They may be incomplete, corrupted or exposed to some unknown source of contamination. We need new methods and theories to grapple with these challenges. Robust estimation is one of the revived fields with the potential to accommodate such complexity and glean useful information from modern datasets. Following our recent work on high-dimensional robust covariance matrix estimation, we establish a general decision theory for robust statistics under Huber's ε-contamination model. We propose a solution, based on the Scheffé estimate, to a robust two-point testing problem, which leads to the construction of robust estimators adaptive to the proportion of contamination. Applying the general theory, we construct robust estimators for nonparametric density estimation, sparse linear regression and low-rank trace regression. We show that these new estimators achieve the minimax rate with optimal dependence on the contamination proportion. The testing procedure, the Scheffé estimate, also enjoys an optimal rate in the exponent of the testing error, which may be of independent interest.
Pages: 3752-3774
Page count: 23