SLOPE MEETS LASSO: IMPROVED ORACLE BOUNDS AND OPTIMALITY

Cited by: 100
Authors
Bellec, Pierre C. [1,2,4]
Lecue, Guillaume [1,2,3]
Tsybakov, Alexandre B. [1,2]
Affiliations
[1] ENSAE, 3 Ave Pierre Larousse, F-92240 Malakoff, France
[2] CREST UMR CNRS 9194, Palaiseau, France
[3] CNRS, Paris, France
[4] Rutgers State Univ, Dept Stat & Biostat, Busch Campus, Piscataway, NJ 08854 USA
Keywords
Sparse linear regression; minimax rates; high-dimensional statistics; Slope; Lasso; Dantzig selector; optimal rates; minimax; regression; recovery; sparsity
DOI
10.1214/17-AOS1670
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
We show that two polynomial-time methods, a Lasso estimator with an adaptively chosen tuning parameter and a Slope estimator, adaptively achieve the minimax prediction and ℓ_2 estimation rate (s/n) log(p/s) in high-dimensional linear regression on the class of s-sparse vectors in R^p. This is done under the Restricted Eigenvalue (RE) condition for the Lasso and under a slightly more constraining assumption on the design for the Slope. The main results take the form of sharp oracle inequalities accounting for the model misspecification error. Minimax optimal bounds are also obtained for the ℓ_q estimation errors with 1 ≤ q ≤ 2 when the model is well specified. The results are nonasymptotic and hold both in probability and in expectation. The assumptions that we impose on the design are satisfied with high probability for a large class of random matrices with independent and possibly anisotropically distributed rows. We give a comparative analysis of the conditions under which oracle bounds for the Lasso and Slope estimators can be obtained. In particular, we show that several known conditions, such as the RE condition and the sparse eigenvalue condition, are equivalent if the ℓ_2-norms of the regressors are uniformly bounded.
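To make the scaling in the abstract concrete, the following Python sketch is purely illustrative and is not the authors' adaptive procedure: it fits scikit-learn's Lasso with a tuning parameter of the order sigma * sqrt(log(2p/s)/n), the scaling under which the (s/n) log(p/s) rate above is attained, and compares the resulting squared ℓ_2 error to that rate. The dimensions, the signal, and the constant in front of the tuning parameter are arbitrary choices made for illustration.

```python
# Illustrative sketch only; not the paper's adaptive tuning procedure.
# It fits a Lasso with lambda of order sigma * sqrt(log(2p/s)/n) and
# compares the squared l2 error to the rate (s/n) * log(2p/s).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s, sigma = 200, 1000, 10, 1.0   # sample size, dimension, sparsity, noise level

# s-sparse target vector beta* in R^p
beta_star = np.zeros(p)
beta_star[:s] = 1.0

# Design with i.i.d. standard Gaussian rows (isotropic; the paper also allows
# anisotropically distributed rows) and Gaussian noise
X = rng.standard_normal((n, p))
y = X @ beta_star + sigma * rng.standard_normal(n)

# Tuning parameter of order sigma * sqrt(log(2p/s)/n); the constant 2.0 is ad hoc.
# (The Slope alternative replaces the single lambda by decreasing weights of the
# order sqrt(log(2p/j)/n) applied to the sorted |beta_j|'s; not implemented here.)
lam = 2.0 * sigma * np.sqrt(np.log(2 * p / s) / n)

# scikit-learn's Lasso minimizes (1/(2n)) * ||y - X b||^2 + alpha * ||b||_1
model = Lasso(alpha=lam, fit_intercept=False).fit(X, y)

sq_l2_error = np.sum((model.coef_ - beta_star) ** 2)   # squared l2 estimation error
rate = (s / n) * np.log(2 * p / s)                      # minimax rate, up to constants
print(f"squared l2 error: {sq_l2_error:.3f}   (s/n)log(2p/s): {rate:.3f}")
```

The sketch uses the true sparsity s only to make the scaling of the tuning parameter explicit; the point of the paper is that the same rate is reached adaptively, without knowledge of s, either through a data-driven choice of the Lasso tuning parameter or through the Slope estimator.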
Pages: 3603-3642 (40 pages)