One-Step Estimator Paths for Concave Regularization

Cited by: 12
Author
Taddy, Matt [1 ,2 ]
Affiliations
[1] Microsoft Res New England, Chicago, IL 60601 USA
[2] Univ Chicago, Booth Sch Business, Chicago, IL 60637 USA
Keywords
Sparse regression; High dimensional statistics; Massive datasets; Bayesian regression; Penalized estimation; NONCONCAVE PENALIZED LIKELIHOOD; MULTISTAGE CONVEX RELAXATION; VARIABLE SELECTION; COORDINATE DESCENT; ORACLE PROPERTIES; ADAPTIVE LASSO; REGRESSION; MODELS; SHRINKAGE; DIMENSION;
DOI
10.1080/10618600.2016.1211532
Chinese Library Classification (CLC)
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline codes
020208 ; 070103 ; 0714 ;
Abstract
The statistics literature of the past 15 years has established many favorable properties for sparse diminishing-bias regularization: techniques that can roughly be understood as providing estimation under penalty functions spanning the range of concavity between ℓ0 and ℓ1 norms. However, lasso ℓ1-regularized estimation remains the standard tool for industrial Big Data applications because of its minimal computational cost and the presence of easy-to-apply rules for penalty selection. In response, this article proposes a simple new algorithm framework that requires no more computation than a lasso path: the path of one-step estimators (POSE) does ℓ1 penalized regression estimation on a grid of decreasing penalties, but adapts coefficient-specific weights to decrease as a function of the coefficient estimated in the previous path step. This provides sparse diminishing-bias regularization at no extra cost over the fastest lasso algorithms. Moreover, our gamma lasso implementation of POSE is accompanied by a reliable heuristic for the fit degrees of freedom, so that standard information criteria can be applied in penalty selection. We also provide novel results on the distance between weighted-ℓ1 and ℓ0 penalized predictors; this allows us to build intuition about POSE and other diminishing-bias regularization schemes. The methods and results are illustrated in extensive simulations and in application of logistic regression to evaluating the performance of hockey players. Supplementary materials for this article are available online.
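The abstract's core idea can be sketched in a few lines of NumPy: run weighted-ℓ1 (lasso) coordinate descent over a decreasing penalty grid, and at each grid point set coefficient-specific weights that shrink as the previous step's coefficient estimates grow. The weight form `1 / (1 + gamma * |b|)` below is an assumed log-penalty-style choice in the spirit of the gamma lasso, not Taddy's exact implementation; function names and the penalty grid are illustrative.

```python
import numpy as np

def weighted_lasso_cd(X, y, lam, w, beta0, n_iter=200):
    """Coordinate descent for 0.5*||y - X b||^2 / n + lam * sum_j w_j |b_j|."""
    n, p = X.shape
    beta = beta0.copy()
    col_ss = (X ** 2).sum(axis=0) / n  # per-coordinate curvature
    for _ in range(n_iter):
        for j in range(p):
            # partial residual with coordinate j removed
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j / n
            # soft-threshold at the weighted penalty level
            beta[j] = np.sign(rho) * max(abs(rho) - lam * w[j], 0.0) / col_ss[j]
    return beta

def pose_path(X, y, lams, gamma=1.0):
    """Path of one-step estimators (sketch): each penalty level reuses the
    previous step's fit both as a warm start and to set diminishing-bias
    weights w_j = 1 / (1 + gamma * |b_j|) (an assumed weighting form)."""
    p = X.shape[1]
    beta = np.zeros(p)
    path = []
    for lam in lams:  # decreasing penalty grid
        w = 1.0 / (1.0 + gamma * np.abs(beta))
        beta = weighted_lasso_cd(X, y, lam, w, beta)
        path.append(beta.copy())
    return np.array(path)
```

With `gamma = 0` every weight is 1 and the path reduces to an ordinary lasso path; larger `gamma` relaxes the penalty on coefficients that were large at the previous step, which is what yields the diminishing-bias behavior at essentially lasso cost.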
Pages: 525-536
Page count: 12