Sparse Partially Linear Additive Models

Cited by: 43
Authors
Lou, Yin [1 ]
Bien, Jacob [2 ,3 ]
Caruana, Rich [4 ]
Gehrke, Johannes [1 ]
Affiliations
[1] Cornell Univ, Dept Comp Sci, Ithaca, NY 14850 USA
[2] Cornell Univ, Dept Biol Stat & Computat Biol, Ithaca, NY 14850 USA
[3] Cornell Univ, Dept Stat Sci, Ithaca, NY 14850 USA
[4] Microsoft Corp, Microsoft Res, Redmond, WA 98052 USA
Keywords
Classification; Generalized partially linear additive models; Group lasso; Regression; Sparsity; Selection; Shrinkage
DOI
10.1080/10618600.2015.1089775
CLC Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Classification Codes
020208; 070103; 0714
Abstract
The generalized partially linear additive model (GPLAM) is a flexible and interpretable approach to building predictive models. It combines features in an additive manner, allowing each to have either a linear or nonlinear effect on the response. However, the choice of which features to treat as linear or nonlinear is typically assumed known. Thus, to make a GPLAM a viable approach in situations in which little is known a priori about the features, one must overcome two primary model selection challenges: deciding which features to include in the model and determining which of these features to treat nonlinearly. We introduce the sparse partially linear additive model (SPLAM), which combines model fitting and both of these model selection challenges into a single convex optimization problem. SPLAM provides a bridge between the lasso and sparse additive models. Through a statistical oracle inequality and thorough simulation, we demonstrate that SPLAM can outperform other methods across a broad spectrum of statistical regimes, including the high-dimensional (p >> N) setting. We develop efficient algorithms that are applied to real datasets with half a million samples and over 45,000 features with excellent predictive performance. Supplementary materials for this article are available online.
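As a rough sketch of the single convex optimization problem described above (the notation below is assumed for illustration, not taken verbatim from the paper): expand each feature $x_j$ in a $K$-dimensional basis $\Psi_j$ whose first column is $x_j$ itself, then fit all coefficients under a hierarchical group-lasso penalty,

\[
\min_{\beta}\ \frac{1}{2n}\Bigl\|\,y-\sum_{j=1}^{p}\Psi_j\beta_j\Bigr\|_2^2
\;+\;\lambda\sum_{j=1}^{p}\Bigl(\|\beta_j\|_2+\gamma\,\|\beta_{j,2:K}\|_2\Bigr),
\]

so that $\beta_j=0$ excludes feature $j$ entirely, $\beta_{j,2:K}=0$ with $\beta_{j,1}\neq 0$ yields a purely linear effect, and an unrestricted $\beta_j$ yields a nonlinear effect. The nested penalty enforces the hierarchy (a nonzero nonlinear part implies the feature is in the model), which is how exclusion, linearity, and nonlinearity are decided jointly within one convex problem; $\lambda$, $\gamma$, and $K$ here are illustrative tuning parameters.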
Pages: 1026-1040
Page count: 15