Tuning parameter calibration for l1-regularized logistic regression

Cited by: 10
Authors
Li, Wei [1]
Lederer, Johannes [2, 3]
Affiliations
[1] Peking Univ, Sch Math Sci, Beijing, Peoples R China
[2] Univ Washington, Dept Stat, Seattle, WA 98195 USA
[3] Univ Washington, Dept Biostat, Seattle, WA 98195 USA
Keywords
Feature selection; Penalized logistic regression; Tuning parameter calibration; Variable selection; Model selection; Lasso; Classification; Prediction
DOI
10.1016/j.jspi.2019.01.006
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Feature selection is a standard approach to understanding and modeling high-dimensional classification data, but the corresponding statistical methods hinge on tuning parameters that are difficult to calibrate. In particular, existing calibration schemes in the logistic regression framework lack any finite sample guarantees. In this paper, we introduce a novel calibration scheme for l1-penalized logistic regression. It is based on simple tests along the tuning parameter path and is equipped with optimal guarantees for feature selection. It is also amenable to easy and efficient implementations, and it rivals or outmatches existing methods in simulations and real data applications. © 2019 Elsevier B.V. All rights reserved.
Pages: 80-98
Page count: 19
Related papers
34 in total
[1]  
[Anonymous], 1973, P 2 INT S INF THEOR, DOI [10.1007/978-1-4612-1694-0, 10.1007/978-1-4612-0919-5_38]
[2]  
Bühlmann P, 2011, SPRINGER SER STAT, P1, DOI 10.1007/978-3-642-20192-9
[3]   Honest variable selection in linear and logistic regression models via l1 and l1 + l2 penalization [J].
Bunea, Florentina .
ELECTRONIC JOURNAL OF STATISTICS, 2008, 2 :1153-1194
[4]   EXTENDED BIC FOR SMALL-n-LARGE-P SPARSE GLM [J].
Chen, Jiahua ;
Chen, Zehua .
STATISTICA SINICA, 2012, 22 (02) :555-574
[5]  
Chichignoud M, 2016, J MACH LEARN RES, V17
[6]   On the prediction performance of the Lasso [J].
Dalalyan, Arnak S. ;
Hebiri, Mohamed ;
Lederer, Johannes .
BERNOULLI, 2017, 23 (01) :552-581
[7]   Comparison of discrimination methods for the classification of tumors using gene expression data [J].
Dudoit, S ;
Fridlyand, J ;
Speed, TP .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2002, 97 (457) :77-87
[8]   Variable selection via nonconcave penalized likelihood and its oracle properties [J].
Fan, JQ ;
Li, RZ .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) :1348-1360
[9]   Tuning parameter selection in high dimensional penalized likelihood [J].
Fan, Yingying ;
Tang, Cheng Yong .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2013, 75 (03) :531-552
[10]  
Friedman J., 2016, glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models