Sparse Poisson regression via mixed-integer optimization

Cited by: 4
Authors
Saishu, Hiroki [1]
Kudo, Kota [1]
Takano, Yuichi [2]
Affiliations
[1] Univ Tsukuba, Grad Sch Sci & Technol, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Fac Engn Informat & Syst, Tsukuba, Ibaraki, Japan
Keywords
FEATURE SUBSET-SELECTION; VARIABLE SELECTION; LOGISTIC-REGRESSION; FORMULATIONS; MODELS;
DOI
10.1371/journal.pone.0249916
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline classification codes
07; 0710; 09
Abstract
We present a mixed-integer optimization (MIO) approach to sparse Poisson regression. The MIO approach to sparse linear regression was first proposed in the 1970s, but has recently received renewed attention due to advances in optimization algorithms and computer hardware. In contrast to many sparse estimation algorithms, the MIO approach has the advantage of finding the best subset of explanatory variables with respect to various criterion functions. In this paper, we focus on a sparse Poisson regression that maximizes the weighted sum of the log-likelihood function and the L2-regularization term. For this problem, we derive a mixed-integer quadratic optimization (MIQO) formulation by applying a piecewise-linear approximation to the log-likelihood function. Optimization software can solve this MIQO problem to optimality. Moreover, we propose two methods for selecting a limited number of tangent lines effective for piecewise-linear approximations. We assess the efficacy of our method through computational experiments using synthetic and real-world datasets. Our methods provide better log-likelihood values than do conventional greedy algorithms in selecting tangent lines. In addition, our MIQO formulation delivers better out-of-sample prediction performance than do forward stepwise selection and L1-regularized estimation, especially in low-noise situations.
Pages: 17
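
The abstract describes the formulation only at a high level. The following is a minimal, illustrative Python sketch of how such a big-M MIQO model could be set up, assuming Gurobi's gurobipy interface is available. The function name sparse_poisson_miqo, the uniform grid of tangency points, the big-M constant, and the penalty weight lam are illustrative assumptions, not the authors' exact formulation or their tangent-line selection methods.

# Illustrative sketch only: a big-M MIQO model for sparse Poisson regression with an
# L2 penalty, where the convex term exp(z) is replaced by tangent-line cuts.
# Assumes gurobipy (Gurobi) is installed and licensed.
import numpy as np
import gurobipy as gp
from gurobipy import GRB


def sparse_poisson_miqo(X, y, k, lam=1.0, tangent_points=None, big_m=10.0):
    """Select at most k features by (approximately) maximizing the Poisson
    log-likelihood minus lam * ||beta||^2, with exp(.) piecewise-linearized."""
    n, p = X.shape
    if tangent_points is None:
        # Hypothetical uniform grid of tangency points; the paper instead proposes
        # methods for choosing a small number of effective tangent lines.
        tangent_points = np.linspace(-3.0, 3.0, 13)

    m = gp.Model("sparse_poisson")
    beta = m.addVars(p, lb=-big_m, ub=big_m, name="beta")  # regression coefficients
    z = m.addVars(n, lb=-GRB.INFINITY, name="z")           # linear predictors x_i' beta
    t = m.addVars(n, lb=-GRB.INFINITY, name="t")           # surrogate for exp(z_i)
    v = m.addVars(p, vtype=GRB.BINARY, name="v")           # feature-selection indicators

    # Define the linear predictors.
    for i in range(n):
        m.addConstr(z[i] == gp.quicksum(float(X[i, j]) * beta[j] for j in range(p)))

    # Tangent-line cuts: the tangent to exp at u is exp(u)*z + exp(u)*(1 - u), so
    # these constraints force t_i to lie above every tangent, making t_i a
    # piecewise-linear underestimator of exp(z_i).
    for u in tangent_points:
        a = float(np.exp(u))
        b = float(np.exp(u) * (1.0 - u))
        for i in range(n):
            m.addConstr(t[i] >= a * z[i] + b)

    # Big-M sparsity: beta_j may be nonzero only if v_j = 1, and at most k features.
    for j in range(p):
        m.addConstr(beta[j] <= big_m * v[j])
        m.addConstr(beta[j] >= -big_m * v[j])
    m.addConstr(gp.quicksum(v[j] for j in range(p)) <= k)

    # Approximate penalized log-likelihood (constant -log(y_i!) terms are dropped).
    obj = gp.quicksum(float(y[i]) * z[i] - t[i] for i in range(n)) \
        - lam * gp.quicksum(beta[j] * beta[j] for j in range(p))
    m.setObjective(obj, GRB.MAXIMIZE)
    m.optimize()
    return np.array([beta[j].X for j in range(p)])

Because each t_i is bounded only from below by tangents to exp, the surrogate objective overestimates the true log-likelihood; adding more, or better-placed, tangency points tightens the approximation at the cost of a larger model, which is the trade-off the paper's two tangent-line selection methods address.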