A Mixed-Integer Fractional Optimization Approach to Best Subset Selection

Cited by: 14
Authors
Gomez, Andres [1 ]
Prokopyev, Oleg A. [2 ]
Affiliations
[1] Univ Southern Calif, Viterbi Sch Engn, Dept Ind & Syst Engn, Los Angeles, CA 90089 USA
[2] Univ Pittsburgh, Swanson Sch Engn, Dept Ind Engn, Pittsburgh, PA 15261 USA
Funding
U.S. National Science Foundation;
Keywords
fractional optimization; conic optimization; sparse regression; information criteria; Akaike's information criterion; programming formulations; regression; minimization; shrinkage; sparsity
DOI
10.1287/ijoc.2020.1031
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Discipline Classification Codes
081203; 0835;
Abstract
We consider the best subset selection problem in linear regression, that is, finding a parsimonious subset of the regression variables that provides the best fit to the data according to some predefined criterion. We are primarily concerned with alternatives to cross-validation methods that do not require data partitioning and instead rely on a range of information criteria extensively studied in the statistical literature. We show that the problem of interest can be modeled using fractional mixed-integer optimization, which can be tackled by leveraging recent advances in modern optimization solvers. The proposed algorithms involve solving a sequence of mixed-integer quadratic optimization problems (or their convexifications) and can be implemented with off-the-shelf solvers. We report encouraging results in our computational experiments, with respect to both optimization and statistical performance.

Summary of Contribution: This paper considers feature selection problems with information criteria. We show that by adopting a fractional optimization perspective (a well-known field in nonlinear optimization and operations research), it is possible to leverage recent advances in mixed-integer quadratic optimization technology to tackle traditional statistical problems long considered intractable. We present extensive computational experiments, with both synthetic and real data, illustrating that the new fractional optimization approach is orders of magnitude faster than existing approaches in the literature.
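To make the fractional structure concrete, the following is a generic sketch (not necessarily the exact formulation or algorithm used in the paper) of how an information-criterion objective becomes a ratio of functions of the selected support and, via a classical Dinkelbach-type parametrization, a sequence of mixed-integer quadratic subproblems. For instance, maximizing the adjusted R^2 over supports of the regression coefficients amounts to minimizing the ratio

\[
\min_{\beta,\, z}\ \frac{\lVert y - X\beta \rVert_2^2}{c\,\bigl(n - \sum_{i=1}^{p} z_i - 1\bigr)}
\quad \text{s.t.}\quad \lvert \beta_i \rvert \le M z_i,\quad z_i \in \{0,1\},\ i=1,\dots,p,
\]

where c = \lVert y - \bar{y}\mathbf{1} \rVert_2^2/(n-1) is the constant sample variance of the response and M is a big-M bound introduced here purely for illustration. A Dinkelbach-style step fixes the current ratio estimate \lambda_t and solves the parametric subproblem

\[
\min_{\beta,\, z}\ \lVert y - X\beta \rVert_2^2 \;-\; \lambda_t\, c\,\Bigl(n - \sum_{i=1}^{p} z_i - 1\Bigr)
\quad \text{s.t.}\quad \lvert \beta_i \rvert \le M z_i,\quad z_i \in \{0,1\},
\]

which is a mixed-integer quadratic program; \lambda_{t+1} is then set to the value of the ratio at the subproblem optimum, and the iterations stop once the optimal subproblem value is (approximately) zero. This generic reduction is what lets off-the-shelf mixed-integer quadratic solvers be reused for a fractional criterion, in the spirit of the approach described in the abstract.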
Pages: 551-565
Number of pages: 15