Mixed-Integer Linear Optimization Formulations for Feature Subset Selection in Kernel SVM Classification

Cited by: 0
Authors
Tamura, Ryuta [1 ,2 ]
Takano, Yuichi [3 ]
Miyashiro, Ryuhei [4 ]
Affiliations
[1] Tokyo Univ Agr & Technol, Grad Sch Engn, Koganei, Tokyo 1848588, Japan
[2] October Sky Co Ltd, Fuchu, Tokyo 1830055, Japan
[3] Univ Tsukuba, Inst Syst & Informat Engn, Tsukuba, Ibaraki 3058573, Japan
[4] Tokyo Univ Agr & Technol, Inst Engn, Koganei 1848588, Japan
Keywords
feature subset selection; support vector machine; mixed-integer optimization; kernel-target alignment; machine learning; LOGISTIC-REGRESSION; PARAMETERS
DOI
10.1587/transfun.2023EAP1043
CLC number
TP3 [Computing technology; computer technology]
Discipline code
0812
Abstract
We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification. To measure the performance of subset selection, we use the distance between two classes (DBTC) in a high-dimensional feature space based on the Gaussian kernel function. However, DBTC, which is to be maximized as an objective function, is nonlinear, nonconvex and nonconcave. Despite the difficulty of linearizing such a nonlinear function in general, our major contribution is a mixed-integer linear optimization (MILO) formulation that maximizes DBTC for feature subset selection; this MILO problem can be solved to optimality using optimization software. We also derive a reduced version of the MILO problem to accelerate our MILO computations. Experimental results show good computational efficiency for our MILO formulation with the reduced problem. Moreover, our method often outperforms the linear-SVM-based MILO formulation and recursive feature elimination in prediction performance, especially when there are relatively few data instances.
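To make the DBTC criterion in the abstract concrete, the sketch below evaluates the squared distance between the two class means in the Gaussian-kernel feature space via the kernel trick. This is only the evaluation of the objective for a given feature subset, not the paper's MILO formulation; the function names and the `gamma` parameter are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    # Pairwise Gaussian kernel: k(x, y) = exp(-gamma * ||x - y||^2)
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

def dbtc(X, y, gamma=1.0):
    """Squared distance between class-mean images in the kernel feature space.

    DBTC = mean(K++) - 2 * mean(K+-) + mean(K--), where K++ / K-- / K+-
    are the kernel matrices within and across the two classes (y = +1, -1).
    """
    Xp, Xm = X[y == 1], X[y == -1]
    Kpp = gaussian_kernel(Xp, Xp, gamma)
    Kmm = gaussian_kernel(Xm, Xm, gamma)
    Kpm = gaussian_kernel(Xp, Xm, gamma)
    return Kpp.mean() - 2.0 * Kpm.mean() + Kmm.mean()
```

Feature subset selection then amounts to choosing a 0-1 mask over columns and scoring `dbtc(X[:, mask], y)`; the paper's contribution is linearizing this nonconvex objective so the best mask can be found by a MILO solver rather than by enumeration or greedy search.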
Pages: 1151-1162 (12 pages)