Mixed-Integer Linear Optimization Formulations for Feature Subset Selection in Kernel SVM Classification

Cited by: 0
Authors
Tamura, Ryuta [1 ,2 ]
Takano, Yuichi [3 ]
Miyashiro, Ryuhei [4 ]
Affiliations
[1] Tokyo Univ Agr & Technol, Grad Sch Engn, Koganei, Tokyo 1848588, Japan
[2] October Sky Co Ltd, Fuchu, Tokyo 1830055, Japan
[3] Univ Tsukuba, Inst Syst & Informat Engn, Tsukuba, Ibaraki 3058573, Japan
[4] Tokyo Univ Agr & Technol, Inst Engn, Koganei 1848588, Japan
Keywords
feature subset selection; support vector machine; mixed-integer optimization; kernel-target alignment; machine learning; LOGISTIC-REGRESSION; PARAMETERS
DOI
10.1587/transfun.2023EAP1043
CLC number
TP3 [Computing technology; computer technology]
Discipline code
0812
Abstract
We study the mixed-integer optimization (MIO) approach to feature subset selection in nonlinear kernel support vector machines (SVMs) for binary classification. To measure the performance of subset selection, we use the distance between two classes (DBTC) in a high-dimensional feature space based on the Gaussian kernel function. However, DBTC, which is to be maximized as an objective function, is nonlinear, nonconvex and nonconcave. Despite the difficulty of linearizing such a nonlinear function in general, our major contribution is a mixed-integer linear optimization (MILO) formulation that maximizes DBTC for feature subset selection; this MILO problem can be solved to optimality using optimization software. We also derive a reduced version of the MILO problem to accelerate our MILO computations. Experimental results show good computational efficiency for our MILO formulation with the reduced problem. Moreover, our method often outperforms the linear-SVM-based MILO formulation and recursive feature elimination in prediction performance, especially when there are relatively few data instances.
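To make the DBTC criterion in the abstract concrete, the sketch below evaluates the squared distance between the two class means in the Gaussian-kernel feature space via the kernel trick. This is only the evaluation of the objective for a given feature subset, not the paper's MILO formulation; the function names and the `gamma` parameter are illustrative assumptions.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    # Pairwise Gaussian kernel: k(x, y) = exp(-gamma * ||x - y||^2)
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)

def dbtc(X, y, gamma=1.0):
    """Squared distance between class-mean images in the kernel feature space.

    DBTC = mean(K++) - 2 * mean(K+-) + mean(K--), where K++ / K-- / K+-
    are the kernel matrices within and across the two classes (y = +1, -1).
    """
    Xp, Xm = X[y == 1], X[y == -1]
    Kpp = gaussian_kernel(Xp, Xp, gamma)
    Kmm = gaussian_kernel(Xm, Xm, gamma)
    Kpm = gaussian_kernel(Xp, Xm, gamma)
    return Kpp.mean() - 2.0 * Kpm.mean() + Kmm.mean()
```

Feature subset selection then amounts to choosing a 0-1 mask over columns and scoring `dbtc(X[:, mask], y)`; the paper's contribution is linearizing this nonconvex objective so the best mask can be found by a MILO solver rather than by enumeration or greedy search.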
Pages: 1151-1162 (12 pages)