A novel embedded min-max approach for feature selection in nonlinear Support Vector Machine classification

Cited by: 59
Authors
Jimenez-Cordero, Asuncion [1]
Miguel Morales, Juan [1]
Pineda, Salvador [1]
Affiliations
[1] Univ Malaga, OASYS Grp, Malaga, Spain
Funding
European Research Council
Keywords
Machine learning; Min-max optimization; Duality theory; Feature selection; Nonlinear Support Vector Machine classification; Variable selection; Kernel
DOI
10.1016/j.ejor.2020.12.009
Chinese Library Classification number
C93 [Management Science]
Discipline classification codes
12; 1201; 1202; 120202
Abstract
In recent years, feature selection has become a challenging problem in several machine learning fields, such as classification problems. Support Vector Machine (SVM) is a well-known technique applied in classification tasks. Various methodologies have been proposed in the literature to select the most relevant features in SVM. Unfortunately, all of them either deal with the feature selection problem in the linear classification setting or propose ad-hoc approaches that are difficult to implement in practice. In contrast, we propose an embedded feature selection method based on a min-max optimization problem, where a trade-off between model complexity and classification accuracy is sought. By leveraging duality theory, we equivalently reformulate the min-max problem and solve it without further ado using off-the-shelf software for nonlinear optimization. The efficiency and usefulness of our approach are tested on several benchmark data sets in terms of accuracy, number of selected features and interpretability. (C) 2020 Elsevier B.V. All rights reserved.
Pages: 24-35
Page count: 12