Linear Cost-sensitive Max-margin Embedded Feature Selection for SVM

Cited: 15
Authors
Aram, Khalid Y. [1 ]
Lam, Sarah S. [2 ]
Khasawneh, Mohammad T. [2 ]
Affiliations
[1] Emporia State Univ, Dept Business Adm, Emporia, KS 66801 USA
[2] SUNY Binghamton, Dept Syst Sci & Ind Engn, Binghamton, NY 13902 USA
Keywords
Classification; Cost-sensitive learning; Feature selection; Mathematical programming; Support vector machines
DOI
10.1016/j.eswa.2022.116683
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
The information needed for a given machine learning application can often be obtained from a subset of the available features, and strongly relevant features should be retained to achieve desirable model performance. This research focuses on selecting relevant independent features for Support Vector Machine (SVM) classifiers in a cost-sensitive manner. A review of recent literature on feature selection for SVM revealed a lack of linear programming embedded SVM feature selection models; most reviewed models were mixed-integer linear or nonlinear. The review also highlighted a lack of cost-sensitive SVM feature selection models. Cost sensitivity improves the generalization of SVM feature selection models, making them applicable to various cost-of-error situations, and it also helps with handling imbalanced data. This research introduces an SVM-based filter method named Knapsack Max-Margin Feature Selection (KS-MMFS), a proposed linearization of the quadratic Max-Margin Feature Selection (MMFS) model. MMFS provides explicit estimates of feature importance in terms of relevance and redundancy. KS-MMFS was then used to develop a linear cost-sensitive SVM embedded feature selection model. The proposed model was tested on a group of 11 benchmark datasets and compared to relevant models from the literature. The results show that different cost-sensitivity (i.e., sensitivity-specificity tradeoff) requirements influence which features are selected. Compared with relevant models, the proposed model achieved an average improvement of 31.8% in classification performance with a 22.4% average reduction in solution time, demonstrating that it is an efficient and competitive cost-sensitive embedded feature selection method.
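The abstract combines two ideas: class-dependent misclassification costs (cost sensitivity) and embedded feature selection via a sparse linear SVM. The sketch below is a minimal, hypothetical illustration of that general idea only, not the paper's KS-MMFS formulation: a cost-sensitive, L1-regularized linear SVM trained by subgradient descent on toy data, where features with non-negligible weights are treated as selected. All names, parameter values, and data here are invented for illustration.

```python
# Hypothetical sketch: cost-sensitive L1-regularized linear SVM via batch
# subgradient descent. Objective: sum_i c_i * hinge(y_i, w.x_i + b) + lam*||w||_1,
# where c_i depends on the class of sample i (the cost-sensitivity knob).
def train_cost_sensitive_l1_svm(X, y, cost_pos=2.0, cost_neg=1.0,
                                lam=0.05, lr=0.01, epochs=500):
    n_feat = len(X[0])
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        gw = [0.0] * n_feat
        gb = 0.0
        for xi, yi in zip(X, y):
            c = cost_pos if yi > 0 else cost_neg   # class-dependent error cost
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1.0:                       # hinge-loss subgradient
                for j in range(n_feat):
                    gw[j] -= c * yi * xi[j]
                gb -= c * yi
        for j in range(n_feat):
            # L1 subgradient shrinks irrelevant weights toward zero,
            # which is what makes the selection "embedded" in training.
            gw[j] += lam * (1.0 if w[j] > 0 else -1.0 if w[j] < 0 else 0.0)
            w[j] -= lr * gw[j]
        b -= lr * gb
    return w, b

# Toy data: feature 0 separates the classes, feature 1 is noise.
X = [(2.0, 0.5), (1.8, -0.7), (2.2, 0.1), (1.9, 0.9),
     (-2.0, 0.4), (-1.7, -0.6), (-2.1, 0.2), (-1.9, 0.8)]
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_cost_sensitive_l1_svm(X, y)
selected = [j for j in range(len(w)) if abs(w[j]) > 0.1]
print("weights:", [round(wj, 3) for wj in w], "selected features:", selected)
```

Raising `cost_pos` relative to `cost_neg` penalizes false negatives more heavily, shifting the decision boundary toward higher sensitivity; this is one simple way the sensitivity-specificity tradeoff mentioned in the abstract can enter the training objective.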
Pages: 11