Linear Cost-sensitive Max-margin Embedded Feature Selection for SVM

Cited by: 20
Authors
Aram, Khalid Y. [1]
Lam, Sarah S. [2]
Khasawneh, Mohammad T. [2]
Affiliations
[1] Emporia State Univ, Dept Business Adm, Emporia, KS 66801 USA
[2] SUNY Binghamton, Dept Syst Sci & Ind Engn, Binghamton, NY 13902 USA
Keywords
Classification; Cost-sensitive learning; Feature selection; Mathematical programming; Support vector machines; VECTOR; CLASSIFICATION; MACHINE; CANCER
DOI
10.1016/j.eswa.2022.116683
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The information needed for a certain machine application can often be obtained from a subset of the available features. Strongly relevant features should be retained to achieve desirable model performance. This research focuses on selecting relevant independent features for Support Vector Machine (SVM) classifiers in a cost-sensitive manner. A review of recent literature about feature selection for SVM revealed a lack of linear programming embedded SVM feature selection models. Most reviewed models were mixed-integer linear or nonlinear. Further, the review highlighted a lack of cost-sensitive SVM feature selection models. Cost sensitivity improves the generalization of SVM feature selection models, making them applicable to various cost-of-error situations. It also helps with handling imbalanced data. This research introduces an SVM-based filter method named Knapsack Max-Margin Feature Selection (KS-MMFS), which is a proposed linearization of the quadratic Max-Margin Feature Selection (MMFS) model. MMFS provides explicit estimates of feature importance in terms of relevance and redundancy. KS-MMFS was then used to develop a linear cost-sensitive SVM embedded feature selection model. The proposed model was tested on a group of 11 benchmark datasets and compared to relevant models from the literature. The results and analysis showed that different cost sensitivity (i.e., sensitivity-specificity tradeoff) requirements influence the features selected. The analysis demonstrated the competitive performance of the proposed model compared with relevant models. The model achieved an average improvement of 31.8% on classification performance with a 22.4% average reduction in solution time. The results and analysis in this research demonstrated the competitive performance of the proposed model as an efficient cost-sensitive embedded feature selection method.
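As a rough illustration of how cost sensitivity can enter a linear SVM feature selection objective, the sketch below evaluates a generic L1-norm SVM objective with class-dependent slack costs. It is not the KS-MMFS model from the abstract, whose formulation is not given here; the function names, the cost parameters `c_pos` and `c_neg`, and the toy data are all assumptions made for illustration.

```python
from typing import List, Sequence


def cost_sensitive_l1_svm_objective(
    w: Sequence[float],
    b: float,
    X: Sequence[Sequence[float]],
    y: Sequence[int],
    c_pos: float,
    c_neg: float,
) -> float:
    """Generic L1-norm linear SVM objective with class-dependent slack costs.

    Hypothetical helper: labels are +1 / -1, c_pos prices margin violations
    on positive samples and c_neg prices violations on negative samples.
    """
    # The L1 penalty is linear after the usual split into positive and
    # negative parts, and it pushes some weights to exactly zero.
    l1_penalty = sum(abs(wj) for wj in w)
    slack_cost = 0.0
    for xi, yi in zip(X, y):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        slack = max(0.0, 1.0 - margin)  # hinge slack for this sample
        slack_cost += (c_pos if yi == 1 else c_neg) * slack
    return l1_penalty + slack_cost


def selected_features(w: Sequence[float], tol: float = 1e-9) -> List[int]:
    """Indices of features with a nonzero weight, i.e. the selected ones."""
    return [j for j, wj in enumerate(w) if abs(wj) > tol]


# Toy data (assumed): two features, two samples per class.
X = [[2.0, 0.5], [0.5, 1.0], [-1.0, 0.3], [-2.0, -0.4]]
y = [1, 1, -1, -1]
w = [1.0, 0.0]  # candidate solution that uses only feature 0
b = 0.0

obj_balanced = cost_sensitive_l1_svm_objective(w, b, X, y, c_pos=1.0, c_neg=1.0)
obj_fn_heavy = cost_sensitive_l1_svm_objective(w, b, X, y, c_pos=5.0, c_neg=1.0)
print(obj_balanced, obj_fn_heavy, selected_features(w))
```

Raising `c_pos` relative to `c_neg` makes the same candidate weights more expensive whenever a positive sample violates the margin, so an optimizer minimizing this objective may prefer a different weight vector and hence a different feature subset. This is one way the sensitivity-specificity requirement can influence which features are selected.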
Pages: 11