Constrained class-wise feature selection (CCFS)

Cited by: 2
Authors
Hussain, Syed Fawad [1 ,2 ]
Shahzadi, Fatima [1 ,2 ]
Munir, Badre [1 ]
Affiliations
[1] GIK Inst Engn Sci & Technol, Topi 23460, Khyber Pakhtunk, Pakistan
[2] GIK Inst, Machine Learning & Data Sci Lab MDS, Topi, Pakistan
Keywords
Feature selection; Information theory; Classification; Class-wise feature selection; MUTUAL INFORMATION; TEXT CLASSIFICATION; MACHINE;
DOI
10.1007/s13042-022-01589-5
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection plays a vital role as a preprocessing step for high-dimensional data in machine learning. Its basic purpose is to avoid the "curse of dimensionality" and to reduce the time and space complexity of training. Several techniques, including information-theoretic ones, have been proposed in the literature to measure the information content of a feature. Most of them incrementally select features with maximum dependency on the category but minimum redundancy with the already selected features. A key idea missing from these techniques is fair representation of the different categories among the features with maximum dependency; that is, the selection is skewed toward features having high mutual information (MI) with one particular class. This can result in classification biased in favor of that class, while the other classes obtain low matching scores. We propose a novel information-theoretic approach that selects features in a class-wise fashion rather than by their global maximum dependency. In addition, a constrained search is used instead of a global sequential forward search. We prove that the proposed approach enhances Maximum Relevance while preserving Minimum Redundancy under a constrained search. Results on multiple benchmark datasets show that the proposed method improves accuracy compared to other state-of-the-art feature selection algorithms while having a lower time complexity.
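The abstract does not spell out the algorithm, so the sketch below is only a rough illustration of the class-wise idea it describes, not the paper's actual CCFS procedure: each feature is scored by its mutual information with each class's one-vs-rest indicator, and features are then picked round-robin over classes so every class contributes its most relevant remaining feature. The function names (`mutual_info`, `classwise_select`) and the round-robin scheme are assumptions made for illustration; the paper's constrained search and redundancy handling are omitted.

```python
import numpy as np

def mutual_info(x, y):
    """Empirical mutual information (in nats) between two discrete arrays."""
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(y):
            pxy = np.mean((x == xv) & (y == yv))
            if pxy > 0:
                px = np.mean(x == xv)
                py = np.mean(y == yv)
                mi += pxy * np.log(pxy / (px * py))
    return mi

def classwise_select(X, y, k):
    """Pick k feature indices, cycling over classes so each class in turn
    contributes its most relevant remaining feature (fair representation)."""
    k = min(k, X.shape[1])
    classes = np.unique(y)
    # Relevance of every feature to every one-vs-rest class indicator.
    rel = {c: [mutual_info(X[:, j], (y == c).astype(int))
               for j in range(X.shape[1])]
           for c in classes}
    selected = []
    while len(selected) < k:
        for c in classes:
            if len(selected) >= k:
                break
            remaining = [j for j in range(X.shape[1]) if j not in selected]
            selected.append(max(remaining, key=lambda j: rel[c][j]))
    return selected
```

For example, with one feature that perfectly indicates class 0 and another that indicates class 1, `classwise_select(X, y, 2)` returns those two features even if a purely global ranking would have favored features tied to a single dominant class.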
Pages: 3211-3224
Page count: 14
Related Papers
50 records in total
  • [21] Feature selection considering two types of feature relevancy and feature interdependency
    Hu, Liang
    Gao, Wanfu
    Zhao, Kuo
    Zhang, Ping
    Wang, Feng
    EXPERT SYSTEMS WITH APPLICATIONS, 2018, 93 : 423 - 434
  • [22] An Empirical Evaluation of Constrained Feature Selection
    Bach, J.
    Zoller, K.
    Trittenbach, H.
    Schulz, K.
    Böhm, K.
    SN COMPUTER SCIENCE, 3 (6)
  • [23] Feature selection by integrating two groups of feature evaluation criteria
    Gao, Wanfu
    Hu, Liang
    Zhang, Ping
    Wang, Feng
    EXPERT SYSTEMS WITH APPLICATIONS, 2018, 110 : 11 - 19
  • [24] CCFS: A Confidence-Based Cost-Effective Feature Selection Scheme for Healthcare Data Classification
    Chen, Yiyuan
    Wang, Yufeng
    Cao, Liang
    Jin, Qun
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2021, 18 (03) : 902 - 911
  • [25] Feature selection for classification with class-separability strategy and data envelopment analysis
    Zhang, Yishi
    Yang, Chao
    Yang, Anrong
    Xiong, Chan
    Zhou, Xingchi
    Zhang, Zigang
    NEUROCOMPUTING, 2015, 166 : 172 - 184
  • [26] Feature selection method with joint maximal information entropy between features and class
    Zheng, Kangfeng
    Wang, Xiujuan
    PATTERN RECOGNITION, 2018, 77 : 20 - 29
  • [27] Mutual information based multi-label feature selection via constrained convex optimization
    Sun, Zhenqiang
    Zhang, Jia
    Dai, Liang
    Li, Candong
    Zhou, Changen
    Xin, Jiliang
    Li, Shaozi
    NEUROCOMPUTING, 2019, 329 : 447 - 456
  • [28] An Evaluation of Feature Selection Robustness on Class Noisy Data
    Pau, Simone
    Perniciano, Alessandra
    Pes, Barbara
    Rubattu, Dario
    Jia, Heming
    INFORMATION, 2023, 14 (08)
  • [29] Feature selection with multi-class logistic regression
    Wang, Jingyu
    Wang, Hongmei
    Nie, Feiping
    Li, Xuelong
    NEUROCOMPUTING, 2023, 543
  • [30] Feature selection considering weighted relevancy
    Zhang, Ping
    Gao, Wanfu
    Liu, Guixia
    APPLIED INTELLIGENCE, 2018, 48 : 4615 - 4625