Multilabel all-relevant feature selection using lower bounds of conditional mutual information

Cited by: 5
Authors
Teisseyre, Pawel [1 ,2 ]
Lee, Jaesung [3 ,4 ]
Affiliations
[1] Polish Acad Sci, Inst Comp Sci, Warsaw, Poland
[2] Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland
[3] Chung Ang Univ, Dept Artificial Intelligence, Seoul, South Korea
[4] Chung Ang Univ, AI ML Res Innovat Ctr, Seoul, South Korea
Keywords
Multilabel data analysis; Feature selection; Information theory; Conditional mutual information; Permutation tests; LABEL FEATURE-SELECTION; EFFICIENT FEATURE-SELECTION; GENE-GENE INTERACTIONS; CLASSIFIER CHAINS; DETECT;
DOI
10.1016/j.eswa.2022.119436
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We consider the multilabel all-relevant feature selection task, which is more general than the classical minimal-optimal subset task. Whereas the goal of minimal-optimal methods is to find the smallest subset of features allowing accurate prediction of the labels, the objective of all-relevant methods is to identify all features related to the target labels, including strongly relevant features and all weakly relevant features. The all-relevant task has received much interest in fields where discovering the dependency structure between features and target variables is more important than the prediction itself, e.g., in medical and bioinformatics applications. In this paper, we formally describe the all-relevant problem for multilabel classification using an information-theoretic approach. We propose a relevancy score and an efficient method for its calculation based on lower bounds of conditional mutual information. Another practical issue is how to separate the relevant features from the irrelevant ones; to find a threshold, we propose a testing procedure based on a permutation scheme. Finally, the empirical evaluation of all-relevant methods requires a specific approach, so we consider a large variety of simulated datasets representing different dependency structures and containing various types of interactions. Empirical results on the simulated datasets and a large clinical database demonstrate that the proposed method can successfully identify relevant features.
Pages: 19
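
The abstract above describes the method only at a high level. As a rough illustration of the permutation-based thresholding idea, the sketch below scores each feature with a simple stand-in relevancy measure (the maximum over labels of the marginal mutual information I(X_j; Y_k), estimated with scikit-learn's mutual_info_classif) and compares it against a permutation null obtained by shuffling each feature column. The score, the function names (relevancy_scores, permutation_select), and all parameter defaults are illustrative assumptions; the paper's actual score is built from lower bounds of conditional mutual information, and its permutation scheme may differ.

```python
# Illustrative sketch only: a simple per-feature relevancy score for multilabel
# data plus a permutation test to choose the relevance threshold. The score
# below (max over labels of the marginal mutual information I(X_j; Y_k)) is a
# stand-in and is NOT the paper's lower bound of conditional mutual information.
import numpy as np
from sklearn.feature_selection import mutual_info_classif


def relevancy_scores(X, Y, random_state=0):
    """Score feature j by max_k I(X_j; Y_k) over all labels Y_k."""
    mi = np.column_stack([
        mutual_info_classif(X, Y[:, k], random_state=random_state)
        for k in range(Y.shape[1])
    ])                                         # shape: (n_features, n_labels)
    return mi.max(axis=1)


def permutation_select(X, Y, n_perm=100, alpha=0.05, seed=0):
    """Flag features whose observed score exceeds the (1 - alpha) quantile
    of a null distribution built by permuting each feature column, which
    breaks any feature-label dependence while keeping the label structure."""
    rng = np.random.default_rng(seed)
    observed = relevancy_scores(X, Y)
    null = np.empty((n_perm, X.shape[1]))
    for b in range(n_perm):
        X_perm = rng.permuted(X, axis=0)       # shuffle each column independently
        null[b] = relevancy_scores(X_perm, Y)
    threshold = np.quantile(null, 1.0 - alpha, axis=0)
    return observed > threshold                # boolean mask of selected features
```

For example, with X of shape (n_samples, n_features) and a binary label matrix Y of shape (n_samples, n_labels), permutation_select(X, Y) returns a boolean mask marking the features whose score exceeds the permutation-based threshold.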