Multilabel all-relevant feature selection using lower bounds of conditional mutual information

Cited by: 7
Authors
Teisseyre, Pawel [1, 2]
Lee, Jaesung [3, 4]
Affiliations
[1] Polish Acad Sci, Inst Comp Sci, Warsaw, Poland
[2] Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland
[3] Chung Ang Univ, Dept Artificial Intelligence, Seoul, South Korea
[4] Chung Ang Univ, AI ML Res Innovat Ctr, Seoul, South Korea
Keywords
Multilabel data analysis; Feature selection; Information theory; Conditional mutual information; Permutation tests; LABEL FEATURE-SELECTION; EFFICIENT FEATURE-SELECTION; GENE-GENE INTERACTIONS; CLASSIFIER CHAINS; DETECT
DOI
10.1016/j.eswa.2022.119436
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider the multilabel all-relevant feature selection task, which is more general than the classical minimal-optimal subset task. Whereas the goal of minimal-optimal methods is to find the smallest subset of features that allows accurate prediction of the labels, the objective of all-relevant methods is to identify all features related to the target labels, including both strongly and weakly relevant features. The all-relevant task has attracted considerable interest in fields where discovering the dependency structure between features and target variables is more important than prediction itself, e.g., in medical and bioinformatics applications. In this paper, we formally describe the all-relevant problem for multilabel classification using an information-theoretic approach. We propose a relevancy score and an efficient method for calculating it based on lower bounds of conditional mutual information. Another practical issue is how to separate the relevant features from the irrelevant ones. To find a threshold, we propose a testing procedure based on a permutation scheme. Finally, empirical evaluation of all-relevant methods requires a specific approach. We consider a large variety of simulated datasets representing different dependency structures and containing various types of interactions. Empirical results on simulated datasets and a large clinical database demonstrate that the proposed method can successfully identify relevant features.
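The idea of choosing a relevance threshold with a permutation scheme can be illustrated with a minimal sketch. This is not the authors' procedure: it assumes a single discrete label and uses a plain plug-in mutual information estimate as a stand-in for the paper's relevancy score based on lower bounds of conditional mutual information. The function names `mutual_information` and `permutation_p_value` and the parameters `n_perm` and `seed` are hypothetical.

```python
import numpy as np


def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in nats for two discrete 1-D arrays."""
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    # Map raw values to integer codes 0..k-1 so they can index a table.
    x_vals, x_idx = np.unique(x, return_inverse=True)
    y_vals, y_idx = np.unique(y, return_inverse=True)
    # Build the joint frequency table and normalise it to probabilities.
    joint = np.zeros((len(x_vals), len(y_vals)))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of X, shape (k, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of Y, shape (1, m)
    nz = joint > 0  # skip empty cells, whose contribution is zero
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def permutation_p_value(x, y, n_perm=200, seed=0):
    """P-value of the observed score under the null of independence.

    Shuffling x breaks any dependence on y, so the shuffled scores
    approximate the distribution of the score for an irrelevant feature.
    """
    rng = np.random.default_rng(seed)
    observed = mutual_information(x, y)
    null = np.array(
        [mutual_information(rng.permutation(x), y) for _ in range(n_perm)]
    )
    # The +1 terms keep the estimate strictly positive.
    return (1 + np.sum(null >= observed)) / (n_perm + 1)
```

Under these assumptions, a feature would be declared relevant when its p-value falls below a chosen significance level. The scheme in the paper additionally handles multiple labels and conditioning on other features, which this sketch does not attempt.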
Pages: 19