Feature-specific mutual information variation for multi-label feature selection

Cited by: 80
Authors
Hu, Liang [1, 2]
Gao, Lingbo [1, 2]
Li, Yonghao [1, 2]
Zhang, Ping [1, 2]
Gao, Wanfu [1, 2, 3]
Affiliations
[1] Jilin Univ, Coll Comp Sci & Technol, Changchun 130012, Peoples R China
[2] Jilin Univ, Key Lab Symbol Computat & Knowledge Engn, Minist Educ, Changchun 130012, Peoples R China
[3] Jilin Univ, Coll Chem, Changchun 130012, Peoples R China
Keywords
Multi-label feature selection; Information theory; Feature relevance; Changed ratio; Relevance based weight; ALGORITHM
DOI
10.1016/j.ins.2022.02.024
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Recent years have witnessed an urgent need to address the curse of dimensionality in multi-label data, which has drawn wide attention to feature selection. In previous information-theoretic multi-label feature selection approaches, feature relevance terms are often constructed from the amount of information that selected or candidate features contribute to the label set. Although the amount of information matters, these approaches ignore both the changed ratio of the undetermined amount of information and the changed ratio of the established amount of information; neither of these two changed ratios should be underestimated in feature relevance evaluation. To this end, we devise a new feature relevance term, Relevance based on Weight (RW), which is built on both changed ratios. Each of the two changed ratios can have a positive or negative impact on feature relevance evaluation. A novel multi-label feature selection approach, Relevance based on Weight Feature Selection (RWFS), is proposed based on RW. To verify its effectiveness, the proposed approach is compared with eight state-of-the-art multi-label approaches on thirteen real-world data sets. The experimental results show that RWFS outperforms the eight compared approaches. (C) 2022 Elsevier Inc. All rights reserved.
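The abstract notes that earlier information-theoretic approaches score a feature by the amount of information it contributes to the label set. As a rough illustration of that baseline idea only (not the RW term or RWFS, whose formulas are not given in this record), the following Python sketch sums the mutual information between one discrete feature and each label. The function names `mutual_information` and `label_set_relevance` are hypothetical, and the summed-MI score is an assumed stand-in for such a relevance term.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Empirical I(X;Y) in bits for two equal-length sequences of discrete values."""
    n = len(x)
    px = Counter(x)            # marginal counts of X
    py = Counter(y)            # marginal counts of Y
    pxy = Counter(zip(x, y))   # joint counts of (X, Y)
    # p(a,b) * log2( p(a,b) / (p(a) p(b)) ), written in terms of raw counts
    return sum(
        (c / n) * log2(c * n / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )


def label_set_relevance(feature, labels):
    """Assumed baseline score: sum of I(feature; label) over all label columns.

    This is an illustrative stand-in, not the RW term from the paper.
    """
    return sum(mutual_information(feature, label) for label in labels)


# Toy data: the feature determines the first label and is independent of the second.
feature = [0, 0, 1, 1]
labels = [
    [0, 0, 1, 1],
    [0, 1, 0, 1],
]
score = label_set_relevance(feature, labels)
print(score)
```

On this toy input the feature shares one bit with the first label and none with the second, so the score reflects only the first label. A score of this kind measures an amount of information; the changed ratios that the abstract says RW adds are not modeled here.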
Pages: 449-471
Page count: 23