Generalized Information-Theoretic Criterion for Multi-Label Feature Selection

Cited by: 14
Authors
Seo, Wangduk [1]
Kim, Dae-Won [1]
Lee, Jaesung [1]
Affiliations
[1] Chung Ang Univ, Sch Comp Sci & Engn, Seoul 06974, South Korea
Funding
National Research Foundation of Singapore
Keywords
Machine learning; multi-label learning; multi-label feature selection; information entropy; mutual information; classification; algorithm
DOI
10.1109/ACCESS.2019.2927400
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Multi-label feature selection that identifies important features from the original feature set of multi-labeled datasets has been attracting considerable attention owing to its generality compared to conventional single-label feature selection. The unimportant features are filtered by scoring the dependency of features to labels. In conventional multi-label feature filter studies, the score function is obtained by approximating a dependency measure such as joint entropy because direct calculation is often impractical due to the presence of multiple labels with limited training patterns. Although the efficacy of approximation can differ depending on the characteristics of the multi-label dataset, conventional methods presume a certain approximation method, leading to a degenerated feature subset if the presumed approximation is inappropriate for the given dataset. In this study, we propose a strategy for selecting an approximation among a series of approximations depending on the number of involved variables and consequently instantiate a score function based on the chosen approximation. The experimental results demonstrate that the proposed strategy and score function outperform conventional multi-label feature selection methods.
Pages: 122854-122863
Number of pages: 10
Related papers
36 in total
[1]   Learning multi-label scene classification [J].
Boutell, MR ;
Luo, JB ;
Shen, XP ;
Brown, CM .
PATTERN RECOGNITION, 2004, 37 (09) :1757-1771
[2]  
Demsar J, 2006, J MACH LEARN RES, V7, P1
[3]  
Diplaris S, 2005, LECT NOTES COMPUT SC, V3746, P448
[4]   Mutual information-based feature selection for multilabel classification [J].
Doquire, Gauthier ;
Verleysen, Michel .
NEUROCOMPUTING, 2013, 122 :148-155
[5]   NONNEGATIVE ENTROPY MEASURES OF MULTIVARIATE SYMMETRIC CORRELATIONS [J].
HAN, TS .
INFORMATION AND CONTROL, 1978, 36 (02) :133-156
[6]   Manifold-based constraint Laplacian score for multi-label feature selection [J].
Huang, Rui ;
Jiang, Weidong ;
Sun, Guangling .
PATTERN RECOGNITION LETTERS, 2018, 112 :346-352
[7]  
Kashef S., 2018, REV DATA MINING KNOW, V8, pe1240
[8]  
Klimt B, 2004, LECT NOTES COMPUT SC, V3201, P217
[9]   gMLC: a multi-label feature selection framework for graph classification [J].
Kong, Xiangnan ;
Yu, Philip S. .
KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 31 (02) :281-305
[10]   Approximating mutual information for multi-label feature selection [J].
Lee, J. ;
Lim, H. ;
Kim, D. -W. .
ELECTRONICS LETTERS, 2012, 48 (15) :929-930