Semi-supervised attribute reduction based on label distribution and label irrelevance

Cited by: 19
Authors
Dai, Jianhua [1]
Huang, Weiyi
Wang, Weisi
Zhang, Chucai
Affiliations
[1] Hunan Normal Univ, Hunan Prov Key Lab Intelligent Comp & Language Inf, Changsha 410081, Peoples R China
Keywords
Attribute reduction; Fuzzy similarity relation; Semi-supervised; Label distribution; Label irrelevance; Rough set theory; Feature selection; Knowledge granulation; Conditional entropy
DOI
10.1016/j.inffus.2023.101951
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Attribute reduction in partially labeled data, also called semi-supervised attribute reduction, is an important issue. In recent years, the research on semi-supervised attribute reduction has attracted the attention of many scholars. Unfortunately, most existing semi-supervised attribute reduction methods do not handle the information loss caused by missing labels well. Meanwhile, these methods in general only consider the relevance between attributes and labels to measure attribute correlations, which ignores the irrelevant information contained in the attributes with respect to the labels. In view of this, this paper proposes a novel semi-supervised attribute reduction algorithm considering attribute relevance, redundancy and label irrelevance from the perspective of label distribution. Firstly, the membership degree of unlabeled objects relative to labels is defined by fuzzy similarity relation, which implements information restoration and converts partially labeled data into label distribution data. Secondly, some fuzzy uncertainty measures for label distribution are defined and related properties are investigated accordingly. Additionally, considering that irrelevant information brought by attributes may lead to over-fitting, label irrelevance criterion based on fuzzy uncertainty measures is constructed. Thirdly, a novel semi-supervised attribute reduction algorithm via the maximum relevance, minimum redundancy, and minimum irrelevance is proposed. Finally, compared with the representative semi-supervised attribute reduction algorithms and supervised attribute reduction algorithm, the effectiveness of the proposed algorithm is verified by various experiments.
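The first step described in the abstract, turning partially labeled data into label distribution data through a fuzzy similarity relation, can be illustrated with a minimal sketch. This is not the paper's definition: the similarity measure (one minus the mean absolute difference over attributes scaled to [0, 1]), the use of the maximum similarity to each class, and the normalization are all assumptions chosen for illustration, and the function names `fuzzy_similarity` and `to_label_distribution` are hypothetical.

```python
from typing import List, Optional, Sequence


def fuzzy_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed fuzzy similarity: 1 - mean absolute difference.

    Assumes every attribute has already been scaled to [0, 1].
    """
    return 1.0 - sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def to_label_distribution(
    X: List[Sequence[float]],
    y: List[Optional[int]],
    n_labels: int,
) -> List[List[float]]:
    """Convert partially labeled data into per-object label distributions.

    Labeled objects keep a one-hot distribution. For an unlabeled object
    (label None), the membership degree for each label is taken as the
    highest similarity to any labeled object of that label, then the
    degrees are normalized to sum to 1 (an illustrative assumption).
    """
    labeled = [(xi, yi) for xi, yi in zip(X, y) if yi is not None]
    distributions = []
    for xi, yi in zip(X, y):
        if yi is not None:
            row = [0.0] * n_labels
            row[yi] = 1.0
        else:
            scores = [0.0] * n_labels
            for xj, yj in labeled:
                scores[yj] = max(scores[yj], fuzzy_similarity(xi, xj))
            total = sum(scores)
            # Fall back to a uniform distribution if nothing is similar.
            if total > 0:
                row = [s / total for s in scores]
            else:
                row = [1.0 / n_labels] * n_labels
        distributions.append(row)
    return distributions


# Two labeled objects and one unlabeled object lying closer to label 0.
X = [[0.0, 0.0], [1.0, 1.0], [0.25, 0.25]]
y = [0, 1, None]
dist = to_label_distribution(X, y, n_labels=2)
print(dist[2])  # [0.75, 0.25]
```

The unlabeled object receives a soft label instead of being discarded, which is the sense in which the abstract speaks of information restoration.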
Pages: 16