Pseudo-label neighborhood rough set: Measures and attribute reductions

Times cited: 124
Authors
Yang, Xibei [1, 2]
Liang, Shaochen [1]
Yu, Hualong [1]
Gao, Shang [1]
Qian, Yuhua [3, 4]
Affiliations
[1] Jiangsu Univ Sci & Technol, Sch Comp, Zhenjiang 212003, Peoples R China
[2] Nanjing Univ Sci & Technol, Sch Econ & Management, Nanjing 210099, Jiangsu, Peoples R China
[3] Shanxi Univ, Sch Comp & Informat Technol, Taiyuan 030006, Shanxi, Peoples R China
[4] Shanxi Univ, Intelligent Informat Proc Key Lab Shanxi Prov, Taiyuan 030006, Shanxi, Peoples R China
Keywords
Attribute reduction; Conditional discrimination index; Conditional entropy; Neighborhood decision error rate; Neighborhood rough set; Pseudo-label; FEATURE-SELECTION; CONDITIONAL ENTROPY; GRANULATION; APPROXIMATIONS; MODELS;
DOI
10.1016/j.ijar.2018.11.010
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The radius used to construct the neighborhood relation strongly affects the results of neighborhood rough sets and the corresponding measures. A very small radius frequently yields nothing useful, because any two different samples are separated from each other even when they have the same label. As the radius grows, there is a serious risk that samples with different labels fall into the same neighborhood. The radius-based neighborhood relation does not take the labels of samples into account, which leads to unsatisfactory discrimination. To fill this gap, a pseudo-label strategy is systematically studied in rough set theory. First, a pseudo-label neighborhood relation is proposed. This relation differentiates samples by both their distance and their pseudo labels, so the neighborhood rough set and some corresponding measures can be redefined. Second, attribute reductions are explored based on the redefined measures, and a heuristic algorithm is designed to compute reducts. Finally, experimental results on UCI data sets show that the pseudo-label strategy is superior to the traditional neighborhood approach, mainly because it significantly reduces the uncertainties and improves the classification accuracies. The Wilcoxon signed-rank test results also show that the neighborhood approach and the pseudo-label neighborhood approach differ significantly with respect to the measures and attribute reductions in rough set theory. (C) 2018 Elsevier Inc. All rights reserved.
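The contrast the abstract draws between a radius-only neighborhood and a pseudo-label neighborhood can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the toy data, the hand-assigned pseudo labels (in practice they might come from a clustering step), and the `consistency_rate` helper, which is a simple stand-in and not one of the paper's measures.

```python
import math

def neighborhood(X, i, radius):
    """Indices of samples within `radius` of sample i (Euclidean distance)."""
    return {j for j in range(len(X)) if math.dist(X[i], X[j]) <= radius}

def pseudo_label_neighborhood(X, i, radius, pseudo):
    """Assumed form of the relation: a neighbor must also share the pseudo label."""
    return {j for j in neighborhood(X, i, radius) if pseudo[j] == pseudo[i]}

def consistency_rate(neighborhoods, y):
    """Illustrative stand-in measure: share of samples whose whole
    neighborhood carries the sample's own decision label."""
    ok = sum(all(y[j] == y[i] for j in nb) for i, nb in enumerate(neighborhoods))
    return ok / len(y)

# Toy one-dimensional data; the labels change between the 2nd and 3rd sample.
X = [(0.0,), (0.1,), (0.2,), (0.3,), (0.9,), (1.0,)]
y = [0, 0, 1, 1, 1, 1]
pseudo = [0, 0, 1, 1, 2, 2]  # hypothetical pseudo labels, e.g. from clustering
radius = 0.15

plain = [neighborhood(X, i, radius) for i in range(len(X))]
refined = [pseudo_label_neighborhood(X, i, radius, pseudo) for i in range(len(X))]

plain_rate = consistency_rate(plain, y)
pseudo_rate = consistency_rate(refined, y)
print(plain_rate, pseudo_rate)
```

In this toy case the radius-only neighborhoods of the second and third samples mix both decision labels, while the pseudo-label restriction separates them. How much the restriction helps on real data depends on how well the pseudo labels track the true ones.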
Pages: 112-129
Page count: 18
Related papers
70 in total
[1]  
[Anonymous], 2006 IEEE INT C
[2]  
[Anonymous], 2012, P 18 ACM SIGKDD INT
[3]   Attribute Reduction for Heterogeneous Data Based on the Combination of Classical and Fuzzy Rough Set Models [J].
Chen, Degang ;
Yang, Yanyan .
IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2014, 22 (05) :1325-1334
[4]   Parallel attribute reduction in dominance-based neighborhood rough set [J].
Chen, Hongmei ;
Li, Tianrui ;
Cai, Yong ;
Luo, Chuan ;
Fujita, Hamido .
INFORMATION SCIENCES, 2016, 373 :351-368
[5]   Generalized rough set models determined by multiple neighborhoods generated from a similarity relation [J].
Dai, Jianhua ;
Gao, Shuaichao ;
Zheng, Guojie .
SOFT COMPUTING, 2018, 22 (07) :2081-2094
[6]   Attribute selection based on a new conditional entropy for incomplete decision systems [J].
Dai, Jianhua ;
Wang, Wentao ;
Tian, Haowei ;
Liu, Liang .
KNOWLEDGE-BASED SYSTEMS, 2013, 39 :207-213
[7]   Conditional entropy for incomplete decision systems and its application in data mining [J].
Dai, Jianhua ;
Xu, Qing ;
Wang, Wentao ;
Tian, Haowei .
INTERNATIONAL JOURNAL OF GENERAL SYSTEMS, 2012, 41 (07) :713-728
[8]  
Demsar J, 2006, J MACH LEARN RES, V7, P1
[9]   Evidential clustering of large dissimilarity data [J].
Denoeux, Thierry ;
Sriboonchitta, Songsak ;
Kanjanatarakul, Orakanya .
KNOWLEDGE-BASED SYSTEMS, 2016, 106 :179-195
[10]   Quick general reduction algorithms for inconsistent decision tables [J].
Ge Hao ;
Li Longshu ;
Xu Yi ;
Yang Chuanjian .
INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2017, 82 :56-80