Fast and Robust Attribute Reduction Based on the Separability in Fuzzy Decision Systems

Cited by: 73
Authors
Hu, Meng [1 ]
Tsang, Eric C. C. [1 ]
Guo, Yanting [1 ]
Xu, Weihua [2 ]
Affiliations
[1] Macau Univ Sci & Technol, Fac Informat Technol, Macau, Peoples R China
[2] Southwest Univ, Coll Artificial Intelligence, Chongqing 400715, Peoples R China
Keywords
Entropy; Kernel; Task analysis; Rough sets; Machine learning; Redundancy; Mutual information; Attribute reduction; fuzzy decision systems; fuzzy membership; separability; FEATURE-SELECTION; MUTUAL INFORMATION; ENTROPY; CLASSIFICATION; UNCERTAINTY; CLASSIFIERS; SETS;
DOI
10.1109/TCYB.2020.3040803
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Attribute reduction is one of the most important preprocessing steps in machine learning and data mining. As a key step of attribute reduction, attribute evaluation directly affects classification performance, search time, and the stopping criterion. Existing evaluation functions depend heavily on the relationships between objects, which makes them costly in both computation time and memory. To solve this problem, we propose a novel separability-based evaluation function and reduction method that use the relationship between objects and decision categories directly. The degree of aggregation (DA) of intraclass objects and the degree of dispersion (DD) of between-class objects are first defined to measure the significance of an attribute subset. Then, the separability of attribute subsets is defined by DA and DD in fuzzy decision systems, and we design a sequential forward selection based on separability (SFSS) algorithm to select attributes. Furthermore, a postpruning strategy is introduced to prevent overfitting and determine a termination parameter. Finally, the SFSS algorithm is compared with some typical reduction algorithms on public datasets from the UCI and ELVIRA Biomedical repositories. The interpretability of SFSS is illustrated by its performance on MNIST handwritten digits. The experimental comparisons show that SFSS is fast and robust: it achieves higher classification accuracy and compression ratio with extremely low computational time.
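The selection loop described in the abstract can be illustrated with a minimal sketch. The abstract does not give the definitions of DA, DD, or the separability measure, so the score below is a hypothetical stand-in (distance to the nearest other-class centroid minus distance to the own-class centroid), and the names `separability`, `forward_select`, and `min_gain` are assumptions made for illustration only. The postpruning step is not modelled; a fixed gain threshold takes its place.

```python
from statistics import mean


def _dist(u, v):
    # Mean absolute difference, so the score does not grow just
    # because more attributes are included.
    return mean(abs(a - b) for a, b in zip(u, v))


def separability(X, y, attrs):
    """Hypothetical stand-in score, NOT the paper's DA/DD definition.

    For each object: distance to the nearest other-class centroid
    (dispersion proxy) minus distance to its own class centroid
    (aggregation proxy), averaged over all objects.
    Requires at least two distinct class labels in y.
    """
    labels = sorted(set(y))
    cents = {
        c: [mean(x[a] for x, t in zip(X, y) if t == c) for a in attrs]
        for c in labels
    }
    scores = []
    for x, t in zip(X, y):
        p = [x[a] for a in attrs]
        own = _dist(p, cents[t])
        other = min(_dist(p, cents[c]) for c in labels if c != t)
        scores.append(other - own)
    return mean(scores)


def forward_select(X, y, min_gain=1e-6):
    """Greedy sequential forward selection driven by the score above.

    min_gain is an assumed termination threshold; the paper instead
    determines its termination parameter with a postpruning strategy.
    """
    remaining = list(range(len(X[0])))
    selected, best = [], float("-inf")
    while remaining:
        # Try each remaining attribute and keep the best-scoring one.
        score, attr = max(
            (separability(X, y, selected + [a]), a) for a in remaining
        )
        if score - best <= min_gain:
            break  # no attribute improves the score enough
        selected.append(attr)
        remaining.remove(attr)
        best = score
    return selected


# Attribute 0 separates the two classes; attribute 1 carries no class signal.
X = [[0.1, 0.5], [0.2, 0.4], [0.9, 0.5], [0.8, 0.4]]
y = [0, 0, 1, 1]
print(forward_select(X, y))
```

Because the score only compares each object with per-class centroids, its cost grows with the number of objects times the number of classes rather than with the number of object pairs, which is the kind of saving the abstract attributes to using object-to-category relationships directly.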
Pages: 5559-5572
Number of pages: 14