Multi-Granularity Ensemble Classification Algorithm Based on Attribute Representation

Cited by: 0
Authors
Zhang Q.-H. [1, 2]
Zhi X.-C. [1, 2]
Wang G.-Y. [1, 2]
Yang F. [3]
Xue F.-Z. [3]
Affiliations
[1] Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism, Chongqing
[2] Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing
[3] School of Public Health, Shandong University
Source
Jisuanji Xuebao/Chinese Journal of Computers | 2022 / Vol. 45 / No. 08
Funding
National Natural Science Foundation of China
Keywords
Attribute representation; Dynamic classification; Ensemble learning; Multi-granularity; Sequential three-way decisions;
DOI
10.11897/SP.J.1016.2022.01712
Abstract
Faced with complex and changing information systems, traditional multi-class classification models in machine learning cannot carry out a dynamic classification process, and so cannot handle problems such as disease diagnosis. Because some diagnostic procedures are expensive, it is necessary to judge from preliminary examinations whether a patient is likely to be ill, thereby reducing the cost of the process. Sequential three-way decisions, a multi-granularity classification approach, are used to solve dynamic classification problems in a multi-granularity space. The sequential three-way decision model sorts attributes by balancing the cost of the decision results against the cost of the decision process, and then constructs a multi-level granularity space. As information is injected step by step, objects that meet the conditions are classified at different granularity levels. In this way the model addresses the problem of excessive decision-process cost, and many researchers have optimized it from a cost-sensitive perspective. However, in some cases the sequential three-way decision model is prone to decision conflicts in the coarse granularity space, that is, the same object receives several different classification results. Many attributes must then be considered in the fine granularity space, which leads to low classification efficiency. Moreover, lacking further information and corresponding strategies, the model cannot process the objects that remain unclassified at the end. This paper therefore combines the ideas of ensemble learning and granular computing to propose a multi-granularity ensemble classification algorithm based on attribute representation.
First, classifiers are constructed by selecting the attribute representatives of each granularity layer, forming an ensemble classifier based on attribute representatives. Synthesizing the differing opinions of these classifiers effectively reduces decision conflicts in each granularity layer. Second, the classification opinions of the classifiers in the coarse granularity space are retained in a scoring table, which reduces the number of attributes that need to be considered in the fine granularity space. The retained scores help the ensemble classifier in the fine granularity space avoid likely errors and thus obtain a more confident classification result. Finally, because some objects may remain unclassified after all the information has been injected, a "relatively optimal" strategy is adopted: the decision class with the lowest objection rate is taken as the final classification result for unclassified objects. To verify the validity of the proposed model, 14 UCI data sets and 6 real data sets related to medical diagnosis are used in horizontal and vertical comparison experiments, respectively. The horizontal comparison covers ten popular multi-class classification algorithms. The experiments show that the proposed method has better robustness, classification efficiency and classification performance than sequential three-way decisions and the other machine learning multi-class classification algorithms. Moreover, the improvement is significant on the real medical diagnosis data sets. © 2022, Science Press. All rights reserved.
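The scoring-table and "relatively optimal" ideas described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `classify_sequentially`, the representation of each classifier's opinion as a set of rejected classes, and the early-stopping rule (stop when exactly one class has no objections) are all assumptions made for illustration only.

```python
def classify_sequentially(levels, classes):
    """Hypothetical sketch of score retention across granularity levels.

    levels:  list of granularity levels, coarse to fine; each level is a list
             of sets, one per classifier, holding the classes it objects to.
    classes: list of candidate decision classes.
    Returns (label, depth); depth is None when the fallback rule was used.
    """
    objections = {c: 0 for c in classes}  # scoring table kept across levels
    total_opinions = 0

    for depth, level in enumerate(levels, start=1):
        for rejected in level:
            total_opinions += 1
            for c in rejected:
                objections[c] += 1
        # Assumed stopping rule: decide once a single class is unopposed.
        unopposed = [c for c in classes if objections[c] == 0]
        if len(unopposed) == 1:
            return unopposed[0], depth

    if total_opinions == 0:
        raise ValueError("no classifier opinions were supplied")

    # Fallback: pick the class with the lowest objection rate.
    rates = {c: objections[c] / total_opinions for c in classes}
    return min(classes, key=lambda c: rates[c]), None


# Classified at the coarse level: only "A" has no objections.
early = classify_sequentially([[{"B"}, {"C"}], [{"B", "C"}]], ["A", "B", "C"])

# Never unopposed, so the least-objected class "B" is returned.
late = classify_sequentially(
    [[{"A", "B"}, {"C"}], [{"A"}, {"C"}]], ["A", "B", "C"]
)
```

Because the table persists between levels, opinions gathered in the coarse space still count when finer levels are examined, which is the role the abstract assigns to the retained scores.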
Pages: 1712-1729
Page count: 17