An Improved C4.5 Algorithm in Bagging Integration Model

Cited by: 4
Authors
Song, Yu-Qing [1]
Yao, Xu [1, 2]
Liu, Zhe [1]
Shen, Xianbao [1]
Mao, Jingyi [1]
Affiliations
[1] Jiangsu Univ, Sch Comp Sci & Telecommun, Zhenjiang 212013, Jiangsu, Peoples R China
[2] Jiangsu Univ Sci & Technol, Sch Comp Sci, Zhenjiang 212013, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Bagging integration; C4.5; algorithm; information entropy; split information; DECISION TREE; CREDAL C4.5; ENSEMBLE; CLASSIFIER;
DOI
10.1109/ACCESS.2020.3032291
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The C4.5 algorithm has three shortcomings: the wide range of candidate segmentation threshold sequences for continuous attributes, the combined influence of different attributes and of local subsets under the same attribute, and inter-attribute redundancy. For continuous attributes, sampling and threshold supplementation are performed near the transition boundaries of the attribute intervals that correspond to adjacent different categories, which narrows the range of candidate segmentation threshold sequences. The calculation of the C4.5 information gain is optimized by adding the standardized Euclidean distance of the attribute's global and local factors to represent the attribute weight. Averaging the Gini index of the other attributes and adding a correction factor greatly decreases the influence of redundancy between attributes. The overall average improvement of the base classifier and of the bagging integration classifier is 0.6% to 2.1% and 0.7% to 2.7%, respectively, which shows that this integration model can improve classification accuracy and supports its feasibility and reliability.
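The idea of narrowing candidate thresholds for a continuous attribute can be illustrated with a minimal Python sketch. It is not the authors' method: it uses the classic boundary-point restriction (only midpoints where the class can change between adjacent distinct values) together with the standard C4.5 gain ratio, and it omits the sampling, threshold supplementation, attribute weighting, and Gini-based correction described in the abstract. The function names `boundary_thresholds`, `gain_ratio`, and `best_split` are hypothetical, and scoring thresholds directly by gain ratio is a simplifying assumption.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def boundary_thresholds(values, labels):
    """Midpoints between adjacent distinct values where the class can change.

    A midpoint is skipped only when both neighbouring value groups are pure
    and share the same single class.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, set()).add(y)
    distinct = sorted(groups)
    thresholds = []
    for lo, hi in zip(distinct, distinct[1:]):
        if len(groups[lo] | groups[hi]) > 1:
            thresholds.append((lo + hi) / 2)
    return thresholds


def gain_ratio(values, labels, threshold):
    """Standard C4.5 gain ratio for the binary split `value <= threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    gain = (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))
    # Split information: entropy of the partition sizes themselves.
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain / split_info


def best_split(values, labels):
    """Return (threshold, gain_ratio) over boundary candidates, or None."""
    candidates = boundary_thresholds(values, labels)
    if not candidates:
        return None
    return max(((t, gain_ratio(values, labels, t)) for t in candidates),
               key=lambda pair: pair[1])
```

For six samples with values 1.0 through 6.0 and labels `a, a, a, b, b, b`, an exhaustive search would evaluate five midpoints, while `boundary_thresholds` keeps only 3.5, which is also the split that `best_split` returns.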
Pages: 206866-206875
Page count: 10
Related papers
22 in total
  • [1] AdaptativeCC4.5: Credal C4.5 with a rough class noise estimator
    Abellan, Joaquin
    Mantas, Carlos J.
    Castellano, Javier G.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2018, 92 : 363 - 379
  • [2] Bin N., 2019, J JIANGXI NORMAL U N, V43, P469
  • [3] Performance of Decision Tree C4.5 Algorithm in Student Academic Evaluation
    Budiman, Edy
    Haviluddin
    Dengan, Nataniel
    Kridalaksana, Awang Harsa
    Wati, Masna
    Purnawansyah
    [J]. COMPUTATIONAL SCIENCE AND TECHNOLOGY, ICCST 2017, 2018, 488 : 380 - 389
  • [4] Very Fast C4.5 Decision Tree Algorithm
    Cherfi, Anis
    Nouira, Kaouther
    Ferchichi, Ahmed
    [J]. APPLIED ARTIFICIAL INTELLIGENCE, 2018, 32 (02) : 119 - 137
  • [5] DRCW-OVO: Distance-based relative competence weighting combination for One-vs-One strategy in multi-class problems
    Galar, Mikel
    Fernandez, Alberto
    Barrenechea, Edurne
    Herrera, Francisco
    [J]. PATTERN RECOGNITION, 2015, 48 (01) : 28 - 42
  • [6] Gata W., 2019, P 2 INT C RES ED ADM, P161
  • [7] Hao H, 2018, PROCEEDINGS OF 2018 IEEE 3RD ADVANCED INFORMATION TECHNOLOGY, ELECTRONIC AND AUTOMATION CONTROL CONFERENCE (IAEAC 2018), P1882, DOI 10.1109/IAEAC.2018.8577527
  • [8] Hong DH, 2006, LECT NOTES COMPUT SC, V4223, P241
  • [9] Adaptive Structural Learning of Deep Belief Network for Medical Examination Data and Its Knowledge Extraction by using C4.5
    Kamada, Shin
    Ichimura, Takumi
    Harada, Toshihide
    [J]. 2018 IEEE FIRST INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE), 2018, : 33 - 40
  • [10] A framework for sensitivity analysis of decision trees
    Kaminski, Bogumil
    Jakubczyk, Michal
    Szufel, Przemyslaw
    [J]. CENTRAL EUROPEAN JOURNAL OF OPERATIONS RESEARCH, 2018, 26 (01) : 135 - 159