A comparative analysis of methods for pruning decision trees

Cited by: 303
Authors
Esposito, F
Malerba, D
Semeraro, G
Affiliation
[1] Dipartimento di Informatica, Università Degli Studi di Bari, 70126 Bari
Keywords
decision trees; top-down induction of decision trees; simplification of decision trees; pruning and grafting operators; optimal pruning; comparative studies;
DOI
10.1109/34.589207
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the problem of retrospectively pruning decision trees induced from data, according to a top-down approach. This problem has received considerable attention in the areas of pattern recognition and machine learning, and many distinct methods have been proposed in literature. We make a comparative study of six well-known pruning methods with the aim of understanding their theoretical foundations, their computational complexity, and the strengths and weaknesses of their formulation. Comments on the characteristics of each method are empirically supported. In particular, a wide experimentation performed on several data sets leads us to opposite conclusions on the predictive accuracy of simplified trees from some drawn in the literature. We attribute this divergence to differences in experimental designs. Finally, we prove and make use of a property of the reduced error pruning method to obtain an objective evaluation of the tendency to overprune/underprune observed in each method.
Pages: 476-491
Page count: 16
Related papers
50 results in total
  • [1] Restricted multi-pruning of decision trees
    Azad, Mohammad
    Chikalov, Igor
    Moshkov, Mikhail
    Hussain, Shahid
    DATA SCIENCE AND KNOWLEDGE ENGINEERING FOR SENSING DECISION SUPPORT, 2018, 11 : 371 - 378
  • [2] A dynamic programming based pruning method for decision trees
    Li, XB
    Sweigart, J
    Teng, J
    Donohue, J
    Thombs, L
    INFORMS JOURNAL ON COMPUTING, 2001, 13 (04) : 332 - 344
  • [3] Selective Rademacher penalization and reduced error pruning of decision trees
    Kääriäinen, M
    Malinen, T
    Elomaa, T
    JOURNAL OF MACHINE LEARNING RESEARCH, 2004, 5 : 1107 - 1126
  • [4] A heuristic for learning decision trees and pruning them into classification rules
    Ranilla, J
    Luaces, O
    Bahamonde, A
    AI COMMUNICATIONS, 2003, 16 (02) : 71 - 87
  • [5] Landslide Susceptibility Assessment Using Bagging Ensemble Based Alternating Decision Trees, Logistic Regression and J48 Decision Trees Methods: A Comparative Study
    Pham B.T.
    Tien Bui D.
    Prakash I.
    Geotechnical and Geological Engineering, 2017, 35 (6) : 2597 - 2611
  • [6] Game Trees For Decision Analysis
    Prakash P. Shenoy
    Theory and Decision, 1998, 44 : 149 - 171
  • [7] Game trees for decision analysis
    Shenoy, PP
    THEORY AND DECISION, 1998, 44 (02) : 149 - 171
  • [8] Support Vector Machine Pre-pruning Approaches on Decision Trees for Better Classification
    Sim, Doreen Ying Ying
    PROCEEDINGS OF 2019 2ND INTERNATIONAL CONFERENCE ON ELECTRONICS AND ELECTRICAL ENGINEERING TECHNOLOGY (EEET 2019), 2019, : 30 - 36
  • [9] Are decision trees way around some statistic methods?
    Zorman, M
    Kokol, P
    Stiglic, MM
    Gregoric, A
    PROCEEDINGS OF THE 20TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOL 20, PTS 1-6: BIOMEDICAL ENGINEERING TOWARDS THE YEAR 2000 AND BEYOND, 1998, 20 : 1198 - 1201
  • [10] A framework for sensitivity analysis of decision trees
    Kaminski, Bogumil
    Jakubczyk, Michal
    Szufel, Przemyslaw
    CENTRAL EUROPEAN JOURNAL OF OPERATIONS RESEARCH, 2018, 26 (01) : 135 - 159