A comparative analysis of methods for pruning decision trees

Cited by: 306
Authors
Esposito, F
Malerba, D
Semeraro, G
Affiliation
[1] Dipartimento di Informatica, Università Degli Studi di Bari, 70126 Bari
Keywords
decision trees; top-down induction of decision trees; simplification of decision trees; pruning and grafting operators; optimal pruning; comparative studies;
DOI
10.1109/34.589207
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the problem of retrospectively pruning decision trees induced from data according to a top-down approach. This problem has received considerable attention in pattern recognition and machine learning, and many distinct methods have been proposed in the literature. We present a comparative study of six well-known pruning methods with the aim of understanding their theoretical foundations, their computational complexity, and the strengths and weaknesses of their formulation. Comments on the characteristics of each method are empirically supported. In particular, extensive experimentation on several data sets leads us to conclusions about the predictive accuracy of simplified trees that are opposite to some drawn in the literature. We attribute this divergence to differences in experimental design. Finally, we prove and make use of a property of the reduced error pruning method to obtain an objective evaluation of the tendency to overprune or underprune observed in each method.
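For readers unfamiliar with the reduced error pruning method named in the abstract, the following is a minimal sketch of the general idea as it is commonly described: a separate pruning set is passed down an already induced tree, and an internal node is replaced by a leaf whenever doing so does not increase the number of misclassified pruning cases. It is not the authors' implementation. The `Node` class, the function names, and the tie-breaking rule (prune on equal error) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, List, Optional, Sequence, Tuple

Case = Tuple[Sequence[Hashable], str]  # (attribute values, class label)


@dataclass
class Node:
    """Hypothetical tree node; `label` is the majority class seen in training."""
    label: str
    feature: Optional[int] = None            # tested attribute index; None for a leaf
    children: Dict[Hashable, "Node"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return self.feature is None


def predict(node: Node, x: Sequence[Hashable]) -> str:
    """Follow the tests down the tree; fall back to the node label on unseen values."""
    while not node.is_leaf():
        child = node.children.get(x[node.feature])
        if child is None:
            return node.label
        node = child
    return node.label


def reduced_error_prune(node: Node, pruning_set: List[Case]) -> int:
    """Prune bottom-up in place; return the pruned subtree's error count on pruning_set."""
    # Errors made if this node were a leaf predicting its majority label.
    leaf_errors = sum(1 for _, y in pruning_set if y != node.label)
    if node.is_leaf():
        return leaf_errors

    # Route each pruning case to the child matching its attribute value.
    routed: Dict[Hashable, List[Case]] = {v: [] for v in node.children}
    subtree_errors = 0
    for x, y in pruning_set:
        v = x[node.feature]
        if v in routed:
            routed[v].append((x, y))
        elif y != node.label:
            subtree_errors += 1  # unseen value is predicted with the node label

    # Children are pruned first, so the decision here uses their pruned error.
    for v, child in node.children.items():
        subtree_errors += reduced_error_prune(child, routed[v])

    # Assumption: ties are resolved in favour of the smaller tree.
    if leaf_errors <= subtree_errors:
        node.feature = None
        node.children = {}
        return leaf_errors
    return subtree_errors
```

Calling `reduced_error_prune(root, pruning_set)` modifies `root` in place. The pruning set should be disjoint from the data used to grow the tree, otherwise the error counts say little about generalization.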
Pages: 476-491
Page count: 16