A comparative analysis of methods for pruning decision trees

Cited by: 306
Authors
Esposito, F
Malerba, D
Semeraro, G
Affiliation
[1] Dipartimento di Informatica, Università degli Studi di Bari, 70126 Bari
Keywords
decision trees; top-down induction of decision trees; simplification of decision trees; pruning and grafting operators; optimal pruning; comparative studies;
DOI
10.1109/34.589207
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, we address the problem of retrospectively pruning decision trees induced from data, according to a top-down approach. This problem has received considerable attention in the areas of pattern recognition and machine learning, and many distinct methods have been proposed in the literature. We make a comparative study of six well-known pruning methods with the aim of understanding their theoretical foundations, their computational complexity, and the strengths and weaknesses of their formulation. Comments on the characteristics of each method are empirically supported. In particular, extensive experimentation on several data sets leads us to conclusions on the predictive accuracy of simplified trees that are opposite to some drawn in the literature. We attribute this divergence to differences in experimental designs. Finally, we prove and make use of a property of the reduced error pruning method to obtain an objective evaluation of the tendency to overprune/underprune observed in each method.
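As an illustration of the kind of retrospective pruning the abstract discusses, the following is a minimal sketch of reduced error pruning as it is commonly described: working bottom-up, a subtree is replaced by a leaf whenever the leaf makes no more errors on a separate pruning set than the subtree does. It is not taken from the paper. The `Node` class, the function names, and the toy data are all hypothetical, and breaking ties in favor of the smaller tree is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# An example is (attribute values, class label); this layout is an assumption.
Example = Tuple[Dict[str, str], str]


@dataclass
class Node:
    label: str                       # majority class of the training examples at this node
    attribute: Optional[str] = None  # attribute tested here; None marks a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)

    def is_leaf(self) -> bool:
        return self.attribute is None


def classify(node: Node, x: Dict[str, str]) -> str:
    """Follow the tests down the tree; fall back to the node label on unseen values."""
    while not node.is_leaf():
        child = node.children.get(x.get(node.attribute))
        if child is None:
            break
        node = child
    return node.label


def errors(node: Node, examples: List[Example]) -> int:
    """Number of examples the (sub)tree misclassifies."""
    return sum(1 for x, y in examples if classify(node, x) != y)


def reduced_error_prune(node: Node, pruning_set: List[Example]) -> int:
    """Prune in place, bottom-up; return the pruning-set errors of the resulting subtree."""
    leaf_errors = sum(1 for _, y in pruning_set if y != node.label)
    if node.is_leaf():
        return leaf_errors

    subtree_errors = 0
    for value, child in node.children.items():
        subset = [(x, y) for x, y in pruning_set if x.get(node.attribute) == value]
        subtree_errors += reduced_error_prune(child, subset)
    # Examples with a value that has no branch are classified by this node's label.
    unseen = [(x, y) for x, y in pruning_set if x.get(node.attribute) not in node.children]
    subtree_errors += sum(1 for _, y in unseen if y != node.label)

    # Replace the subtree by a leaf if that does not increase the error (ties prune).
    if leaf_errors <= subtree_errors:
        node.attribute = None
        node.children = {}
        return leaf_errors
    return subtree_errors


# Toy tree: the "windy" test under "sunny" overfits and should be pruned away.
tree = Node("yes", "outlook", {
    "sunny": Node("yes", "windy", {"true": Node("no"), "false": Node("yes")}),
    "rainy": Node("no"),
})
pruning_set: List[Example] = [
    ({"outlook": "sunny", "windy": "true"}, "yes"),
    ({"outlook": "sunny", "windy": "false"}, "yes"),
    ({"outlook": "rainy", "windy": "false"}, "no"),
    ({"outlook": "rainy", "windy": "true"}, "no"),
]
before = errors(tree, pruning_set)
reduced_error_prune(tree, pruning_set)
after = errors(tree, pruning_set)
```

In this toy run the `windy` subtree collapses into a single leaf while the root test on `outlook` is kept, because replacing the root with a leaf would increase the error on the pruning set.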
Pages: 476-491
Page count: 16