An efficient algorithm for optimal pruning of decision trees

被引:22
作者
Almuallim, H
机构
[1] Info. and Comp. Science Department, King Fahd Univ. Petrol. and Minerals, Dhahran 31261
关键词
D O I
10.1016/0004-3702(95)00060-7
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Pruning decision trees is a useful technique for improving the generalization performance in decision tree induction, acid for trading accuracy for simplicity in other applications. In this paper, a new algorithm called OPT-2 for optimal pruning of decision trees is introduced. The algorithm is based on dynamic programming. In its most basic form, the time and space complexities of OPT-2 are both Theta(nC), where n is the number of test nodes in the initial decision tree, and C is the number of leaves in the target (pruned) decision tree. This is an improvement over the recently published OPT algorithm of Bohanec and Bratko (which is the only known algorithm for optimal decision tree pruning) especially in the case of heavy pruning and when the tests of the given decision tree have many outcomes. If so desired, the space required by OPT-2 can further be reduced by a factor of r at the cost of increasing the execution time by a factor that is bounded above by (r + 1)/2 (this is a considerable overestimate, however). From a practical point of view, OPT-2 enjoys considerable flexibility in various aspects, and is easy to implement.
引用
收藏
页码:347 / 362
页数:16
相关论文
共 7 条
[1]  
BOHANEC M, 1994, MACH LEARN, V15, P223, DOI 10.1023/A:1022685808937
[2]  
Breiman L., 1984, Classification and Regression Trees, DOI DOI 10.2307/2530946
[3]   ON KNAPSACKS, PARTITIONS, AND A NEW DYNAMIC-PROGRAMMING TECHNIQUE FOR TREES [J].
JOHNSON, DS ;
NIEMI, KA .
MATHEMATICS OF OPERATIONS RESEARCH, 1983, 8 (01) :1-14
[4]  
Mingers J., 1989, Machine Learning, V4, P227, DOI 10.1023/A:1022604100933
[5]  
MUGGLETON S, 1989, P 6 INT C MACH LEARN
[6]  
Quinlan J. R., 1986, Machine Learning, V1, P81, DOI 10.1023/A:1022643204877
[7]  
QUINLAN JR, 1989, C4 5 PROGRAMS MACH L