Moving towards efficient decision tree construction

Cited by: 51
Authors
Chandra, B. [1]
Varghese, P. Paul [1]
Affiliations
[1] Indian Inst Technol, Dept Math, New Delhi 110016, India
Keywords
Decision trees; Gini Index; Gain Ratio; Split measure; ATTRIBUTES
DOI
10.1016/j.ins.2008.12.006
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Motivated by the desire to construct compact (in terms of expected length to be traversed to reach a decision) decision trees, we propose a new node splitting measure for decision tree construction. We show that the proposed measure is convex and cumulative and utilize this in the construction of decision trees for classification. Results obtained from several datasets from the UCI repository show that the proposed measure results in decision trees that are more compact with classification accuracy that is comparable to that obtained using popular node splitting measures such as Gain Ratio and the Gini Index. © 2008 Published by Elsevier Inc.
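The abstract compares the proposed measure against Gain Ratio and the Gini Index. This record does not give the formula for the proposed measure, so the sketch below illustrates only those two baseline measures in their standard textbook form. The function names and the toy partitions are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity of one node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class labels at one node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini_index(partitions):
    """Size-weighted Gini impurity of a candidate split (lower is better)."""
    total = sum(len(p) for p in partitions)
    return sum(len(p) / total * gini(p) for p in partitions)


def gain_ratio(partitions):
    """Information gain divided by split information (higher is better)."""
    parts = [p for p in partitions if p]  # ignore empty branches
    parent = [y for p in parts for y in p]
    total = len(parent)
    weights = [len(p) / total for p in parts]
    info_gain = entropy(parent) - sum(w * entropy(p) for w, p in zip(weights, parts))
    split_info = -sum(w * log2(w) for w in weights)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy splits: one separates the classes perfectly, one not at all.
pure_split = [["a", "a"], ["b", "b"]]
mixed_split = [["a", "b"], ["a", "b"]]
print(gini_index(pure_split), gain_ratio(pure_split))
print(gini_index(mixed_split), gain_ratio(mixed_split))
```

A tree builder would evaluate each candidate split with one of these measures and keep the best-scoring one; the paper's contribution is a different scoring function used in that same role.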
Pages: 1059-1069
Page count: 11