Improved C4.5 Algorithm for the Analysis of Sales

Cited by: 4
Authors
Cao, Rong [1]
Xu, Lizhen [1]
Affiliations
[1] Southeast Univ, Sch Engn & Comp Sci, Nanjing 211189, Peoples R China
Source
2009 SIXTH WEB INFORMATION SYSTEMS AND APPLICATIONS CONFERENCE, PROCEEDINGS | 2009
Keywords
decision tree; algorithm C4.5; the rate of information gain; large data sets;
DOI
10.1109/WISA.2009.36
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Subject classification code
0812 ;
Abstract
Decision trees are an important tool in data mining and inductive learning, commonly used to build classifiers and prediction models. C4.5 is one of the classic classification algorithms in data mining, but its efficiency is very low on large-scale computations. In this paper, the rule used by C4.5 is improved by applying L'Hospital's rule, which simplifies the calculation and improves the efficiency of the algorithm. The same principle is applied when calculating the information gain ratio, which improves the algorithm further. An application presented at the end of the paper shows that the improved algorithm is efficient and better suited to large data sets, with greatly improved efficiency in practical use.
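For orientation, the quantity the abstract refers to, the information gain ratio, can be sketched as below. This is a minimal illustration of the standard C4.5 definition for a categorical attribute. It is not the paper's improved method, whose L'Hospital-based simplification is not spelled out in this record. The function names `entropy` and `gain_ratio` are assumptions chosen for the example.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Standard C4.5 gain ratio of one categorical attribute.

    values: attribute value of each sample
    labels: class label of each sample (same length as values)
    """
    total = len(labels)

    # Group the class labels by attribute value.
    partitions = {}
    for v, y in zip(values, labels):
        partitions.setdefault(v, []).append(y)

    # Information gain = entropy before the split minus weighted entropy after it.
    conditional = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - conditional

    # Split information is the entropy of the attribute values themselves.
    split_info = entropy(values)

    # An attribute with a single value cannot split the data; return 0 for it.
    return gain / split_info if split_info > 0 else 0.0
```

Every candidate split evaluates several logarithms over the data, which is the kind of repeated computation the abstract describes as costly on large data sets.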
Pages: 173-176
Page count: 4