Stream Classification Algorithm Based on Decision Tree

Cited by: 3
Authors
Guo, Jinlin [1 ]
Wang, Haoran [1 ]
Li, Xinwei [1 ]
Zhang, Li [2 ]
Affiliations
[1] Natl Univ Def Technol, Coll Syst Engn, Changsha 410000, Peoples R China
[2] Northeastern Univ, Software Coll, Shenyang 110000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Decision trees;
DOI
10.1155/2021/3103053
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
With the rise of fields such as e-commerce platforms, large volumes of stream data have emerged. Incomplete labeling and concept drift in these data pose a major challenge to existing stream data classification methods. To address this, a dynamic stream data classification algorithm is proposed. For the incomplete labeling problem, the method introduces randomization and an iterative strategy on top of the very fast decision tree (VFDT) algorithm to design an iterative integration algorithm: each model's classification result serves as the input to the next model, and new data are classified by a voting mechanism. At the same time, a window mechanism stores the data and computes the distribution characteristics of the data in the window; the size of the sliding window is then adjusted according to the computed result and the predicted amount of data. Experiments show the superiority of the algorithm in classification accuracy. The aim of the study is to compare different algorithms to evaluate whether the classification model adapts to the current data environment.
Pages: 11