Research on dynamic cost-sensitive decision tree for mining uncertain data based on the genetic algorithm

Author
Huang, Yuwen [1, 2]
Affiliations
[1] Department of Computer and Information Engineering, Heze University, Heze, Shandong
[2] Key Laboratory of Computer Information Processing, Heze University, Heze, Shandong
Source
International Journal of Database Theory and Application | 2014 / Vol. 7 / No. 5
Keywords
Decision tree; Dynamic cost-sensitive; Genetic algorithm; Uncertain data
DOI
10.14257/ijdta.2014.7.5.15
Abstract
Existing classifiers for uncertain data do not consider dynamic cost. This paper therefore proposes GDCDTU, a dynamic cost-sensitive decision tree classification approach for uncertain data based on the genetic algorithm, which overcomes the limitations of a static cost and automatically searches for a suitable cost space for each sub-dataset. First, the paper presents the idea of dynamic cost-sensitive learning and handles the continuous and discrete attributes of uncertain data using probabilistic cardinality. Second, it gives the selection method for splitting attributes and the construction process of the cost-sensitive decision tree; the interval number that describes the dynamic cost is encoded by its centre and radius. Finally, the dynamic cost-sensitive decision tree for uncertain data is constructed, using the genetic algorithm to search for the optimal misclassification cost; the optimum cost is obtained through crossover (hybridization), mutation, and selection. Experiments on both artificial and real data sets show that, compared with other decision tree classification algorithms for uncertain data, GDCDTU achieves higher classification accuracy and performance and a lower total expenditure. © 2014 SERSC.
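The abstract mentions that an interval-valued cost is encoded by its centre and radius and then searched with hybridization (crossover), mutation, and selection. The sketch below is only an illustration of that general idea, not the paper's GDCDTU procedure: the function names, the arithmetic crossover, the Gaussian mutation, the truncation selection, and the `toy_expense` fitness are all assumptions made for this example. In particular, `toy_expense` is a hypothetical stand-in for the total expenditure of a tree built with a given cost, which the abstract does not specify.

```python
import random
from typing import Callable, List, Tuple

Interval = Tuple[float, float]  # (lower, upper) bounds of a cost interval
Gene = Tuple[float, float]      # (centre, radius) encoding of that interval


def encode(interval: Interval) -> Gene:
    """Encode [lower, upper] as (centre, radius)."""
    lower, upper = interval
    return ((lower + upper) / 2.0, (upper - lower) / 2.0)


def decode(gene: Gene) -> Interval:
    """Recover [lower, upper] from (centre, radius)."""
    centre, radius = gene
    return (centre - radius, centre + radius)


def crossover(a: Gene, b: Gene, rng: random.Random) -> Gene:
    # Assumed operator: arithmetic crossover with a random blend weight.
    w = rng.random()
    return (w * a[0] + (1 - w) * b[0], w * a[1] + (1 - w) * b[1])


def mutate(gene: Gene, rng: random.Random, sigma: float = 0.5) -> Gene:
    # Assumed operator: Gaussian perturbation; the radius stays non-negative.
    return (gene[0] + rng.gauss(0, sigma), abs(gene[1] + rng.gauss(0, sigma)))


def evolve(fitness: Callable[[Gene], float], population: List[Gene],
           generations: int = 30, seed: int = 0) -> Gene:
    """Minimise `fitness`; needs at least two genes in `population`."""
    rng = random.Random(seed)
    pop = list(population)
    for _ in range(generations):
        # Selection: keep the cheaper half unchanged (elitism).
        pop.sort(key=fitness)
        parents = pop[: max(2, len(pop) // 2)]
        children = []
        while len(parents) + len(children) < len(pop):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        pop = parents + children
    return min(pop, key=fitness)


def toy_expense(gene: Gene) -> float:
    # Hypothetical fitness: squared distance from the interval [2, 6].
    centre, radius = gene
    return (centre - 4.0) ** 2 + (radius - 2.0) ** 2


initial = [encode(i) for i in [(0.0, 1.0), (5.0, 9.0), (1.0, 10.0), (3.0, 4.0)]]
best = evolve(toy_expense, initial)
print(decode(best))
```

Because the better half of each generation is carried over unchanged, the best fitness in this sketch never gets worse from one generation to the next. A real cost-sensitive setting would replace `toy_expense` with an evaluation of the tree induced under the decoded cost interval.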