A cost sensitive decision tree algorithm with two adaptive mechanisms

Cited by: 30
Authors
Li, Xiangju [1]
Zhao, Hong [1]
Zhu, William [1]
Affiliations
[1] Minnan Normal Univ, Lab Granular Comp, Zhangzhou 363000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptive mechanisms; Cost sensitive; Decision tree; Granular computing; Rough sets; Attribute reduction; Feature selection; Classification; Approximations
DOI
10.1016/j.knosys.2015.08.012
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Decision trees have been widely used in data mining and machine learning as a comprehensible knowledge representation. Minimal cost decision tree construction plays a crucial role in cost sensitive learning. Recently, many algorithms have been developed to tackle this problem. These algorithms choose an appropriate cut point of a numeric attribute by evaluating all possible cut points, and assign a node by testing all attributes. Therefore, the efficiency of these algorithms on large data sets is often unsatisfactory. To solve this issue, in this paper we propose a cost sensitive decision tree algorithm with two adaptive mechanisms to learn cost sensitive decision trees from training data sets, based on the C4.5 algorithm. The two adaptive mechanisms play an important role in cost sensitive decision tree construction. The first mechanism, the adaptive selecting the cut point (ASCP) mechanism, selects the cut point adaptively to build a classifier rather than calculating each possible cut point of an attribute. It significantly improves the efficiency of evaluating numeric attributes for cut point selection. The second mechanism, the adaptive removing attribute (ARA) mechanism, removes some redundant attributes in the process of selecting a node. The effectiveness of the proposed algorithm is demonstrated on fourteen UCI data sets with test costs following a representative Normal distribution. Compared with the CS-C4.5 algorithm, the proposed algorithm significantly increases efficiency. © 2015 Elsevier B.V. All rights reserved.
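To make the efficiency concern in the abstract concrete, here is a minimal sketch of the exhaustive baseline it describes: scoring every candidate cut point of one numeric attribute. It is not the paper's ASCP mechanism, and it omits the test cost terms of CS-C4.5. It uses plain information gain as an assumed stand-in criterion, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut_point_exhaustive(values, labels):
    """Score every candidate cut point of one numeric attribute.

    Candidates are midpoints between consecutive distinct sorted values.
    Returns (cut_point, information_gain, candidates_evaluated).
    Assumption: plain information gain, with no test cost weighting.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy([y for _, y in pairs])
    best_cut, best_gain, evaluated = None, -1.0, 0
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # equal values cannot be separated by a cut
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = (base
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        evaluated += 1
        if gain > best_gain:
            best_cut, best_gain = (lo + hi) / 2, gain
    return best_cut, best_gain, evaluated


if __name__ == "__main__":
    cut, gain, evaluated = best_cut_point_exhaustive(
        [1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]
    )
    print(cut, gain, evaluated)
```

The number of candidates grows with the number of distinct attribute values, and each candidate is rescored from scratch here, which is the kind of per-attribute cost an adaptive cut point selection would aim to avoid.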
Pages: 24-33
Number of pages: 10