Building a cost-constrained decision tree with multiple condition attributes

Cited by: 13
Authors
Chen, Yen-Liang [1]
Wu, Chia-Chi [1]
Tang, Kwei [2]
Affiliations
[1] Natl Cent Univ, Dept Informat Management, Jhongli 320, Taiwan
[2] Purdue Univ, Krannert Sch Management, W Lafayette, IN 47907 USA
Keywords
Data mining; Decision analysis; Cost-sensitive learning; Classification; Decision tree; SENSITIVE CLASSIFICATION; KNOWLEDGE
DOI
10.1016/j.ins.2008.11.032
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Costs are often an important part of the classification process. Cost factors have been taken into consideration in many previous studies regarding decision tree models. In this study, we also consider a cost-sensitive decision tree construction problem. We assume that there are test costs that must be paid to obtain the values of the decision attribute and that a record must be classified without exceeding the spending cost threshold. Unlike previous studies, however, in which records were classified with only a single condition attribute, in this study, we are able to simultaneously classify records with multiple condition attributes. An algorithm is developed to build a cost-constrained decision tree, which allows us to simultaneously classify multiple condition attributes. The experimental results show that our algorithm satisfactorily handles data with multiple condition attributes under different cost constraints. © 2008 Elsevier Inc. All rights reserved.
Pages: 967-979
Number of pages: 13