Variable precision rough set based decision tree classifier

Cited by: 10
Authors
Yi Weiguo [1 ,2 ]
Lu Mingyu [1 ]
Liu Zhi [1 ]
Affiliations
[1] Dalian Maritime Univ, Dalian, Peoples R China
[2] Dalian Jiaotong Univ, Software Inst, Dalian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Decision tree; variable precision rough set; weighted roughness; complexity; match
DOI
10.3233/IFS-2012-0496
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper analyzes existing decision tree classification algorithms and finds that those based on variable precision rough set (VPRS) achieve better classification accuracy and can tolerate noisy data. However, when a decision tree is constructed with VPRS, these algorithms show two shortcomings: attribute selection is difficult, and the resulting classification accuracy is still not high. This paper therefore proposes a new variable precision rough set based decision tree algorithm (IVPRSDT). The algorithm uses a new attribute-selection criterion, combining weighted roughness and complexity, that jointly considers classification accuracy and the number of attribute values. In addition, support and confidence are introduced into the stopping conditions of the corresponding nodes, which improves the algorithm's generalization ability. To reduce the impact of noisy data and missing values, IVPRSDT predicts labels with a match-based method. Comparative experiments on twelve data sets from the UCI Machine Learning Repository show that IVPRSDT effectively improves classification accuracy.
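To make the selection criterion more concrete, the Python sketch below illustrates one plausible reading of it: equivalence blocks induced by a candidate attribute are compared with each decision class through beta-lower and beta-upper approximations (a standard VPRS formulation), the roughness of each class is weighted by its relative size, and a penalty proportional to the attribute's number of distinct values stands in for the complexity term. The threshold beta, the trade-off constant alpha, and all function names are illustrative assumptions, not the definitions used in IVPRSDT.

from collections import defaultdict

# Hedged sketch of a VPRS-style attribute-scoring step. The approximation rule,
# the class-size weighting, and the complexity penalty are illustrative choices,
# not the authors' exact IVPRSDT definitions.

def partition(rows, attr):
    """Equivalence blocks (sets of row indices) induced by one attribute."""
    groups = defaultdict(set)
    for i, row in enumerate(rows):
        groups[row[attr]].add(i)
    return list(groups.values())

def beta_approximations(blocks, concept, beta):
    """beta-lower and beta-upper approximations of a decision class."""
    lower, upper = set(), set()
    for block in blocks:
        inclusion = len(block & concept) / len(block)
        if inclusion >= 1.0 - beta:   # block lies (almost) entirely inside the concept
            lower |= block
        if inclusion > beta:          # block overlaps the concept significantly
            upper |= block
    return lower, upper

def weighted_roughness(rows, attr, decision, beta=0.2):
    """Roughness of each decision class, weighted by the class's relative size."""
    blocks = partition(rows, attr)
    n = len(rows)
    score = 0.0
    for concept in partition(rows, decision):
        lower, upper = beta_approximations(blocks, concept, beta)
        if upper:
            roughness = 1.0 - len(lower) / len(upper)
        else:
            roughness = 1.0
        score += (len(concept) / n) * roughness
    return score

def select_attribute(rows, attrs, decision, beta=0.2, alpha=0.1):
    """Pick the attribute minimizing weighted roughness plus an assumed
    complexity penalty proportional to its number of distinct values."""
    def criterion(attr):
        complexity = len({row[attr] for row in rows})
        return weighted_roughness(rows, attr, decision, beta) + alpha * complexity
    return min(attrs, key=criterion)

# Toy usage: "windy" separates the decision classes exactly, so it is chosen.
data = [
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "sunny", "windy": "no",  "play": "yes"},
    {"outlook": "rain",  "windy": "yes", "play": "no"},
    {"outlook": "rain",  "windy": "no",  "play": "yes"},
]
print(select_attribute(data, ["outlook", "windy"], "play"))  # -> windy

Penalizing attributes with many distinct values counteracts the bias of purity-style criteria toward many-valued attributes, which is the stated motivation for pairing weighted roughness with a complexity term in the selection standard.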
Pages: 61-70
Page count: 10