Half-global discretization algorithm based on rough set theory

Citations: 0
Authors
Tan Xu [1]
Chen Yingwu [1]
Affiliation
[1] Natl Univ Def Technol, Sch Informat Syst & Management, Changsha 410073, Hunan, Peoples R China
Keywords
half-global discretization; continuous condition attributes; correlation coefficient; rough entropy; stability; rough set theory;
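DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract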
How to extract knowledge from a decision table using rough set theory is widely studied. A newer problem is how to discretize a decision table that has continuous attributes. To obtain more reasonable discretization results, a discretization algorithm is proposed that arranges half-global discretization according to the correlation coefficient of each continuous attribute while taking the particular characteristics of rough set theory into account. The heuristic information combines stability with rough entropy. For stability, the possibility of classifying objects that belong to a certain sub-interval of a given attribute into neighboring sub-intervals is minimized, which allows rational discrete intervals to be determined. Rough entropy is used to decide the optimal cut-points while guaranteeing that the decision table remains consistent after discretization. The idea of the algorithm is illustrated on the Iris data, and experiments then compare the outcomes on four discretized datasets, computed by the proposed algorithm and by four other typical discretization algorithms. Classification rules are then deduced and summarized through rough set based classifiers. The results show that the proposed discretization algorithm is able to generate optimal classification accuracy while minimizing the number of discrete intervals. Its advantage is most apparent for decision tables with a large number of attributes.
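The consistency requirement mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it only checks whether a given set of cut-points keeps a decision table consistent, meaning that no two objects share the same discretized condition values while having different decisions. The function names `discretize` and `is_consistent`, the toy rows, and the cut-points are all assumptions made for illustration.

```python
from bisect import bisect_right


def discretize(value, cuts):
    """Return the index of the sub-interval that `value` falls into.

    `cuts` is a sorted list of cut-points; k cut-points give k + 1 intervals.
    """
    return bisect_right(cuts, value)


def is_consistent(rows, decisions, cuts_per_attr):
    """Check that equal discretized condition values imply equal decisions."""
    seen = {}
    for row, decision in zip(rows, decisions):
        key = tuple(discretize(v, cuts) for v, cuts in zip(row, cuts_per_attr))
        # Remember the first decision seen for this key; any later
        # disagreement makes the discretized table inconsistent.
        if seen.setdefault(key, decision) != decision:
            return False
    return True


# Hypothetical toy table: two continuous condition attributes, three classes.
rows = [(1.4, 0.2), (4.7, 1.4), (6.0, 2.5)]
decisions = ["a", "b", "c"]

fine = [[2.5, 5.0], [0.8, 1.8]]  # assumed cut-points, one list per attribute
coarse = [[2.5], []]             # fewer cut-points merge classes "b" and "c"

print(is_consistent(rows, decisions, fine))
print(is_consistent(rows, decisions, coarse))
```

In this toy table the finer cut-points separate all three objects, while the coarser ones place the second and third objects in the same cell despite their different decisions. A cut-point selection heuristic such as the one the abstract describes would need to reject the coarser set.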
Pages: 339-347
Page count: 9