Half-global discretization algorithm based on rough set theory

Cited by: 0
Authors
Tan Xu; Chen Yingwu
Institution
School of Information Systems Management, National Univ. of Defense Technology, Changsha 410073, P. R. China
Keywords
half-global discretization; continuous condition attributes; correlation coefficient; rough entropy; stability; rough set theory
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
How to extract knowledge from a decision table using rough set theory is being widely studied. A key problem is how to discretize a decision table that has continuous attributes. To obtain more reasonable discretization results, a discretization algorithm is proposed that performs half-global discretization based on the correlation coefficient of each continuous attribute while taking the particular characteristics of rough set theory into account. The heuristic information combines stability with rough entropy. For stability, the algorithm minimizes the possibility that objects belonging to one sub-interval of a given attribute are classified into neighboring sub-intervals, so that rational discrete intervals can be determined. Rough entropy is used to decide the optimal cut-points while guaranteeing that the decision table remains consistent after discretization. The idea of the algorithm is illustrated on the Iris data, and experiments then compare the results on four discretized datasets produced by the proposed algorithm and by four other typical discretization algorithms. Classification rules are subsequently derived and summarized with rough-set-based classifiers. The results show that the proposed discretization algorithm achieves optimal classification accuracy while minimizing the number of discrete intervals. Its advantage is most pronounced for decision tables with a large number of attributes.
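The consistency requirement mentioned in the abstract can be illustrated with a minimal sketch. This is not the proposed algorithm: it only checks whether a given set of cut-points leaves a decision table consistent, that is, whether any two objects with identical discretized condition values also share the same decision. The function names, cut-points, and toy data are assumptions made for illustration.

```python
from bisect import bisect_right


def discretize(value, cuts):
    """Return the index of the interval containing value, given sorted cut-points."""
    return bisect_right(cuts, value)


def is_consistent(rows, decisions, cuts_per_attr):
    """True if objects with equal discretized condition values never differ in decision."""
    seen = {}
    for row, decision in zip(rows, decisions):
        key = tuple(discretize(v, cuts) for v, cuts in zip(row, cuts_per_attr))
        # setdefault stores the first decision seen for this key and returns it afterwards
        if seen.setdefault(key, decision) != decision:
            return False
    return True


# Hypothetical toy table: two continuous condition attributes, one decision attribute.
rows = [(1.4, 0.2), (4.7, 1.4), (5.1, 1.9)]
decisions = ["a", "b", "c"]

fine_cuts = [[2.5, 4.9], [0.8]]   # separates all three objects
coarse_cuts = [[2.5], [0.8]]      # merges the last two objects into one cell
```

With `fine_cuts` every object falls into its own cell, so the table stays consistent; with `coarse_cuts` the last two objects share a cell but carry different decisions, so the check fails. A discretizer that guarantees consistency would have to reject or refine the second set of cut-points.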
Pages: 339-347
Number of pages: 9
Related papers
3 items in total
[1] Huan Liu, Farhad Hussain, Chew Lim Tan, Manoranjan Dash. Discretization: An Enabling Technique [J]. Data Mining and Knowledge Discovery, 2002 (4).
[2] Chmielewski M R, Grzymala-Busse J W. Global discretization of continuous attributes as preprocessing for machine learning [J]. International Journal of Approximate Reasoning, 1996, 15 (4): 319-331.
[3] Emanuel Parzen. On Estimation of a Probability Density Function and Mode [J]. The Annals of Mathematical Statistics, 1962 (3).