Indiscernibility degree of objects for evaluating simplicity of knowledge in the clustering procedure

Cited by: 2
Authors
Hirano, S [1]
Tsumoto, S [1]
Affiliation
[1] Shimane Med Univ, Sch Med, Dept Med Informat, Izumo, Shimane 6938501, Japan
Source
2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2001
DOI
10.1109/ICDM.2001.989521
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new rough-set-based clustering method that allows the simplicity of classification knowledge to be evaluated during the clustering procedure. The method iteratively refines a set of equivalence relations so that they become a simpler set of relations giving an adequately coarse classification of the objects. At each iteration, the importance of each equivalence relation is evaluated using a newly introduced measure, the indiscernibility degree. The indiscernibility degree of two objects is defined as the ratio of equivalence relations that classify the two objects into the same equivalence class. If an equivalence relation can discern two objects that have a high indiscernibility degree, it is considered to classify too finely and is modified to regard them as indiscernible. The refinement is repeated while the threshold on the indiscernibility degree is decreased, until simple clusters are obtained. Experimental results on artificial data showed that iterative refinement of the equivalence relations led to the successful generation of coarse clusters that can be represented by simple knowledge.
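The measure described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: each equivalence relation is represented as a dict mapping an object to a class label, the "ratio" is read as the fraction of all relations that put the two objects in the same class, and a relation is "modified" by merging the two classes involved. The names `indiscernibility_degree`, `refine_once`, and the toy data are hypothetical, and the paper's actual modification rule and threshold schedule may differ.

```python
from itertools import combinations


def indiscernibility_degree(relations, x, y):
    """Fraction of relations that put x and y in the same equivalence class."""
    same = sum(1 for r in relations if r[x] == r[y])
    return same / len(relations)


def refine_once(relations, threshold):
    """One refinement pass (assumed rule, not necessarily the paper's).

    Every pair whose degree is at least `threshold` is treated as
    indiscernible: any relation that still separates the pair has the
    two classes merged.
    """
    objects = sorted(relations[0])
    # Degrees are computed once, from the relations as they were before the pass.
    close_pairs = [
        (x, y)
        for x, y in combinations(objects, 2)
        if indiscernibility_degree(relations, x, y) >= threshold
    ]
    refined = []
    for r in relations:
        r = dict(r)  # work on a copy so the input is left untouched
        for x, y in close_pairs:
            if r[x] != r[y]:
                old, new = r[y], r[x]
                for obj in r:
                    if r[obj] == old:
                        r[obj] = new
        refined.append(r)
    return refined


# Toy data (hypothetical): three relations over four objects.
relations = [
    {"a": 0, "b": 0, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 1, "d": 2},  # separates c and d
    {"a": 0, "b": 1, "c": 2, "d": 2},  # separates a and b
]

degree_ab = indiscernibility_degree(relations, "a", "b")
refined = refine_once(relations, threshold=0.6)
```

In this toy case the pairs (a, b) and (c, d) are each grouped together by two of the three relations, so a threshold of 0.6 makes the remaining relation merge them, and all three refined relations induce the same coarse partition {a, b}, {c, d}. Repeating the pass with a gradually lower threshold would correspond to the iterative refinement the abstract describes.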
Pages: 211-217
Number of pages: 7