Clustering Categorical Data Using Hierarchies (CLUCDUH)
Cited by: 0
Author:
Silahtaroglu, Gökhan [1]
Affiliation:
[1] Beykent University, Department of Mathematics and Computing, Istanbul 34900, Turkey
Source:
World Academy of Science, Engineering and Technology, 2009, Vol. 56
Keywords:
Clustering;
Gini;
Pruning;
Split;
Tree;
DOI:
Not available
Abstract:
Clustering large populations is an important problem, particularly when the data contain noise and clusters of different shapes. A good clustering algorithm or approach should be efficient enough to detect clusters sensitively. Besides space complexity, time complexity also gains importance as the data size grows. Using hierarchies, we developed a new algorithm that splits attributes according to the values they take, choosing the dimension for splitting so as to divide the database into parts that are as nearly equal as possible. At each node we calculate certain descriptive statistical features of the data residing there, and by pruning we generate the natural clusters with a complexity of O(n).
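
The splitting idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so everything below is an assumption: the balance of a split is measured with the Gini index of an attribute's value frequencies (Gini appears only as a keyword in the record), the function names `gini`, `choose_split_attribute`, `split`, and `build_tree` are hypothetical, and the pruning step and per-node statistics mentioned in the abstract are reduced to a simple record count. It should not be read as the published CLUCDUH algorithm.

```python
from collections import Counter, defaultdict


def gini(values):
    """Gini index of categorical values: 1 - sum(p_i^2).

    Higher values mean the categories are more evenly represented.
    """
    n = len(values)
    counts = Counter(values)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def choose_split_attribute(records, attributes):
    """Assumed criterion: pick the attribute whose values divide
    the records most evenly (largest Gini index)."""
    return max(attributes, key=lambda a: gini([r[a] for r in records]))


def split(records, attribute):
    """Group records by the value they take on the given attribute."""
    groups = defaultdict(list)
    for r in records:
        groups[r[attribute]].append(r)
    return dict(groups)


def build_tree(records, attributes, min_size=2):
    """Recursively split records into a hierarchy of nodes.

    Each node stores only its record count; a fuller version would
    store the descriptive statistics used for pruning.
    """
    node = {"size": len(records), "children": {}}
    if len(records) < min_size or not attributes:
        return node
    best = choose_split_attribute(records, attributes)
    if gini([r[best] for r in records]) == 0.0:
        # All records share one value on every remaining attribute.
        return node
    node["attribute"] = best
    remaining = [a for a in attributes if a != best]
    for value, subset in split(records, best).items():
        node["children"][value] = build_tree(subset, remaining, min_size)
    return node
```

Calling `build_tree(records, ["color", "shape"])` on a list of dictionaries keyed by those attribute names returns a nested dictionary whose leaves are the candidate clusters. Choosing the most balanced attribute at each level keeps the tree shallow, which is in the spirit of the near-equal partitioning the abstract describes.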