Clustering Categorical Data Using Hierarchies (CLUCDUH)

Cited by: 0
Author
Silahtaroglu, Gökhan [1]
Affiliation
[1] Beykent University, Department of Mathematics and Computing, Istanbul 34900, Turkey
Source
World Academy of Science, Engineering and Technology | 2009 / Vol. 56
Keywords
Clustering; Gini; Pruning; Split; Tree
DOI
Not available
Abstract
Clustering large populations is an important problem when the data contain noise and different shapes. A good clustering algorithm or approach should be efficient enough to detect clusters sensitively. Besides space complexity, time complexity also gains importance as the data size grows. Using hierarchies, we developed a new algorithm that splits attributes according to the values they take and chooses the splitting dimension so as to divide the database into parts that are as nearly equal as possible. At each node we calculate certain descriptive statistical features of the data residing there, and by pruning we generate the natural clusters with a complexity of O(n).
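The abstract does not spell out the splitting rule, so the following is only a minimal sketch of one way an "equal parts" split could be chosen, not the paper's actual procedure. It assumes (based on the keyword "Gini") that each categorical attribute is scored by the Gini index of its value frequencies and that the attribute with the highest score, i.e. the most even partition, is picked. The names `gini`, `choose_split_attribute`, and `split`, and the toy rows, are hypothetical.

```python
from collections import Counter

def gini(values):
    """Gini index 1 - sum(p_i^2) of the value frequencies in `values`."""
    n = len(values)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())

def choose_split_attribute(rows, attributes):
    """Assumed criterion: pick the attribute whose values divide `rows` most evenly."""
    return max(attributes, key=lambda a: gini([r[a] for r in rows]))

def split(rows, attribute):
    """Partition rows into one child node per distinct value of `attribute`."""
    children = {}
    for r in rows:
        children.setdefault(r[attribute], []).append(r)
    return children

# Toy categorical data (hypothetical): "size" splits 2/2, "color" splits 3/1.
rows = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "L"},
    {"color": "red", "size": "S"},
    {"color": "blue", "size": "L"},
]
best = choose_split_attribute(rows, ["color", "size"])
children = split(rows, best)
```

Applying `choose_split_attribute` and `split` recursively to each child would build the hierarchy; the per-node descriptive statistics and the pruning step mentioned in the abstract are not specified there and are left out of this sketch.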
Pages: 334-339
Related papers
50 in total
  • [31] A Support Based Initialization Algorithm for Categorical Data Clustering
    Kumar, Ajay
    Kumar, Shishir
    [J]. JOURNAL OF INFORMATION TECHNOLOGY RESEARCH, 2018, 11 (02) : 53 - 67
  • [32] Clustering categorical data: an approach based on dynamical systems
    Gibson, D
    Kleinberg, J
    Raghavan, P
    [J]. VLDB JOURNAL, 2000, 8 (3-4) : 222 - 236
  • [33] Multiobjective clustering algorithm with fuzzy centroids for categorical data
    Zhou Z.
    Zhu S.
    Zhang D.
    [J]. 1600, Science Press (53): : 2594 - 2606
  • [34] A Framework for Clustering Massive Text and Categorical Data Streams
    Aggarwal, Charu C.
    Yu, Philip S.
    [J]. PROCEEDINGS OF THE SIXTH SIAM INTERNATIONAL CONFERENCE ON DATA MINING, 2006, : 479 - 483
  • [35] Learning-Based Dissimilarity for Clustering Categorical Data
    Rivera Rios, Edgar Jacob
    Angel Medina-Perez, Miguel
    Lazo-Cortes, Manuel S.
    Monroy, Raul
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (08):
  • [36] A fair-multicluster approach to clustering of categorical data
    Santos-Mangudo, Carlos
    Heras, Antonio J.
    [J]. CENTRAL EUROPEAN JOURNAL OF OPERATIONS RESEARCH, 2023, 31 (02) : 583 - 604
  • [37] Categorical Data Clustering with Automatic Selection of Cluster Number
    Liao, Hai-Yong
    Ng, Michael K.
    [J]. FUZZY INFORMATION AND ENGINEERING, 2009, 1 (01) : 5 - 25
  • [38] A k-populations algorithm for clustering categorical data
    Kim, DW
    Lee, K
    Lee, D
    Lee, KH
    [J]. PATTERN RECOGNITION, 2005, 38 (07) : 1131 - 1134
  • [39] Categorical data clustering: What similarity measure to recommend?
    dos Santos, Tiago R. L.
    Zarate, Luis E.
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (03) : 1247 - 1260
  • [40] Clustering High-Dimensional Noisy Categorical Data
    Tian, Zhiyi
    Xu, Jiaming
    Tang, Jen
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2024, 119 (548) : 3008 - 3019