A k-populations algorithm for clustering categorical data

Cited by: 22
Authors
Kim, DW [1]
Lee, K
Lee, D
Lee, KH
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept BioSyst, Taejon 305701, South Korea
[2] Korea Adv Inst Sci & Technol, Adv Informat Technol Res Ctr, Taejon 305701, South Korea
[3] Korea Adv Inst Sci & Technol, Dept Elect Engn & Comp Sci, Taejon 305701, South Korea
Keywords
clustering; categorical data; hierarchical algorithm; k-modes algorithm; fuzzy k-modes algorithm
DOI
10.1016/j.patcog.2004.11.017
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, the conventional k-modes-type algorithms for clustering categorical data are extended by representing the clusters of categorical data with k-populations instead of the hard-type centroids used in the conventional algorithms. Use of a population-based centroid representation makes it possible to preserve the uncertainty inherent in data sets as long as possible before actual decisions are made. The k-populations algorithm was found to give markedly better clustering results through various experiments. (c) 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Pages: 1131-1134
Page count: 4
Related papers
50 in total
  • [41] Fuzzy clustering of categorical data using fuzzy centroids
    Kim, DW
    Lee, KH
    Lee, D
    PATTERN RECOGNITION LETTERS, 2004, 25 (11) : 1263 - 1271
  • [42] Efficiency Based Categorical Data Clustering
    Kalaivani, K.
    Raghavendra, A. P. V.
    2012 IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMPUTING RESEARCH (ICCIC), 2012, : 550 - 553
  • [43] Space Structure and Clustering of Categorical Data
    Qian, Yuhua
    Li, Feijiang
    Liang, Jiye
    Liu, Bing
    Dang, Chuangyin
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2016, 27 (10) : 2047 - 2059
  • [44] Clustering Categorical Data Based on Representatives
    Aranganayagi, S.
    Thangavel, K.
    THIRD 2008 INTERNATIONAL CONFERENCE ON CONVERGENCE AND HYBRID INFORMATION TECHNOLOGY, VOL 1, PROCEEDINGS, 2008, : 599 - +
  • [45] A Categorical Data Clustering Algorithm and Its Efficient Parallel Implementation
    Ding, Xiangwu
    Tan, Jia
    Wang, Mei
    PROCEEDINGS OF 2016 5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), 2016, : 224 - 228
  • [46] A fuzzy SV-k-modes algorithm for clustering categorical data with set-valued attributes
    Cao, Fuyuan
    Huang, Joshua Zhexue
    Liang, Jiye
    APPLIED MATHEMATICS AND COMPUTATION, 2017, 295 : 1 - 15
  • [47] A modified Fuzzy k-Partition based on indiscernibility relation for categorical data clustering
    Yanto, Iwan Tri Riyadi
    Ismail, Maizatul Akmar
    Herawan, Tutut
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2016, 53 : 41 - 52
  • [48] High Dimensional Data Clustering Algorithm Based on Sparse Feature Vector for Categorical Attributes
    Wu, Sen
    Wei, Guiying
    PROCEEDINGS OF 2010 INTERNATIONAL CONFERENCE ON LOGISTICS SYSTEMS AND INTELLIGENT MANAGEMENT, VOLS 1-3, 2010, : 973 - 976
  • [49] G-ANMI: A mutual information based genetic clustering algorithm for categorical data
    Deng, Shengchun
    He, Zengyou
    Xu, Xiaofei
    KNOWLEDGE-BASED SYSTEMS, 2010, 23 (02) : 144 - 149
  • [50] An LSH-based k-representatives clustering method for large categorical data
    Mau, Toan Nguyen
    Huynh, Van-Nam
    NEUROCOMPUTING, 2021, 463 : 29 - 44