An Incremental Clustering with Attribute Unbalance Considered for Categorical Data

Cited by: 0
Authors
Chen, Jize [2 ]
Yang, Zhimin [1 ]
Yin, Jian [2 ]
Yang, Xiaobo [1 ]
Huang, Li [1 ]
Affiliations
[1] Guangzhou Univ Chinese Med, Affiliated Hosp 2, Guangzhou 510120, Peoples R China
[2] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou 510275, Guangdong, Peoples R China
Source
COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS | 2009 / Vol. 51
Funding
National Natural Science Foundation of China;
Keywords
incremental clustering; attribute unbalance; categorical data;
DOI
10.1007/978-3-642-04962-0_50
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering analysis is an important technique used in many fields, but traditional clustering algorithms generally deal with numeric data. Clustering categorical data has nonetheless long attracted researchers' attention because such data are prevalent in real life. This paper analyses the limitations of previously proposed categorical clustering algorithms. Based on two observations, a new similarity measure for categorical data is proposed that takes the unbalance of attributes into account. As data sets grow larger and more dynamic, incrementality is an important quality of a good clustering algorithm. The clustering algorithm presented is incremental and has linear computational complexity. The experimental results indicate that it outperforms the other categorical clustering algorithms referred to in the paper.
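The abstract does not give the similarity measure or the algorithm itself, so the following is only an illustrative sketch of the general idea: a single-pass (incremental) clustering of categorical records in which attributes contribute unequally to similarity. The names `Cluster`, `incremental_cluster`, `weights`, and `threshold` are hypothetical, and the weighted-frequency similarity is an assumption chosen for illustration, not the measure proposed in the paper.

```python
from collections import Counter


class Cluster:
    """Summary of one cluster: its size and per-attribute value counts."""

    def __init__(self, n_attrs):
        self.size = 0
        self.counts = [Counter() for _ in range(n_attrs)]

    def add(self, record):
        # Update the summary in place; no stored record is revisited.
        self.size += 1
        for counter, value in zip(self.counts, record):
            counter[value] += 1

    def similarity(self, record, weights):
        # Assumed measure: for each attribute, the share of cluster members
        # that agree with the record, scaled by that attribute's weight.
        return sum(
            w * counter[value] / self.size
            for w, counter, value in zip(weights, self.counts, record)
        )


def incremental_cluster(records, weights, threshold):
    """Assign each record once, in arrival order, to the most similar
    cluster, or open a new cluster if no similarity reaches `threshold`."""
    clusters, labels = [], []
    for record in records:
        best, best_sim = None, -1.0
        for idx, cluster in enumerate(clusters):
            sim = cluster.similarity(record, weights)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best is None or best_sim < threshold:
            clusters.append(Cluster(len(record)))
            best = len(clusters) - 1
        clusters[best].add(record)
        labels.append(best)
    return labels, clusters


records = [("red", "small"), ("red", "small"), ("blue", "large"), ("red", "large")]
labels_a, _ = incremental_cluster(records, weights=[0.7, 0.3], threshold=0.5)
labels_b, _ = incremental_cluster(records, weights=[0.3, 0.7], threshold=0.5)
```

In this sketch each record is processed exactly once, so the cost grows linearly with the number of records as long as the number of clusters stays bounded. Swapping the two weights changes where the last record lands, which is the kind of effect an attribute-aware similarity measure is meant to capture.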
Pages: 433 / +
Number of pages: 2
Related papers
50 in total
  • [21] Weighted Topological Clustering for Categorical Data
    Rogovschi, Nicoleta
    Nadif, Mohamed
    NEURAL INFORMATION PROCESSING, PT I, 2011, 7062 : 599 - +
  • [22] Clustering categorical data in projected spaces
    Bouguessa, Mohamed
    DATA MINING AND KNOWLEDGE DISCOVERY, 2015, 29 (01) : 3 - 38
  • [23] Fuzzy rough clustering for categorical data
    Xu, Shuliang
    Liu, Shenglan
    Zhou, Jian
    Feng, Lin
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2019, 10 (11) : 3213 - 3223
  • [24] Kernel Subspace Clustering Algorithm for Categorical Data
    Xu K.-P.
    Chen L.-F.
    Sun H.-J.
    Wang B.-Z.
    Ruan Jian Xue Bao/Journal of Software, 2020, 31 (11) : 3492 - 3505
  • [25] The performance of objective functions for clustering categorical data
    Xiang, Zhengrong
    Islam, Md Zahidul
    Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2014, 8863 : 16 - 28
  • [26] Generalized Similarity Measure for Categorical Data Clustering
    Sharma, Shruti
    Singh, Manoj
    2016 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2016, : 765 - 769
  • [27] EnsCat: clustering of categorical data via ensembling
    Clarke, Bertrand S.
    Amiri, Saeid
    Clarke, Jennifer L.
    BMC BIOINFORMATICS, 2016, 17
  • [28] A CLUSTERING ALGORITHM FOR MIXED NUMERIC AND CATEGORICAL DATA
    Ohn Mar San
    Van-Nam Huynh
    Yoshiteru Nakamori
    Journal of Systems Science and Complexity, 2003, (04) : 562 - 571
  • [29] Squeezer: An efficient algorithm for clustering categorical data
    He, ZY
    Xu, XF
    Deng, SC
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2002, 17 (05) : 611 - 624
  • [30] On clustering massive text and categorical data streams
    Aggarwal, Charu C.
    Yu, Philip S.
    KNOWLEDGE AND INFORMATION SYSTEMS, 2010, 24 (02) : 171 - 196