An Incremental Clustering with Attribute Unbalance Considered for Categorical Data

Citations: 0
Authors
Chen, Jize [2 ]
Yang, Zhimin [1 ]
Yin, Jian [2 ]
Yang, Xiaobo [1 ]
Huang, Li [1 ]
Affiliations
[1] Guangzhou Univ Chinese Med, Affiliated Hosp 2, Guangzhou 510120, Peoples R China
[2] Sun Yat Sen Univ, Sch Informat Sci & Technol, Guangzhou 510275, Guangdong, Peoples R China
Source
COMPUTATIONAL INTELLIGENCE AND INTELLIGENT SYSTEMS | 2009 / Vol. 51
Funding
National Natural Science Foundation of China;
Keywords
incremental clustering; attribute unbalance; categorical data;
DOI
10.1007/978-3-642-04962-0_50
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering analysis is an important technique used in many fields, but traditional clustering algorithms generally deal with numeric data. Clustering categorical data has nevertheless long attracted researchers' attention because such data are prevalent in real life. This paper analyses the limitations of previously proposed categorical clustering algorithms. Based on two observations, a new similarity measure for categorical data is proposed that takes the unbalance of attributes into account. As data sets grow larger and more dynamic, incrementality becomes an important quality of a good clustering algorithm. The clustering algorithm presented is incremental and has linear computational complexity. The experimental results indicate that it outperforms the other categorical clustering algorithms referred to in the paper.
Pages: 433 / +
Page count: 2
Related papers
50 in total
  • [1] Incremental Clustering for Categorical Data Using Clustering Ensemble
    Li Taoying
    Chen Yan
    Qu Lili
    Mu Xiangwei
    PROCEEDINGS OF THE 29TH CHINESE CONTROL CONFERENCE, 2010, : 2519 - 2524
  • [2] CLUSTERING CATEGORICAL DATA BASED ON COMBINATIONS OF ATTRIBUTE VALUES
    Do, Hee-Jung
    Kim, Jae Yearn
    INTERNATIONAL JOURNAL OF INNOVATIVE COMPUTING INFORMATION AND CONTROL, 2009, 5 (12A): : 4393 - 4405
  • [3] Incremental learning based multiobjective fuzzy clustering for categorical data
    Saha, Indrajit
    Maulik, Ujjwal
    INFORMATION SCIENCES, 2014, 267 : 35 - 57
  • [4] ICE: Incremental Subspace Clustering of High-Dimensional Categorical Data
    Pang, Ning
    Zhang, Chaowei
    Zhang, Jifu
    Qin, Xiao
    INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2025, 33 (01) : 87 - 118
  • [5] Categorical data clustering: A correlation-based approach for unsupervised attribute weighting
    Carbonera, Joel Luis
    Abel, Mara
    2014 IEEE 26TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI), 2014, : 259 - 263
  • [6] A new internal clustering validation index for categorical data based on concentration of attribute values
    Fu L.-W.
    Wu S.
    Gongcheng Kexue Xuebao/Chinese Journal of Engineering, 2019, 41 (05): : 682 - 693
  • [7] PUMA: Parallel subspace clustering of categorical data using multi-attribute weights
    Pang, Ning
    Zhang, Jifu
    Zhang, Chaowei
    Qin, Xiao
    Cai, Jianghui
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 126 : 233 - 245
  • [8] Automated Attribute Weighting Fuzzy k-Centers Algorithm for Categorical Data Clustering
    Mau, Toan Nguyen
    Huynh, Van-Nam
    MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE (MDAI 2021), 2021, 12898 : 205 - 217
  • [9] Clustering categorical data streams
    He, Zengyou
    Xu, Xiaofei
    Deng, Shengchun
    Huang, Joshua Zhexue
    JOURNAL OF COMPUTATIONAL METHODS IN SCIENCES AND ENGINEERING, 2011, 11 (04) : 185 - 192
  • [10] Fuzzy Rough Attribute Reduction for Categorical Data
    Wang, Changzhong
    Wang, Yan
    Shao, Mingwen
    Qian, Yuhua
    Chen, Degang
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2020, 28 (05) : 818 - 830