A Study on Two-Stage Mixed Attribute Data Clustering Based on Density Peaks

Cited by: 0
Authors
Liu, Shihua [1]
Zhang, Hao [1]
Liu, Xianghua [1]
Affiliations
[1] Wenzhou Polytech, Dept Informat Technol, Wenzhou, Peoples R China
Funding
Natural Science Foundation of Zhejiang Province
Keywords
Mixed data clustering; density peaks; k-prototypes algorithm; validity index; ALGORITHM;
DOI
10.34028/iajit/18/5/2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a two-stage clustering framework and a clustering algorithm for mixed-attribute data based on density peaks and the Goodall distance. First, the numerical attributes of the dataset are clustered on their own; the resulting cluster labels are then mapped to a single categorical attribute and appended to the categorical attributes. Finally, the new dataset is clustered with the density peaks clustering algorithm to obtain the final result. Experiments on three commonly used UCI datasets show that the algorithm clusters mixed-attribute data effectively and produces better results than the traditional K-prototypes algorithm. On average, its clustering accuracy on the Acute, Heart, and Credit datasets is 17%, 24%, and 21% higher, respectively, than that of K-prototypes.
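The two-stage pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation. The abstract does not say which algorithm clusters the numerical attributes, which Goodall variant is used, or how the density peaks parameters are chosen. The sketch therefore assumes a small k-means for stage one, a simplified frequency-weighted match similarity in the spirit of Goodall's measure (matches on rarer values count more), and a Gaussian-kernel density with a hypothetical cutoff `dc`. All function and parameter names are made up for illustration.

```python
import numpy as np

def kmeans_labels(X, k, n_iter=20):
    """Stage 1 stand-in (assumption): tiny k-means with farthest-point init."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def goodall_style_distance(C):
    """Pairwise distance on categorical data (assumed Goodall-like variant).

    A match on value v of an attribute scores 1 - p2(v), where
    p2(v) = f(f-1) / (n(n-1)) and f is the frequency of v; a mismatch scores 0.
    """
    n, m = C.shape
    sim = np.zeros((n, n))
    for a in range(m):
        col = C[:, a]
        _, inverse, counts = np.unique(col, return_inverse=True, return_counts=True)
        p2 = counts * (counts - 1) / (n * (n - 1))
        weight = 1.0 - p2[inverse]               # weight of each object's own value
        sim += (col[:, None] == col[None, :]) * weight[:, None]
    D = 1.0 - sim / m
    np.fill_diagonal(D, 0.0)
    return D

def density_peaks(D, k, dc):
    """Density peaks clustering on a precomputed distance matrix."""
    n = D.shape[0]
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0   # local density, self excluded
    order = np.argsort(-rho, kind="stable")           # densest object first
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = D[order[0]].max()
    for rank in range(1, n):
        i, higher = order[rank], order[:rank]
        j = higher[np.argmin(D[i, higher])]           # nearest denser object
        delta[i], parent[i] = D[i, j], j
    gamma = rho * delta
    gamma[order[0]] = np.inf                          # densest object is always a centre
    centers = np.argsort(-gamma, kind="stable")[:k]
    labels = np.full(n, -1)
    labels[centers] = np.arange(k)
    for i in order:                                   # inherit label from denser parent
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

def two_stage_cluster(X_num, X_cat, k_stage1, k_final, dc=0.5):
    """Cluster numeric part, append its labels as a category, then run density peaks."""
    stage1 = kmeans_labels(np.asarray(X_num, dtype=float), k_stage1)
    merged = np.column_stack([np.asarray(X_cat).astype(str), stage1.astype(str)])
    return density_peaks(goodall_style_distance(merged), k_final, dc)
```

Calling `two_stage_cluster(X_num, X_cat, k_stage1, k_final)` with a numeric array and a categorical array of the same length returns one integer label per object. In practice `dc` and the number of centres would need tuning per dataset; the values here are placeholders.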
Pages: 634-643
Number of pages: 10