A Study on Two-Stage Mixed Attribute Data Clustering Based on Density Peaks

Citations: 0
Authors
Liu, Shihua [1]
Zhang, Hao [1]
Liu, Xianghua [1]
Affiliations
[1] Wenzhou Polytech, Dept Informat Technol, Wenzhou, Peoples R China
Funding
Natural Science Foundation of Zhejiang Province
Keywords
Mixed data clustering; density peaks; k-prototypes algorithm; validity index; ALGORITHM
DOI
10.34028/iajit/18/5/2
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A two-stage clustering framework and a clustering algorithm for mixed attribute data, based on density peaks and the Goodall distance, are proposed. First, the subset of numerical attributes of the dataset is clustered; the result is then mapped into a one-dimensional categorical attribute and added to the subset of categorical attribute data. Finally, the new dataset is clustered by the density peaks clustering algorithm to obtain the final result. Experiments on three commonly used UCI datasets show that this algorithm can effectively realize mixed attribute clustering and produces better clustering results than the traditional K-prototypes algorithm does. On the Acute, Heart, and Credit datasets, the clustering accuracy is on average 17%, 24%, and 21% higher, respectively, than that of K-prototypes.
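The two-stage flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several choices are assumptions made here for illustration: stage 1 also uses density peaks clustering (the abstract does not name the stage-1 method), a simple matching (Hamming) distance stands in for the Goodall distance, the cutoff `dc` is taken as a percentile of the positive pairwise distances, and the number of clusters is supplied by the caller. All function names (`pick_dc`, `density_peaks`, `two_stage`) and the toy data are hypothetical.

```python
import numpy as np

def pick_dc(dist, pct=20):
    # Assumed heuristic: cutoff distance = a percentile of the positive pairwise distances.
    pos = dist[dist > 0]
    return float(np.percentile(pos, pct)) if pos.size else 1.0

def density_peaks(dist, k, dc=None):
    """Basic density peaks clustering on a precomputed distance matrix."""
    n = dist.shape[0]
    if dc is None:
        dc = pick_dc(dist)
    # Local density with a Gaussian kernel (each point counts itself, so rho >= 1).
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1)
    order = np.argsort(-rho, kind="stable")  # indices by decreasing density
    # delta: distance to the nearest point that comes earlier in the density order.
    delta = np.full(n, np.inf)               # the densest point keeps delta = inf
    parent = np.full(n, -1)
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i], parent[i] = dist[i, j], j
    # Centers: the k points with the largest rho * delta.
    centers = np.argsort(-(rho * delta), kind="stable")[:k]
    labels = np.full(n, -1)
    labels[centers] = np.arange(k)
    # Every other point inherits the label of its nearest denser neighbor.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

def two_stage(num, cat_codes, k_num, k_final):
    # Stage 1: cluster the numerical attributes (Euclidean distance).
    d_num = np.linalg.norm(num[:, None, :] - num[None, :, :], axis=2)
    stage1 = density_peaks(d_num, k_num)
    # Map the stage-1 result to one extra categorical attribute.
    merged = np.column_stack([cat_codes, stage1])
    # Stage 2: cluster the all-categorical data.
    # Simple matching distance is used here as a stand-in for the Goodall distance.
    d_cat = (merged[:, None, :] != merged[None, :, :]).mean(axis=2)
    return stage1, density_peaks(d_cat, k_final)

# Hypothetical toy data: two numerical blobs plus two categorical attributes.
num = np.array([[1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
                [5.0, 5.1], [5.1, 4.9], [4.9, 5.0]])
cat = np.array([["a", "x"], ["a", "x"], ["a", "y"],
                ["b", "y"], ["b", "y"], ["b", "x"]])
# Encode each categorical column as integer codes.
cat_codes = np.column_stack([np.unique(col, return_inverse=True)[1] for col in cat.T])

stage1_labels, final_labels = two_stage(num, cat_codes, k_num=2, k_final=2)
print(stage1_labels, final_labels)
```

Folding the stage-1 labels into a single categorical column lets stage 2 work with one purely categorical distance, which is the point of the two-stage design; swapping in a frequency-aware measure such as the Goodall distance would only change how `d_cat` is computed.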
Pages: 634-643
Number of pages: 10
Related papers
50 in total
  • [31] Parallel Implementation of Density Peaks Clustering Algorithm Based on Spark
    Liu, Rui
    Li, Xiaoge
    Du, Liping
    Zhi, Shuting
    Wei, Mian
    ADVANCES IN INFORMATION AND COMMUNICATION TECHNOLOGY, 2017, 107 : 442 - 447
  • [32] Density Peaks Clustering Based on Improved RNA Genetic Algorithm
    Ren, Liyan
    Zang, Wenke
    HUMAN CENTERED COMPUTING, HCC 2017, 2018, 10745 : 28 - 33
  • [33] Density peaks clustering based on circular partition and grid similarity
    Zhao, Jia
    Tang, Jingjing
    Fan, Tanghuai
    Li, Chenming
    Xu, Lizhong
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2020, 32 (07)
  • [34] Density Peaks Clustering Based on Local Minimal Spanning Tree
    Wang, Renmin
    Zhu, Qingsheng
    IEEE ACCESS, 2019, 7 : 108438 - 108446
  • [35] A novel density peaks clustering algorithm based on Hopkins statistic
    Zhang, Ruilin
    Miao, Zhenguo
    Tian, Ye
    Wang, Hongpeng
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 201
  • [36] Density Peaks Clustering Based on Jaccard Similarity and Label Propagation
    Qin, Xiaowei
    Han, Xiaoxia
    Chu, Junwen
    Zhang, Yan
    Xu, Xinying
    Xie, Jun
    Xie, Gang
    COGNITIVE COMPUTATION, 2021, 13 (06) : 1609 - 1626
  • [37] Density peaks clustering based on superior nodes and fuzzy correlation
    Zang, Wenke
    Liu, Xincheng
    Ma, Linlin
    Che, Jing
    Sun, Minghe
    Zhao, Yuzhen
    Liu, Xiyu
    Li, Hui
    INFORMATION SCIENCES, 2024, 672
  • [38] Two-stage distributed generation optimal sizing with clustering-based node selection
    Rotaru, Florina
    Chicco, Gianfranco
    Grigoras, Gheorghe
    Cortina, Gheorghe
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2012, 40 (01) : 120 - 129
  • [39] Reverse-Nearest-Neighbor-Based Clustering by Fast Search and Find of Density Peaks
    Zhang, Chunhao
    Xie, Bin
    Zhang, Yiran
    CHINESE JOURNAL OF ELECTRONICS, 2023, 32 (06) : 1341 - 1354
  • [40] Density Peaks Clustering Algorithm for Large-scale Data Based on Divide-and-Conquer Strategy
    Wang, Yining
    2021 3RD INTERNATIONAL CONFERENCE ON MACHINE LEARNING, BIG DATA AND BUSINESS INTELLIGENCE (MLBDBI 2021), 2021, : 416 - 419