A Study on Two-Stage Mixed Attribute Data Clustering Based on Density Peaks

Cited by: 0
Authors
Liu, Shihua [1]
Zhang, Hao [1]
Liu, Xianghua [1]
Affiliations
[1] Wenzhou Polytech, Dept Informat Technol, Wenzhou, Peoples R China
Funding
Natural Science Foundation of Zhejiang Province;
Keywords
Mixed data clustering; density peaks; k-prototypes algorithm; validity index; ALGORITHM;
DOI
10.34028/iajit/18/5/2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A two-stage clustering framework and a clustering algorithm for mixed-attribute data, based on density peaks and the Goodall distance, are proposed. First, the numerical-attribute subset of the dataset is clustered, and the result is mapped into a one-dimensional categorical attribute and added to the categorical-attribute subset. The resulting dataset is then clustered by the density peaks clustering algorithm to obtain the final result. Experiments on three commonly used UCI datasets show that the algorithm clusters mixed-attribute data effectively and produces better results than the traditional K-prototypes algorithm does. On average, the clustering accuracy on the Acute, Heart, and Credit datasets is 17%, 24%, and 21% higher, respectively, than that of K-prototypes.
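The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not say which algorithm clusters the numerical subset, which Goodall variant is used, or how the cutoff distance and cluster count are chosen, so the sketch assumes density peaks with a Gaussian kernel in both stages, a Goodall-style similarity in which a match on a rarer value scores higher, and user-supplied `n_clusters`, `dc_num`, and `dc_cat`. All function and parameter names are hypothetical.

```python
import numpy as np

def density_peaks(dist, n_clusters, dc):
    """Basic density peaks clustering on a precomputed distance matrix."""
    n = dist.shape[0]
    # Gaussian-kernel local density; subtract the self term exp(0) = 1.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho, kind="stable")  # densest point first
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = dist.max()  # convention for the densest point
    for pos in range(1, n):
        i = order[pos]
        denser = order[:pos]  # points ranked as denser than i
        j = denser[np.argmin(dist[i, denser])]
        delta[i], parent[i] = dist[i, j], j
    # Centers: largest rho * delta (assumed selection rule).
    centers = np.argsort(-(rho * delta), kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    for i in order:  # follow the nearest denser neighbour
        if labels[i] == -1:
            if parent[i] >= 0:
                labels[i] = labels[parent[i]]
            else:  # densest point was not picked as a center
                labels[i] = labels[centers[np.argmin(dist[i, centers])]]
    return labels

def goodall_style_distance(cat):
    """Assumed Goodall-style distance between categorical records."""
    n, m = cat.shape
    sim = np.zeros((n, n))
    for k in range(m):
        col = cat[:, k]
        _, inverse, counts = np.unique(col, return_inverse=True, return_counts=True)
        # Estimated probability that two random records share this value.
        p2 = counts * (counts - 1) / (n * (n - 1))
        match = col[:, None] == col[None, :]
        sim += np.where(match, 1.0 - p2[inverse][:, None], 0.0)
    dist = 1.0 - sim / m
    np.fill_diagonal(dist, 0.0)
    return dist

def two_stage_cluster(num, cat, n_clusters, dc_num, dc_cat):
    # Stage 1: cluster the numerical attributes only.
    d_num = np.linalg.norm(num[:, None, :] - num[None, :, :], axis=2)
    stage1 = density_peaks(d_num, n_clusters, dc_num)
    # Map the stage-1 result to one extra categorical attribute.
    cat_ext = np.column_stack([cat, stage1.astype(str)])
    # Stage 2: cluster the extended categorical data.
    return density_peaks(goodall_style_distance(cat_ext), n_clusters, dc_cat)

# Toy data (made up for illustration): two well-separated groups.
num = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
cat = np.array([["a"], ["a"], ["a"], ["b"], ["b"], ["b"]])
labels = two_stage_cluster(num, cat, n_clusters=2, dc_num=1.0, dc_cat=0.5)
print(labels)
```

The cluster ids are arbitrary, so only the grouping is meaningful: the first three records should share one label and the last three the other. In practice the cutoff distances would need tuning per dataset, and the keywords suggest the paper uses a validity index for such choices, which this sketch leaves out.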
Pages: 634-643
Number of pages: 10
Related papers
50 in total
  • [1] Adaptive Mixed-Attribute Data Clustering Method Based on Density Peaks
    Liu, Shihua
    COMPLEXITY, 2022, 2022
  • [2] ATSDPC: Adaptive two-stage density peaks clustering with hybrid distance based on dispersion coefficient
    Han, Shengqiang
    Zhang, Xue
    Liu, Xiyu
    Zheng, Yuyan
    Qu, Jianhua
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 282
  • [3] Clustering Mixed Data Based on Density Peaks and Stacked Denoising Autoencoders
    Duan, Baobin
    Han, Lixin
    Gou, Zhinan
    Yang, Yi
    Chen, Shuangshuang
    SYMMETRY-BASEL, 2019, 11 (02):
  • [4] A novel density peaks clustering algorithm for mixed data
    Du, Mingjing
    Ding, Shifei
    Xue, Yu
    PATTERN RECOGNITION LETTERS, 2017, 97 : 46 - 53
  • [5] An improved density peaks method for data clustering
    Lotfi, Abdulrahman
    Seyedi, Seyed Amjad
    Moradi, Parham
    2016 6TH INTERNATIONAL CONFERENCE ON COMPUTER AND KNOWLEDGE ENGINEERING (ICCKE), 2016 : 263 - 268
  • [6] Two-Stage Traffic Clustering Based on HNSW
    Zhang, Xu
    Niu, Xinzheng
    Fournier-Viger, Philippe
    Wang, Bing
    ADVANCES AND TRENDS IN ARTIFICIAL INTELLIGENCE: THEORY AND PRACTICES IN ARTIFICIAL INTELLIGENCE, 2022, 13343 : 609 - 620
  • [7] PSOHS: an efficient two-stage approach for data clustering
    Hatamlou, Abdolreza
    Hatamlou, Masoumeh
    MEMETIC COMPUTING, 2013, 5 (02) : 155 - 161
  • [8] TMsDP: two-stage density peak clustering based on multi-strategy optimization
    Ma, Jie
    Hao, Zhiyuan
    Hu, Mo
    DATA TECHNOLOGIES AND APPLICATIONS, 2022, 58 (03) : 1 - 27
  • [9] Two-Stage Sparse Representation Clustering for Dynamic Data Streams
    Chen, Jie
    Wang, Zhu
    Yang, Shengxiang
    Mao, Hua
    IEEE TRANSACTIONS ON CYBERNETICS, 2023, 53 (10) : 6408 - 6420
  • [10] Clustering ensemble based on density peaks
    Chu R.-H.
    Wang H.-J.
    Yang Y.
    Li T.-R.
    Wang, Hong-Jun (wanghongjun@swjtu.edu.cn), 1600, Science Press (42) : 1401 - 1412