A Study on Two-Stage Mixed Attribute Data Clustering Based on Density Peaks

Times cited: 0
Authors
Liu, Shihua [1]
Zhang, Hao [1]
Liu, Xianghua [1]
Affiliations
[1] Wenzhou Polytech, Dept Informat Technol, Wenzhou, Peoples R China
Funding
Natural Science Foundation of Zhejiang Province;
Keywords
Mixed data clustering; density peaks; k-prototypes algorithm; validity index; ALGORITHM;
DOI
10.34028/iajit/18/5/2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A two-stage clustering framework and a clustering algorithm for mixed-attribute data, based on density peaks and the Goodall distance, are proposed. First, the numerical attributes of the dataset are clustered; the result is then mapped to a single categorical attribute and appended to the categorical attributes. Finally, the augmented dataset is clustered with the density peaks clustering algorithm to obtain the final result. Experiments on three commonly used UCI datasets show that the algorithm clusters mixed-attribute data effectively and produces better results than the traditional K-prototypes algorithm. On the Acute, Heart, and Credit datasets, its clustering accuracy is on average 17%, 24%, and 21% higher than that of K-prototypes, respectively.
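The two-stage pipeline described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the abstract does not say which algorithm clusters the numerical attributes, so plain k-means is assumed here; the categorical distance is a simple frequency-weighted, Goodall-style measure chosen for illustration (the exact Goodall variant used in the paper is not stated); and all function names, the cutoff `dc`, and the toy data are hypothetical.

```python
from collections import Counter

def kmeans(points, k, iters=25):
    """Stage 1 (assumed): plain k-means; centers start at the first k rows."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def goodall_like_distance(rows):
    """Pairwise distance matrix. A match on a rare value counts as more similar
    than a match on a common one: similarity 1 - p(v)^2 on a match, 0 otherwise."""
    n, m = len(rows), len(rows[0])
    freq = [Counter(r[j] for r in rows) for j in range(m)]
    def dist(a, b):
        sim = sum(1 - (freq[j][a[j]] / n) ** 2 for j in range(m) if a[j] == b[j])
        return 1 - sim / m
    return [[dist(a, b) for b in rows] for a in rows]

def density_peaks(D, k, dc):
    """Density peaks clustering on a symmetric distance matrix D."""
    n = len(D)
    # Local density: number of other points closer than the cutoff dc.
    rho = [sum(1 for j in range(n) if j != i and D[i][j] < dc) for i in range(n)]
    order = sorted(range(n), key=lambda i: -rho[i])  # descending density
    delta, parent = [0.0] * n, [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:  # densest point: use its largest distance to any point
            delta[i] = max(D[i][j] for j in range(n) if j != i)
        else:         # nearest point among those ranked denser
            parent[i] = min(order[:pos], key=lambda j: D[i][j])
            delta[i] = D[i][parent[i]]
    # Centers are the k points with the largest rho * delta.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:k]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:  # each remaining point follows its denser nearest neighbor
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

def two_stage_cluster(numeric, categorical, k_numeric, k_final, dc):
    stage1 = kmeans(numeric, k_numeric)
    # Map the numeric clustering to one extra categorical attribute.
    augmented = [tuple(cat) + (f"num_cluster_{l}",)
                 for cat, l in zip(categorical, stage1)]
    return density_peaks(goodall_like_distance(augmented), k_final, dc)

# Hypothetical toy data: two well-separated groups of three records each.
numeric = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8], [8.0, 8.2], [7.9, 8.1], [8.3, 7.7]]
categorical = [("a", "x"), ("a", "x"), ("a", "y"), ("b", "z"), ("b", "z"), ("b", "w")]
labels = two_stage_cluster(numeric, categorical, k_numeric=2, k_final=2, dc=0.6)
print(labels)
```

On this toy input the first three records and the last three records end up in different clusters. The cutoff `dc` and the number of clusters are set by hand here; how the paper selects them is not covered by the abstract.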
Pages: 634-643
Number of pages: 10