Flexible density peak clustering for real-world data

Cited by: 2
Authors
Hou, Jian [1]
Lin, Houshen [1]
Yuan, Huaqiang [1]
Pelillo, Marcello [2, 3]
Affiliations
[1] Dongguan Univ Technol, Sch Comp Sci & Technol, Dongguan 523808, Peoples R China
[2] Ca Foscari Univ, DAIS, I-30172 Venice, Italy
[3] Ca Foscari Univ, European Ctr Living Technol, I-30123 Venice, Italy
Funding
National Natural Science Foundation of China;
Keywords
Clustering; Density peak; Real-world data; Number of clusters; FAST SEARCH; K-MEANS; FIND;
DOI
10.1016/j.patcog.2024.110772
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In density-based clustering, the density peak algorithm has attracted much attention for its effectiveness and simplicity, and a large number of clustering approaches have been built on it. Some of these require manual selection of cluster centers from a decision graph, and this human involvement introduces uncertainty into the clustering results. To avoid human involvement, other algorithms rely on a user-specified number of clusters to determine cluster centers automatically. However, accurately estimating the number of clusters is a well-known, long-standing difficulty in data clustering. In this paper we present a sequential density peak clustering algorithm that extracts clusters one by one, thereby determining the number of clusters automatically while avoiding manual selection of cluster centers. Starting from a density peak, the algorithm first generates an initial cluster surrounding the peak, and then obtains the final cluster by expanding the initial cluster according to the relative density relationship among neighboring data points. With a peeling-off strategy, all clusters are obtained sequentially. The algorithm works well with clusters of Gaussian distribution and therefore has potential for clustering real-world data. Experiments on a large number of synthetic and real datasets, together with comparisons against existing algorithms, demonstrate the effectiveness of the proposed algorithm.
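The sequential, peel-off idea described in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm: the abstract does not give the rules for building or expanding the initial cluster, so the expansion test below (a point joins a cluster when its nearest higher-density neighbor is already in that cluster and lies within `radius`) is a simplifying assumption. The density and distance quantities follow the standard density peak formulation with a Gaussian kernel. The function names and the parameters `dc` and `radius` are hypothetical.

```python
import math

def density_peak_quantities(points, dc):
    """Standard density peak quantities: local density rho (Gaussian kernel),
    delta (distance to the nearest point of higher density), and that
    neighbor's index as `parent`. `dc` is an assumed cutoff distance."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    order = sorted(range(n), key=lambda i: -rho[i])  # descending density
    parent = [-1] * n
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # global peak: use its largest distance
            continue
        j = min(order[:rank], key=lambda h: dist[i][h])
        parent[i], delta[i] = j, dist[i][j]
    return rho, delta, parent, order

def sequential_peel(points, dc, radius):
    """Extract clusters one by one. ASSUMPTION (not from the paper): a point
    joins a cluster if its nearest higher-density neighbor is in the cluster
    and no farther away than `radius`."""
    _, delta, parent, order = density_peak_quantities(points, dc)
    children = [[] for _ in points]
    for i, p in enumerate(parent):
        if p != -1:
            children[p].append(i)
    labels = [-1] * len(points)
    k = 0
    for seed in order:                # densest remaining point is the next peak
        if labels[seed] != -1:
            continue                  # already peeled off with an earlier cluster
        labels[seed] = k
        frontier = [seed]
        while frontier:               # expand the cluster outward from the peak
            p = frontier.pop()
            for i in children[p]:
                if labels[i] == -1 and delta[i] <= radius:
                    labels[i] = k
                    frontier.append(i)
        k += 1                        # cluster finished; move on to the next
    return labels

points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10)]
print(sequential_peel(points, dc=1.0, radius=2.0))  # → [0, 0, 0, 0, 0, 1, 1, 1]
```

In this toy example the number of clusters is never passed in; it falls out of how many peaks remain after each cluster is peeled off. The result still depends on the assumed `dc` and `radius`, which the sketch does not try to choose automatically.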
Pages: 13
Related papers
50 in total
  • [1] Towards Parameter-Free Clustering for Real-World Data
    Hou, Jian
    Yuan, Huaqiang
    Pelillo, Marcello
    PATTERN RECOGNITION, 2023, 134
  • [2] Accelerating Density Peak Clustering Algorithm
    Lin, Jun-Lin
    SYMMETRY-BASEL, 2019, 11 (07):
  • [3] Density Ratio Peak Clustering
    Wang, Shuliang
    Liu, Xiaojia
    Li, Qi
    Yuan, Hanning
    Yuan, Ye
    Feng, Ziwen
    Zhang, Fan
    WEB AND BIG DATA, PT IV, APWEB-WAIM 2023, 2024, 14334 : 467 - 482
  • [4] A Robust Density Clustering Algorithm Based on Gravity Peak
    Zhang, Rui
    Du, Tao
    Qu, Shouning
    Zhu, Lianjiang
    Wang, Xintang
    2019 6TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI 2019), 2019, : 544 - 549
  • [5] Improving Density Peak Clustering by Automatic Peak Selection and Single Linkage Clustering
    Lin, Jun-Lin
    Kuo, Jen-Chieh
    Chuang, Hsing-Wang
    SYMMETRY-BASEL, 2020, 12 (07):
  • [6] Sequential Clustering for Real-World Datasets
    Huang, Chongwei
    Hou, Jian
    Yuan, Huaqiang
    PRICAI 2024: TRENDS IN ARTIFICIAL INTELLIGENCE, PT I, 2025, 15281 : 69 - 80
  • [7] Active learning for density peak clustering
    Viet-Vu Vu
    Yoon, Byeongnam
    Cuong Le
    Hong-Quan Do
    Hai-Minh Nguyen
    Chung Tran
    Viet-Thang Vu
    Cong-Mau Tran
    Doan-Vinh Tran
    Tien-Dung Duong
    2022 24TH INTERNATIONAL CONFERENCE ON ADVANCED COMMUNICATION TECHNOLOGY (ICACT): ARITIFLCIAL INTELLIGENCE TECHNOLOGIES TOWARD CYBERSECURITY, 2022, : 442 - +
  • [8] Adaptive core fusion-based density peak clustering for complex data with arbitrary shapes and densities
    Fang, Fang
    Qiu, Lei
    Yuan, Shenfang
    PATTERN RECOGNITION, 2020, 107
  • [9] Data Science Methods for Real-World Evidence Generation in Real-World Data
    Liu, Fang
    ANNUAL REVIEW OF BIOMEDICAL DATA SCIENCE, 2024, 7 : 201 - 224
  • [10] Density peak clustering based on relative density relationship
    Hou, Jian
    Zhang, Aihua
    Qi, Naiming
    PATTERN RECOGNITION, 2020, 108