GDPC: generalized density peaks clustering algorithm based on order similarity

Cited by: 8
Authors
Yang, Xiaofei [1 ,2 ]
Cai, Zhiling [1 ]
Li, Ruijia [1 ]
Zhu, William [1 ]
Affiliations
[1] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu, Peoples R China
[2] Xian Polytech Univ, Sch Sci, Xian, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Clustering; Order similarity; Density; Density peak; Graph; K-NEAREST NEIGHBORS; FAST SEARCH; FIND;
DOI
10.1007/s13042-020-01198-0
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is a fundamental technique for discovering valuable information in data mining and machine learning. Density peaks clustering (DPC) is a typical density-based clustering method that has received increasing attention in recent years. However, DPC and most of its improved variants still suffer from some drawbacks. For example, peaks are difficult to find in sparse cluster regions, and the assignment of the remaining points tends to cause a domino effect, especially on complicated data. To address these two problems, we propose a generalized density peaks clustering algorithm (GDPC) based on a new order similarity, which is calculated from the order rank of the Euclidean distance between two samples. The order similarity helps to find peaks in sparse regions. In addition, a two-step assignment is used to weaken the domino effect. Overall, GDPC can discover clusters in datasets of different sizes, dimensions and shapes while also addressing the two issues above. Experiments on several datasets, including Lung, COIL20, ORL, USPS, Mnist, breast and Vote, show that our algorithm is effective in most cases.
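The abstract does not give the formula for the order similarity, so the following is only a rough sketch of the general idea of a rank-based (order) measure built on Euclidean distance. The function names `order_ranks` and `mutual_order_dissimilarity`, and the choice to sum the two ranks, are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def order_ranks(X):
    """ranks[i, j] = position of sample j in sample i's neighbour list,
    sorted by Euclidean distance (0 is sample i itself)."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))         # pairwise Euclidean distances
    order = np.argsort(dist, axis=1, kind="stable")  # neighbours of i, nearest first
    n = len(X)
    ranks = np.empty((n, n), dtype=int)
    rows = np.arange(n)[:, None]
    ranks[rows, order] = np.arange(n)[None, :]       # invert the permutation row by row
    return ranks

def mutual_order_dissimilarity(X):
    """Hypothetical symmetric measure: rank of j seen from i plus rank of i seen from j.
    Smaller values mean the two samples rank each other as closer neighbours."""
    r = order_ranks(X)
    return r + r.T

# Two tight points (0, 1) and two looser points (10, 12) on a line.
X = [[0.0], [1.0], [10.0], [12.0]]
print(mutual_order_dissimilarity(X))
```

In this toy example the pair (10, 12) is twice as far apart as the pair (0, 1), yet both pairs receive the same rank-based value, because each point is the other's nearest neighbour. This scale insensitivity is one plausible reason a rank-based measure could help locate peaks in sparse regions, as the abstract claims.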
Pages: 719-731
Page count: 13
Related papers
41 in total
  • [1] Ankerst M., 1999, SIGMOD Record, V28, P49, DOI 10.1145/304181.304187
  • [2] Asuncion Arthur, 2007, UCI machine learning repository
  • [3] Bennett KP, 1992, OPTIMIZATION METHODS, V1, P23, DOI 10.1080/10556789208805504
  • [4] Robust path-based spectral clustering
    Chang, Hong
    Yeung, Dit-Yan
    [J]. PATTERN RECOGNITION, 2008, 41 (01) : 191 - 203
  • [5] Chen K, 2018, PROCEEDINGS OF 2018 INTERNATIONAL CONFERENCE ON INFORMATION SYSTEMS AND COMPUTER AIDED EDUCATION (ICISCAE 2018), P426, DOI 10.1109/ICISCAE.2018.8666829
  • [6] Parallel Spectral Clustering in Distributed Systems
    Chen, Wen-Yen
    Song, Yangqiu
    Bai, Hongjie
    Lin, Chih-Jen
    Chang, Edward Y.
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2011, 33 (03) : 568 - 586
  • [7] Deng L., 2012, IEEE Signal Process. Mag., V29, P141, DOI 10.1109/MSP.2012.2211477
  • [8] Automatic clustering based on density peak detection using generalized extreme value distribution
    Ding, Jiajun
    He, Xiongxiong
    Yuan, Junqing
    Jiang, Bo
    [J]. SOFT COMPUTING, 2018, 22 (09) : 2777 - 2796
  • [9] Density peaks clustering using geodesic distances
    Du, Mingjing
    Ding, Shifei
    Xu, Xiao
    Xue, Yu
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2018, 9 (08) : 1335 - 1349
  • [10] Study on density peaks clustering based on k-nearest neighbors and principal component analysis
    Du, Mingjing
    Ding, Shifei
    Jia, Hongjie
    [J]. KNOWLEDGE-BASED SYSTEMS, 2016, 99 : 135 - 145