Improved Density Peak Clustering Algorithm Based on Choosing Strategy Automatically for Cut-off Distance and Cluster Centre

Cited by: 6
Authors
Wang, Limin [1]
Li, Mingyang [1]
Han, Xuming [2]
Zhou, Ruihong [1]
Zheng, Kaiyue [1]
Liu, Meihan [1]
Affiliations
[1] Jilin Univ Finance & Econ, Sch Management Sci & Informat Engn, Jilin Prov Key Lab Fintech, Changchun 130117, Jilin, Peoples R China
[2] Changchun Univ Technol, Sch Comp Sci & Engn, Changchun 130012, Jilin, Peoples R China
Source
TEHNICKI VJESNIK-TECHNICAL GAZETTE | 2018 / Vol. 25 / No. 02
Funding
National Science Foundation (US);
Keywords
clustering center; cut-off distance; Density Peak Clustering Algorithm; maximum density; similarity; FAST SEARCH; OUTLIERS; FIND;
DOI
10.17559/TV-20171121052315
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08 ;
Abstract
The fast-search density peak clustering algorithm requires the user to set the cut-off distance by trial and error and to select the clustering centres manually. To address this defect, a density peak clustering algorithm based on an automatic choosing strategy for the cut-off distance and cluster centres (CSA-DP) is proposed. The algorithm determines the cut-off distance from the approximate distance between the maximum-density sample point and the minimum-density sample point, and determines the clustering centres from the variation in similarity between the candidate centre points. First, the density of each sample point is obtained from its k-nearest neighbour samples, and the samples are sorted by their distance to the maximum-density point. Then the turning position of the density trend is located, and the cut-off distance is determined from that turning position. Finally, following the density peak clustering algorithm, the data points that may be cluster centres are identified, the similarity between them is compared, and the final clustering centres are determined. The simulation results show that the improved algorithm can determine the cut-off distance and select the centres automatically, and that it produces more accurate clustering results. Finally, the improved algorithm is applied in an empirical analysis of the stocks of 147 listed biopharmaceutical companies, which provides a reliable basis for the classification and evaluation of listed companies and indicates that the algorithm is widely applicable.
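The two quantities that density peak clustering relies on can be sketched as follows. This is a minimal illustration, not the CSA-DP procedure itself: the abstract does not give formulas, so the density estimate (inverse of the mean distance to the k nearest neighbours) and the helper names `knn_density` and `delta_to_denser` are assumptions made for this example. The turning-position search for the cut-off distance and the similarity comparison between candidate centres are not reproduced here.

```python
import numpy as np

def knn_density(points, k):
    """Assumed density estimate: inverse mean distance to the k nearest neighbours."""
    # Pairwise Euclidean distance matrix, shape (n, n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # After sorting each row, column 0 is the zero self-distance, so skip it.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    rho = 1.0 / (knn.mean(axis=1) + 1e-12)
    return rho, dist

def delta_to_denser(rho, dist):
    """Distance from each point to its nearest point of strictly higher density."""
    n = len(rho)
    delta = np.empty(n)
    for i in range(n):
        denser = rho > rho[i]
        if denser.any():
            delta[i] = dist[i, denser].min()
        else:
            # Global density maximum: use its largest distance to any point.
            delta[i] = dist[i].max()
    return delta

# Two small, well-separated groups of points (illustrative data only).
points = np.array([
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.4, 0.4],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [10.4, 10.4],
])
rho, dist = knn_density(points, k=2)
delta = delta_to_denser(rho, dist)

# Points with both high density and large delta are candidate cluster centres.
candidates = np.argsort(rho * delta)[-2:]
print(sorted(candidates.tolist()))
```

In this sketch the candidate centres are ranked by the product of density and delta; the abstract instead describes comparing the similarity between candidate points before fixing the final centres.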
Pages: 536-545
Page count: 10