A novel clustering algorithm based on PageRank and minimax similarity

Cited by: 0
Authors
Qidong Liu
Ruisheng Zhang
Xin Liu
Yunyun Liu
Zhili Zhao
Rongjing Hu
Affiliations
[1] Lanzhou University, School of Information Science and Engineering
[2] Lanzhou University, School of Higher Education
Source
Neural Computing and Applications | 2019 / Vol. 31
Keywords
Cluster analysis; Density-based clustering; Minimax similarity; PageRank
DOI
Not available
Abstract
Clustering by fast search and find of density peaks (herein called FDPC), a recently proposed density-based clustering algorithm, has attracted the attention of many researchers because it can recognize arbitrarily shaped clusters. In addition, FDPC needs only one parameter, $d_c$, and identifies the number of clusters from a decision graph. Nevertheless, it is not clear how to find a proper $d_c$ for a given data set, and such a perfect parameter may not exist in practice for multi-scale data sets. In this paper, we propose a modified PageRank algorithm to compute the local density of each data point, which is more robust than the Gaussian kernel and cutoff methods. Moreover, FDPC yields poor results on randomly distributed data sets, since there may be several density maxima within one cluster. To solve this problem, we propose an improved minimax similarity method. Comparing our approach with FDPC on several artificial and real-life data sets, the experimental results indicate that our approach outperforms FDPC in terms of accuracy.
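The abstract gives no formulas, so the following is only a minimal sketch of the two standard building blocks it refers to: the baseline FDPC quantities (cutoff-kernel local density and the distance to the nearest point of higher density, which together form the decision graph) and the classic minimax path distance. It does not reproduce the paper's modified PageRank density or its improved minimax similarity. The function names (`pairwise_dist`, `fdpc_density_and_delta`, `minimax_distance`) are hypothetical and chosen for illustration, and `d_c` is assumed to be positive.

```python
import numpy as np


def pairwise_dist(points):
    """Euclidean distance matrix of shape (n, n) for an (n, d) array."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def fdpc_density_and_delta(dist, d_c):
    """Baseline FDPC quantities with the cutoff kernel (assumes d_c > 0).

    rho[i]   = number of other points closer than d_c to point i
    delta[i] = distance to the nearest point with strictly higher rho,
               or the largest distance from i if no such point exists
    """
    dist = np.asarray(dist, dtype=float)
    n = len(dist)
    # Each point is at distance 0 from itself, so subtract 1.
    rho = (dist < d_c).sum(axis=1) - 1
    delta = np.empty(n)
    for i in range(n):
        higher = rho > rho[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho, delta


def minimax_distance(dist):
    """Classic minimax path distance (not the paper's improved variant).

    The cost of a path is its largest edge; the minimax distance between
    two points is the smallest such cost over all paths. Computed with a
    Floyd-Warshall style update using (min, max) instead of (min, +).
    """
    m = np.array(dist, dtype=float)
    for k in range(len(m)):
        via_k = np.maximum(m[:, k][:, None], m[k, :][None, :])
        m = np.minimum(m, via_k)
    return m


# Four points on a line: a tight group {0, 1, 2} and an outlier at 10.
dist = pairwise_dist([[0.0], [1.0], [2.0], [10.0]])
rho, delta = fdpc_density_and_delta(dist, d_c=1.5)
mm = minimax_distance(dist)
```

In the decision graph, candidate cluster centers are points where both `rho` and `delta` are large. The minimax distance stays small between points linked by a chain of short hops, which is why it is a natural fit when one cluster contains several density maxima.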
Pages: 7769-7780
Page count: 11