A hybrid clustering approach for link prediction in heterogeneous information networks

Cited by: 0
Authors
Zahra Sadat Sajjadi
Mahdi Esmaeili
Mostafa Ghobaei-Arani
Behrouz Minaei-Bidgoli
Affiliations
[1] Qom Branch, Department of Computer Engineering
[2] Islamic Azad University, Department of Computer Engineering
[3] Kashan Branch, School of Computer Engineering
[4] Islamic Azad University
[5] Iran University of Science and Technology
Source
Knowledge and Information Systems | 2023 / Vol. 65
Keywords
Social network; Graph clustering; Structural similarity; Attribute similarity; Hybrid similarity; K-Medoids;
DOI
Not available
Abstract
In recent years, researchers in academia and industry have become increasingly interested in extracting meaningful information from social network data. This information is used in applications such as link prediction between groups of people, community detection, and protein module identification. Clustering has therefore emerged as a way to find similarities between social network members. Most recent graph clustering solutions combine the structural similarity of nodes with their attribute similarity. The results of these solutions indicate that the graph's topological structure is more important. Since most social networks are sparse, these solutions often suffer from insufficient use of node features. This paper proposes a hybrid clustering approach as an application for link prediction in heterogeneous information networks (HINs). In our approach, an adjacency vector is built for each node, in which each entry holds the weight of the direct edge or the weight of the shortest communication path between that node and every other node. A similarity metric is presented that calculates similarity using the direct edge weight between two nodes and the correlation between their adjacency vectors. Finally, we evaluated the effectiveness of our proposed method on the DBLP, Political blogs, and Citeseer datasets using entropy, density, purity, and execution time as metrics. The simulation results demonstrate that the proposed method, while maintaining cluster density, significantly reduces entropy and execution time compared with the other methods.
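The abstract describes the similarity metric only in words, so the following is a minimal sketch of one possible reading and not the authors' implementation. It assumes edge weights are strengths in (0, 1], that the weight of an indirect path is the largest product of edge weights along any path, that a node's entry for itself is 1.0, that the correlation is Pearson's, and that the two parts are mixed linearly by a hypothetical parameter `alpha`. The function names are illustrative as well.

```python
from math import sqrt


def adjacency_vectors(n, edges):
    """Return an n x n matrix whose row i is the adjacency vector of node i.

    Entry (i, j) is the direct edge weight when the edge exists. Otherwise it
    is the strongest path weight, taken here as the maximum product of edge
    weights over any path (an assumption; weights must lie in (0, 1]).
    """
    best = [[0.0] * n for _ in range(n)]
    for i in range(n):
        best[i][i] = 1.0  # assumed self-entry
    for (u, v), w in edges.items():
        best[u][v] = best[v][u] = w
    # Floyd-Warshall variant: maximise a product instead of minimising a sum.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = best[i][k] * best[k][j]
                if via_k > best[i][j]:
                    best[i][j] = via_k
    # A direct edge always keeps its own weight, even if a path is stronger.
    for (u, v), w in edges.items():
        best[u][v] = best[v][u] = w
    return best


def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y) if std_x and std_y else 0.0


def hybrid_similarity(i, j, edges, vectors, alpha=0.5):
    """Mix the direct edge weight with the adjacency-vector correlation."""
    direct = edges.get((i, j), edges.get((j, i), 0.0))
    return alpha * direct + (1 - alpha) * pearson(vectors[i], vectors[j])


# Toy undirected path graph 0 - 1 - 2 - 3 with illustrative weights.
toy_edges = {(0, 1): 0.9, (1, 2): 0.8, (2, 3): 0.5}
toy_vectors = adjacency_vectors(4, toy_edges)
print(hybrid_similarity(0, 1, toy_edges, toy_vectors))
print(hybrid_similarity(0, 3, toy_edges, toy_vectors))
```

In this toy graph, nodes 0 and 1 share a strong direct edge and similar adjacency vectors, so they score higher than the distant pair 0 and 3. A pairwise similarity of this kind could then be turned into a dissimilarity and passed to a K-Medoids routine, which the keywords name as the clustering step.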
Pages: 4905-4937
Page count: 32