GSimRank: A General Similarity Measure on Heterogeneous Information Network

Cited by: 4
Authors
Zhang, Chuanyan [1]
Hong, Xiaoguang [1]
Peng, Zhaohui [2]
Affiliations
[1] Shandong Univ, Jinan 250101, Peoples R China
[2] Shandong Univ, Qingdao 266237, Peoples R China
Source
WEB AND BIG DATA, PT I, APWEB-WAIM 2020 | 2020 / Vol. 12317
Keywords
Similarity measure; Heterogeneous information network; Semantic relation; Entropy; COMPUTATION;
DOI
10.1007/978-3-030-60259-8_43
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Measuring the similarity of objects in an information network is a fundamental problem that has attracted many studies because of its wide range of applications, such as recommendation and information retrieval. With the advent of large-scale heterogeneous information networks (HINs) that consist of multiple types of relationships, it is important to study similarity measures for such networks. However, most existing similarity measures are defined for homogeneous networks and cannot be directly applied to HINs, since the different semantic meanings behind edges must be considered. This paper proposes GSimRank, an extension of the well-known SimRank, to compute similarity on HINs. Rather than summing over all meeting paths for two nodes as in SimRank, GSimRank selects linked nodes of the same semantic category as the next step in the pairwise random walk, which ensures that the two meeting paths share the same semantics. Further, to weight the semantic edges, we propose a domain-independent edge weight evaluation method based on entropy theory. Finally, we prove that GSimRank is still based on the expected meeting distance model and provide experiments on two real-world datasets showing the performance of GSimRank.
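The core idea in the abstract, restricting the pairwise random walk to neighbors reached through edges of the same semantic category, can be illustrated with a small sketch. This is not the paper's algorithm: the function name `typed_simrank`, the `(neighbor, edge_type)` input format, the decay value, and the normalization by the number of same-type neighbor pairs are all assumptions made for illustration, and the entropy-based edge weights are not reproduced (all edge types are weighted equally here).

```python
from collections import defaultdict
from itertools import product


def typed_simrank(nodes, in_edges, decay=0.8, iterations=10):
    """SimRank-style iteration that only pairs in-neighbors of the same edge type.

    in_edges maps a node to a list of (neighbor, edge_type) tuples.
    Hypothetical sketch: uniform weights, not the paper's entropy weighting.
    """
    # Group each node's in-neighbors by the type of the connecting edge.
    by_type = {v: defaultdict(list) for v in nodes}
    for v in nodes:
        for u, t in in_edges.get(v, []):
            by_type[v][t].append(u)

    # Every node is fully similar to itself and, initially, to nothing else.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}

    for _ in range(iterations):
        new = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new[(a, b)] = 1.0
                continue
            # Only edge types present on both sides can contribute.
            shared = set(by_type[a]) & set(by_type[b])
            total, pairs = 0.0, 0
            for t in shared:
                for u, w in product(by_type[a][t], by_type[b][t]):
                    total += sim[(u, w)]
                    pairs += 1
            new[(a, b)] = decay * total / pairs if pairs else 0.0
        sim = new
    return sim


# Toy network: two papers share an author; only p1 has a venue edge.
nodes = ["p1", "p2", "a1", "v1"]
in_edges = {
    "p1": [("a1", "written_by"), ("v1", "published_in")],
    "p2": [("a1", "written_by")],
}
sim = typed_simrank(nodes, in_edges)
print(sim[("p1", "p2")])
```

In this toy network the author-venue pair is never compared, because the two edges carry different types, so the shared author alone determines the similarity of `p1` and `p2`. Plain SimRank would also average in the unrelated author-venue pair and yield a lower score.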
Pages: 588-602
Page count: 15