HeteSim: A General Framework for Relevance Measure in Heterogeneous Networks

被引:236
作者
Shi, Chuan [1 ]
Kong, Xiangnan [2 ]
Huang, Yue [1 ]
Yu, Philip S. [2 ]
Wu, Bin [1 ]
机构
[1] Beijing Univ Posts & Telecommun, Beijing 100088, Peoples R China
[2] Univ Illinois, Dept Comp Sci, Chicago, IL USA
基金
美国国家科学基金会; 中国国家自然科学基金;
关键词
Heterogeneous information network; similarity search; pair-wise random walk; relevance measure; INFORMATION NETWORKS; RANDOM-WALK; GRAPH;
D O I
10.1109/TKDE.2013.2297920
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Similarity search is an important function in many applications, which usually focuses on measuring the similarity between objects with the same type. However, in many scenarios, we need to measure the relatedness between objects with different types. With the surge of study on heterogeneous networks, the relevance measure on objects with different types becomes increasingly important. In this paper, we study the relevance search problem in heterogeneous networks, where the task is to measure the relatedness of heterogeneous objects (including objects with the same type or different types). A novel measure HeteSim is proposed, which has the following attributes: (1) a uniform measure: it can measure the relatedness of objects with the same or different types in a uniform framework; (2) a path-constrained measure: the relatedness of object pairs are defined based on the search path that connects two objects through following a sequence of node types; (3) a semi-metric measure: HeteSim has some good properties (e.g., self-maximum and symmetric), which are crucial to many data mining tasks. Moreover, we analyze the computation characteristics of HeteSim and propose the corresponding quick computation strategies. Empirical studies show that HeteSim can effectively and efficiently evaluate the relatedness of heterogeneous objects.
引用
收藏
页码:2479 / 2492
页数:14
相关论文
共 29 条
[1]  
[Anonymous], 2002, P 8 ACM SIGKDD INT C
[2]  
[Anonymous], 2011, SIGKDD
[3]  
[Anonymous], 1998, P 7 INT WORLD WID WE
[4]  
[Anonymous], 2004, P 2004 VLDB C, DOI DOI 10.1016/B978-012088469-8.50074-7
[5]  
[Anonymous], P INT C BIOINF COMP
[6]  
[Anonymous], 2010, SIGKDD
[7]  
[Anonymous], 2012, Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, DOI DOI 10.1145/2339530.2339738
[8]  
Balmin A., 2004, VLDB
[9]   Towards Scaling Fully Personalized PageRank: Algorithms, Lower Bounds, and Experiments [J].
Fogaras, Daniel ;
Racz, Balazs ;
Csalogany, Karoly ;
Sarlos, Tamas .
INTERNET MATHEMATICS, 2005, 2 (03) :333-358
[10]   Random-walk computation of similarities between nodes of a graph with application to collaborative recommendation [J].
Fouss, Francois ;
Pirotte, Alain ;
Renders, Jean-Michel ;
Saerens, Marco .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2007, 19 (03) :355-369