Efficient and Effective Similarity Search over Bipartite Graphs

被引:5
作者
Yang, Renchi [1 ]
机构
[1] Natl Univ Singapore, Singapore, Singapore
来源
PROCEEDINGS OF THE ACM WEB CONFERENCE 2022 (WWW'22) | 2022年
关键词
Bipartite Graphs; Similarity Search; Approximate Algorithms; PERSONALIZED PAGERANK QUERIES; RANDOM-WALK; COMPUTATION; ALGORITHMS;
D O I
10.1145/3485447.3511959
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Similarity search over a bipartite graph aims to retrieve from the graph the nodes that are similar to each other, which finds applications in various fields such as online advertising, recommender systems etc. Existing similarity measures either (i) overlook the unique properties of bipartite graphs, or (ii) fail to capture highorder information between nodes accurately, leading to suboptimal result quality. Recently, Hidden Personalized PageRank (HPP) is applied to this problem and found to be more effective compared with prior similarity measures. However, existing solutions for HPP computation incur significant computational costs, rendering it inefficient especially on large graphs. In this paper, we first identify an inherent drawback of HPP and overcome it by proposing bidirectional HPP (BHPP). Then, we formulate similarity search over bipartite graphs as the problem of approximate BHPP computation, and present an efficient solution Approx-BHPP. Specifically, Approx-BHPP offers rigorous theoretical accuracy guarantees with optimal computational complexity by combining deterministic graph traversal with matrix operations in an optimized and non-trivial way. Moreover, our solution achieves significant gain in practical efficiency due to several carefully-designed optimizations. Extensive experiments, comparing BHPP against 8 existing similarity measures over 7 real bipartite graphs, demonstrate the effectiveness of BHPP on query rewriting and item recommendation. Moreover, Approx-BHPP outperforms baseline solutions often by up to orders of magnitude in terms of computational time on both small and large datasets.
引用
收藏
页码:308 / 318
页数:11
相关论文
共 86 条
[1]   Friends and neighbors on the Web [J].
Adamic, LA ;
Adar, E .
SOCIAL NETWORKS, 2003, 25 (03) :211-230
[2]  
Anastasakos T., 2009, P 18 ACM C INF KNOWL, V1, P1927, DOI [10.1145/1645953.1646267, DOI 10.1145/1645953.1646267]
[3]  
Andersen R, 2006, ANN IEEE SYMP FOUND, P475
[4]   Local Computation of PageRank Contributions [J].
Andersen, Reid ;
Borgs, Christian ;
Chayes, Jennifer ;
Hopcroft, John ;
Mirrokni, Vahab ;
Teng, Shang-Hua .
INTERNET MATHEMATICS, 2008, 5 (1-2) :23-45
[5]  
[Anonymous], 2008, P CIKM
[6]  
[Anonymous], 2015, Avito Context Ad Clicks
[7]  
[Anonymous], 2003, MovieLens 1M Dataset
[8]  
[Anonymous], 2012, KDD Cup 2012, Track 2
[9]  
[Anonymous], 2006, InfoScale
[10]  
[Anonymous], 2008, P 14 ACM SIGKDD INT, DOI DOI 10.1145/1401890.1401944