TEA: A General-Purpose Temporal Graph Random Walk Engine

被引:6
作者
Huan, Chengying [1 ,2 ]
Song, Shuaiwen Leon [3 ]
Pandey, Santosh [4 ]
Liu, Hang [4 ]
Liu, Yongchao [5 ]
Lepers, Baptiste [6 ]
He, Changhua [5 ]
Chen, Kang [1 ]
Jiang, Jinlei [1 ]
Wu, Yongwei [1 ]
机构
[1] Tsinghua Univ, Beijing, Peoples R China
[2] Baihai Technol Inc, Guangzhou, Peoples R China
[3] Univ Sydney, Sydney, NSW, Australia
[4] Stevens Inst Technol, Hoboken, NJ USA
[5] Ant Grp, Hangzhou, Peoples R China
[6] Univ Neuchatel, Neuchatel, Switzerland
来源
PROCEEDINGS OF THE EIGHTEENTH EUROPEAN CONFERENCE ON COMPUTER SYSTEMS, EUROSYS 2023 | 2023年
基金
美国国家科学基金会;
关键词
Random walk; Graph algorithm; Temporal graph;
D O I
10.1145/3552326.3567491
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Many real-world graphs are temporal in nature, where the temporal information indicates when a particular edge is changed (e.g., edge insertion and deletion). Performing random walks on such temporal graphs is of paramount value. The state-of-the-art sampling strategies are tailored for conventional static graphs and thus cannot effectively tackle the dynamic nature of temporal graphs due to several significant efficiency challenges, i.e., high sampling complexity, gigantic index space, and poor programmability. In this paper, we present TEA, the first highly-efficient general-purpose TEmporal grAph random walk engine. At its core, TEA introduces a new hybrid sampling approach that combines two Monte Carlo sampling methods together to drastically reduce space complexity and achieve high sampling speed. TEA further employs a series of algorithmic and system-level optimizations to remarkably improve the sampling efficiency, as well as provide streaming graph support. Finally, we introduce a temporal-centric programming model to ease the implementation of various random walk algorithms on temporal graphs. Experimental results demonstrate that TEA can achieve up to 3 orders of magnitude speedups over the state-of-the-art random walk engines on large temporal graphs.
引用
收藏
页码:182 / 198
页数:17
相关论文
共 54 条
[1]  
Ai LY, 2017, 2017 USENIX ANNUAL TECHNICAL CONFERENCE (USENIX ATC '17), P125
[2]  
[Anonymous], 2010, Inverse Transform Sampling
[3]  
Lee JB, 2020, Arxiv, DOI arXiv:1904.06449
[4]  
Cheng Raymond., 2012, EuroSys, P85
[5]   Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks [J].
Chiang, Wei-Lin ;
Liu, Xuanqing ;
Si, Si ;
Li, Yang ;
Bengio, Samy ;
Hsieh, Cho-Jui .
KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019, :257-266
[6]  
Cochez M., 2017, P 7 INT C WEB INT MI, P1, DOI DOI 10.1145/3102254.3102279
[7]   A Survey on Network Embedding [J].
Cui, Peng ;
Wang, Xiao ;
Pei, Jian ;
Zhu, Wenwu .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (05) :833-852
[8]   metapath2vec: Scalable Representation Learning for Heterogeneous Networks [J].
Dong, Yuxiao ;
Chawla, Nitesh V. ;
Swami, Ananthram .
KDD'17: PROCEEDINGS OF THE 23RD ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2017, :135-144
[9]   HIN2Vec: Explore Meta-paths in Heterogeneous Information Networks for Representation Learning [J].
Fu, Tao-yang ;
Lee, Wang-Chien ;
Lei, Zhen .
CIKM'17: PROCEEDINGS OF THE 2017 ACM CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, 2017, :1797-1806
[10]   node2vec: Scalable Feature Learning for Networks [J].
Grover, Aditya ;
Leskovec, Jure .
KDD'16: PROCEEDINGS OF THE 22ND ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2016, :855-864