Scalable Motif Counting for Large-scale Temporal Graphs

被引:12
作者
Gao, Zhongqiang [1 ]
Cheng, Chuanqi [1 ]
Yu, Yanwei [1 ]
Cao, Lei [2 ]
Huang, Chao [3 ]
Dong, Junyu [1 ]
机构
[1] Ocean Univ China, Coll Comp Sci & Technol, Qingdao, Peoples R China
[2] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA USA
[3] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
来源
2022 IEEE 38TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2022) | 2022年
基金
中国国家自然科学基金;
关键词
D O I
10.1109/ICDE53745.2022.00244
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
One fundamental problem in temporal graph analysis is to count the occurrences of small connected subgraph patterns (i.e., motifs), which benefits a broad range of real-world applications, such as anomaly detection, structure prediction, and network representation learning. However, existing works focused on exacting temporal motif are not scalable to large-scale temporal graph data, due to their heavy computational costs or inherent inadequacy of parallelism. In this work, we propose a scalable parallel framework for exactly counting temporal motifs in large-scale temporal graphs. We first categorize the temporal motifs based on their distinct properties, and then design customized algorithms that offer efficient strategies to exactly count the motif instances of each category. Moreover, our compact data structures, namely triple and quadruple counters, enable our algorithms to directly identify the temporal motif instances of each category, according to edge information and relationship between edges, therefore significantly improving the counting efficiency. Based on the proposed counting algorithms, we design a hierarchical parallel framework that featuring both inter- and intra-node parallel strategies, and fully leverages the multi-threading capacity of modern CPU to concurrently count all temporal motifs. Extensive experiments on sixteen real-world temporal graph datasets demonstrate the superiority and capability of our proposed framework for temporal motif counting, achieving up to 538x speedup compared to the stateof-the-art methods. The source code of our method is available at: https://github.com/steven- ccq/FAST- temporal-motif.
引用
收藏
页码:2656 / 2668
页数:13
相关论文
共 39 条
[1]  
Ahmed N., 2020, IEEE T KNOWLEDGE DAT
[2]  
Ahmed N. K., 2015, 2015 IEEE ICDM
[3]   Graph Sample and Hold: A Framework for Big-Graph Analytics [J].
Ahmed, Nesreen K. ;
Duffield, Nick ;
Neville, Jennifer ;
Kompella, Ramana .
PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14), 2014, :1446-1455
[4]   Graph based anomaly detection and description: a survey [J].
Akoglu, Leman ;
Tong, Hanghang ;
Koutra, Danai .
DATA MINING AND KNOWLEDGE DISCOVERY, 2015, 29 (03) :626-688
[5]  
[Anonymous], 2013, COUNTING SAMPLING TR
[6]  
[Anonymous], 2009, INT C KNOWL DISCOV D
[7]   Higher-order organization of complex networks [J].
Benson, Austin R. ;
Gleich, David F. ;
Leskovec, Jure .
SCIENCE, 2016, 353 (6295) :163-166
[8]  
Benson Austin R, 2015, Proc SIAM Int Conf Data Min, V2015, P118
[9]   How to Count Triangles, without Seeing the Whole Graph [J].
Bera, Suman K. ;
Seshadhri, C. .
KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2020, :306-316
[10]   A survey of community search over big graphs [J].
Fang, Yixiang ;
Huang, Xin ;
Qin, Lu ;
Zhang, Ying ;
Zhang, Wenjie ;
Cheng, Reynold ;
Lin, Xuemin .
VLDB JOURNAL, 2020, 29 (01) :353-392