Learning-Based Dichotomy Graph Sketch for Summarizing Graph Streams with High Accuracy

被引:1
作者
Li, Ding [1 ]
Li, Wenzhong [1 ]
Chen, Yizhou [1 ]
Zhong, Xu [1 ]
Lin, Mingkai [1 ]
Lu, Sanglu [1 ]
机构
[1] Nanjing Univ, State Key Lab Novel Software Technol, Nanjing 210023, Peoples R China
来源
KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT II, KSEM 2023 | 2023年 / 14118卷
基金
中国国家自然科学基金;
关键词
Sketch; Graph sketch; Deep learning; Graph stream;
D O I
10.1007/978-3-031-40286-9_5
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Graph stream data is widely applied to describe the relationships in networks such as social networks, computer networks and hyperlink networks. Due to the large volume and high dynamicity of graph streams, several graph sketches were proposed to summarize them for fast queries. However, the existing graph sketches suffer from low performance on graph query tasks due to hash collisions between heavy and light edges. In this paper, we propose a novel learning-based Dichotomy Graph Sketch (DGS) mechanism, which adopts two separate graph sketches, a heavy sketch and a light sketch, to store heavy edges and light edges respectively. DGS periodically obtains heavy edges and light edges in a session of a graph stream, and use them as training samples to train a deep neural network (DNN) based binary classifier. The DNN-based classifier is then utilized to decide whether the upcoming edges are heavy or not, and store them in different graph sketches accordingly. With the learnable classifier and the dichotomy graph sketches, the proposed mechanism can resolve the hashing collision problem and significantly improve the accuracy for graph query tasks. We conducted extensive experiments on three real-world graph stream datasets, which show that DGS outperforms the state-of-the-art graph sketches in a variety of graph query tasks.
引用
收藏
页码:47 / 59
页数:13
相关论文
共 16 条
[1]   Finding frequent items in data streams [J].
Charikar, M ;
Chen, K ;
Farach-Colton, M .
THEORETICAL COMPUTER SCIENCE, 2004, 312 (01) :3-15
[2]   An improved data stream summary: the count-min sketch and its applications [J].
Cormode, G ;
Muthukrishnan, S .
JOURNAL OF ALGORITHMS-COGNITION INFORMATICS AND LOGIC, 2005, 55 (01) :58-75
[3]   New directions in traffic measurement and accounting: Focusing on the elephants, ignoring the mice [J].
Estan, C ;
Varghese, G .
ACM TRANSACTIONS ON COMPUTER SYSTEMS, 2003, 21 (03) :270-313
[4]   Graph Stream Sketch: Summarizing Graph Streams With High Speed and Accuracy [J].
Gou, Xiangyang ;
Zou, Lei ;
Zhao, Chenxingyu ;
Yang, Tong .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2023, 35 (06) :5901-5914
[5]   Fast and Accurate Graph Stream Summarization [J].
Gou, Xiangyang ;
Zou, Lei ;
Zhao, Chenxingyu ;
Yang, Tong .
2019 IEEE 35TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2019), 2019, :1118-1129
[6]   Graph Summarization with Controlled Utility Loss [J].
Hajiabadi, Mandi ;
Singh, Jasbir ;
Srinivasan, Venkatesh ;
Thomo, Alex .
KDD '21: PROCEEDINGS OF THE 27TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING, 2021, :536-546
[7]   Cumulated gain-based evaluation of IR techniques [J].
Järvelin, K ;
Kekäläinen, J .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2002, 20 (04) :422-446
[8]  
Khan A, 2016, PROCEEDINGS OF THE 2016 IEEE/ACM INTERNATIONAL CONFERENCE ON ADVANCES IN SOCIAL NETWORKS ANALYSIS AND MINING ASONAM 2016, P130, DOI 10.1109/ASONAM.2016.7752224
[9]  
Kipf T. N., 2016, INT C LEARN REPR, DOI DOI 10.48550/ARXIV.1609.02907
[10]   Learning-Based Dynamic Graph Stream Sketch [J].
Li, Ding ;
Li, Wenzhong ;
Chen, Yizhou ;
Lin, Mingkai ;
Lu, Sanglu .
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT I, 2021, 12712 :383-394