Supports estimation via graph sampling

被引:3
作者
Wang, Xin [1 ]
Shi, Jun-Hao [1 ]
Zou, Jie-Jun [1 ]
Shen, Ling-Zhen [1 ]
Lan, Zhuo [2 ]
Fang, Yu [1 ]
Xie, Wen -Bo [1 ]
机构
[1] Southwest Petr Univ, Sch Comp Sci, Chengdu 610500, Peoples R China
[2] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Peoples R China
基金
中国国家自然科学基金;
关键词
Supports estimation; Graph sampling; Frequent pattern mining; Motifs; FREQUENT SUBGRAPH; NETWORKS;
D O I
10.1016/j.eswa.2023.122554
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Frequent pattern mining (FPM), whose goal is to identify patterns with appearance frequencies above a specified support threshold on a large graph, has attracted increasing attention owing to its diverse applications. The task suffers from an issue of support setting, as an inappropriate support threshold may lead to either extremely high computational cost along with a huge pattern set or very few patterns caused by excessive pruning. Thus, supports estimation is crucial for FPM, since an appropriate support threshold enables users to discover a limited number of patterns within affordable computational resources. In light of this, we investigate the problem of supports estimation under the constraint of pattern number B and propose a comprehensive approach to the issue. Our approach leverages a neural network to predict the support 0B that corresponds to the B-th pattern on a graph G. To train the prediction model, we first present a sampling algorithm that leverages typical motifs to produce sampled graphs of G under different sampling ratios. Owing to the typical structures brought about by motifs, the sampled graphs are embedded supports of the top -ranked patterns in proportion to the sampling ratios. We next develop techniques to efficiently infer the support 0B of the B-th pattern on each sampled graph and introduce a method to encode (sampled) graphs with 0B and other typical features, thereby forming the training data. Extensive experimental studies on real -life and synthetic graphs demonstrate the superiority of our approach, compared to other baselines.
引用
收藏
页数:13
相关论文
共 48 条
[1]   Incremental Frequent Subgraph Mining on Large Evolving Graphs [J].
Abdelhamid, Ehab ;
Canim, Mustafa ;
Sadoghi, Mohammad ;
Bhattacharjee, Bishwaranjan ;
Chang, Yuan-Chi ;
Kalnis, Panos .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2017, 29 (12) :2710-2723
[2]  
Abdelhamid E, 2016, SC '16: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, P716, DOI 10.1109/SC.2016.60
[3]   Network Sampling: From Static to Streaming Graphs [J].
Ahmed, Nesreen K. ;
Neville, Jennifer ;
Kompella, Ramana .
ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2014, 8 (02)
[4]  
Bringmann B, 2008, LECT NOTES ARTIF INT, V5012, P858, DOI 10.1007/978-3-540-68125-0_84
[5]  
Cheng X, 2008, INT WORKSH QUAL SERV, P249
[6]   A (sub)graph isomorphism algorithm for matching large graphs [J].
Cordella, LP ;
Foggia, P ;
Sansone, C ;
Vento, M .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2004, 26 (10) :1367-1372
[7]   Applications of link prediction in social networks: A review [J].
Daud, Nur Nasuha ;
Hamid, Siti Ha fizah Ab ;
Saadoon, Muntadher ;
Sahran, Firdaus ;
Anuar, Nor Badrul .
JOURNAL OF NETWORK AND COMPUTER APPLICATIONS, 2020, 166
[8]   Mean Absolute Percentage Error for regression models [J].
de Myttenaere, Arnaud ;
Golden, Boris ;
Le Grand, Benedicte ;
Rossi, Fabrice .
NEUROCOMPUTING, 2016, 192 :38-48
[9]   Weighted random sampling with a reservoir [J].
Efraimidis, PS ;
Spirakis, PG .
INFORMATION PROCESSING LETTERS, 2006, 97 (05) :181-185
[10]   GRAMI: Frequent Subgraph and Pattern Mining in a Single Large Graph [J].
Elseidy, Mohammed ;
Abdelhamid, Ehab ;
Skiadopoulos, Spiros ;
Kalnis, Panos .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 7 (07) :517-528