Efficient Mining of Frequent Patterns on Uncertain Graphs

被引:38
作者
Chen, Yifan [1 ]
Zhao, Xiang [1 ,3 ]
Lin, Xuemin [4 ]
Wang, Yang [4 ,5 ]
Guo, Deke [2 ]
机构
[1] Natl Univ Def Technol, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Informat Syst & Management, Changsha 410073, Hunan, Peoples R China
[3] Collaborat Innovat Ctr Geospatial Technol, Wuhan 430079, Hubei, Peoples R China
[4] Univ New South Wales, Sydney, NSW 2052, Australia
[5] Dalian Univ Technol, Dalian 116023, Peoples R China
关键词
Uncertain graphs; frequent subgraphs; approximation algorithms; pattern mining; SUBGRAPH;
D O I
10.1109/TKDE.2018.2830336
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Uncertainty is intrinsic to a wide spectrum of real-life applications, which inevitably applies to graph data. Representative uncertain graphs are seen in bio-informatics, social networks, etc. This paper motivates the problemNof frequent subgraphmining on single uncertain graphs, and investigates two different - probabilistic and expected - semantics in terms of support definitions. First, we present an enumeration-evaluation algorithmto solve the problemunder probabilistic semantics. By showing the support computation under probabilistic semantics is #P-complete, we develop an approximation algorithm-with accuracy guarantee for efficient problem-solving. To enhance the solution, we devise computation sharing techniques to achieve better mining performance. Afterwards, the algorithmis extended in a similar flavor to handle the problemunder expected semantics, where checkpoint-based pruning and validation techniques are integrated. Experiment results on real-life datasets confirm the practical usability of the mining algorithms.
引用
收藏
页码:287 / 300
页数:14
相关论文
共 49 条
[11]   Predicting Protein Function by Frequent Functional Association Pattern Mining in Protein Interaction Networks [J].
Cho, Young-Rae ;
Zhang, Aidong .
IEEE TRANSACTIONS ON INFORMATION TECHNOLOGY IN BIOMEDICINE, 2010, 14 (01) :30-36
[12]  
Chu D., 2006, P 22 INT C DATA ENG, P48, DOI DOI 10.1109/ICDE.2006.21
[13]  
Cormen T. H., 2009, Introduction to algorithms, VThird
[14]  
Dalvi Nilesh, 2004, VLDB, V30, P864
[15]   GRAMI: Frequent Subgraph and Pattern Mining in a Single Large Graph [J].
Elseidy, Mohammed ;
Abdelhamid, Ehab ;
Skiadopoulos, Spiros ;
Kalnis, Panos .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2014, 7 (07) :517-528
[16]  
Fiedler M., 2007, P MIN LEARN GRAPHS
[17]   On a routing problem within Probabilistic graphs and its application to intermittently connected networks [J].
Ghosh, Joy ;
Ngo, Hung Q. ;
Yoon, Seokhoon ;
Qiao, Chunming .
INFOCOM 2007, VOLS 1-5, 2007, :1721-+
[18]   Subgraph similarity maximal all-matching over a large uncertain graph [J].
Gu, Yu ;
Gao, Chunpeng ;
Wang, Lulu ;
Yu, Ge .
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2016, 19 (05) :755-782
[19]   Finding reliable subgraphs from large probabilistic graphs [J].
Hintsanen, Petteri ;
Toivonen, Hannu .
DATA MINING AND KNOWLEDGE DISCOVERY, 2008, 17 (01) :3-23
[20]   Network motif identification in stochastic networks [J].
Jiang, Rui ;
Tu, Zhidong ;
Chen, Ting ;
Sun, Fengzhu .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2006, 103 (25) :9404-9409