Efficient Mining of Frequent Patterns on Uncertain Graphs

被引:38
作者
Chen, Yifan [1 ]
Zhao, Xiang [1 ,3 ]
Lin, Xuemin [4 ]
Wang, Yang [4 ,5 ]
Guo, Deke [2 ]
机构
[1] Natl Univ Def Technol, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Coll Informat Syst & Management, Changsha 410073, Hunan, Peoples R China
[3] Collaborat Innovat Ctr Geospatial Technol, Wuhan 430079, Hubei, Peoples R China
[4] Univ New South Wales, Sydney, NSW 2052, Australia
[5] Dalian Univ Technol, Dalian 116023, Peoples R China
关键词
Uncertain graphs; frequent subgraphs; approximation algorithms; pattern mining; SUBGRAPH;
D O I
10.1109/TKDE.2018.2830336
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Uncertainty is intrinsic to a wide spectrum of real-life applications, which inevitably applies to graph data. Representative uncertain graphs are seen in bio-informatics, social networks, etc. This paper motivates the problemNof frequent subgraphmining on single uncertain graphs, and investigates two different - probabilistic and expected - semantics in terms of support definitions. First, we present an enumeration-evaluation algorithmto solve the problemunder probabilistic semantics. By showing the support computation under probabilistic semantics is #P-complete, we develop an approximation algorithm-with accuracy guarantee for efficient problem-solving. To enhance the solution, we devise computation sharing techniques to achieve better mining performance. Afterwards, the algorithmis extended in a similar flavor to handle the problemunder expected semantics, where checkpoint-based pruning and validation techniques are integrated. Experiment results on real-life datasets confirm the practical usability of the mining algorithms.
引用
收藏
页码:287 / 300
页数:14
相关论文
共 49 条
[1]  
Adar E., 2007, IEEE Data Eng. Bull, V30, P15
[2]   Predicting protein complex membership using probabilistic network reliability [J].
Asthana, S ;
King, OD ;
Gibbons, FD ;
Roth, FP .
GENOME RESEARCH, 2004, 14 (06) :1170-1175
[3]   ExOR: Opportunistic multi-hop routing for wireless networks [J].
Biswas, S ;
Morris, R .
ACM SIGCOMM COMPUTER COMMUNICATION REVIEW, 2005, 35 (04) :133-143
[4]   Injecting Uncertainty in Graphs for Identity Obfuscation [J].
Boldi, Paolo ;
Bonchi, Francesco ;
Gionis, Aristides ;
Tassa, Tamir .
PROCEEDINGS OF THE VLDB ENDOWMENT, 2012, 5 (11) :1376-1387
[5]   Core Decomposition of Uncertain Graphs [J].
Bonchi, Francesco ;
Gullo, Francesco ;
Kaltenbrunner, Andreas ;
Volkovich, Yana .
PROCEEDINGS OF THE 20TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING (KDD'14), 2014, :1316-1325
[6]  
Bringmann B, 2008, LECT NOTES ARTIF INT, V5012, P858, DOI 10.1007/978-3-540-68125-0_84
[7]   gApprox: Mining frequent approximate patterns from a massive network [J].
Chen, Chen ;
Yan, Xifeng ;
Zhu, Feida ;
Han, Jiawei .
ICDM 2007: PROCEEDINGS OF THE SEVENTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, 2007, :445-+
[8]   Towards Frequent Subgraph Mining on Single Large Uncertain Graphs [J].
Chen, Yifan ;
Zhao, Xiang ;
Lin, Xuemin ;
Wang, Yang .
2015 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2015, :41-50
[9]  
Cheng H, 2014, FREQUENT PATTERN MIN, P307, DOI DOI 10.1007/978-3-319-07821-2_13
[10]  
Cheng YR, 2014, LECT NOTES COMPUT SC, V8422, P124, DOI 10.1007/978-3-319-05813-9_9