Continuous Subgraph Pattern Search over Certain and Uncertain Graph Streams

被引:39
作者
Chen, Lei [1 ,2 ]
Wang, Changliang
机构
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China
[2] HeNan Univ, Sch Comp & Informat Engn, Kaifeng, Peoples R China
基金
中国国家自然科学基金;
关键词
Subgraph search; node-neighbor tree; graph streams; uncertain graph streams;
D O I
10.1109/TKDE.2010.67
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Search over graph databases has attracted much attention recently due to its usefulness in many fields, such as the analysis of chemical compounds, intrusion detection in network traffic data, and pattern matching over users' visiting logs. However, most of the existing works focus on search over static graph databases, while in many real applications, graphs are changing over time. In this paper, we investigate a new problem on continuous subgraph pattern search under the situation where multiple target graphs are constantly changing in a stream style, namely, the subgraph pattern search over graph streams. Obviously, the proposed problem is a continuous join between query patterns and graph streams where the join predicate is the existence of subgraph isomorphism. Due to the NP-completeness of subgraph isomorphism checking, to achieve the real-time monitoring of the existence of certain subgraph patterns, we would like to avoid using subgraph isomorphism verification to find the exact query-stream subgraph isomorphic pairs but to offer an approximate answer that could report all probable pairs without missing any actual answer pairs. Therefore, we propose a lightweight yet effective feature structure called Node-Neighbor Tree to filter out false candidate query-stream pairs. To reduce the computational cost, we propose a novel idea, projecting the feature structures into a numerical vector space and conducting dominant relationship checking in the projected space. We design two methods to efficiently verify dominant relationships, and thus, answer the subgraph search over graph streams efficiently. In addition to answering queries over certain graph streams, we propose a novel problem, detecting the appearance of subgraph patterns over uncertain graph streams with high probability (i.e., larger than the probability threshold specified by users). To address this problem, we not only extend the proposed solutions for certain graphs streams, but also propose a new pruning technique by utilizing the probability threshold. We substantiate our methods with extensive experiments on both certain and uncertain graph streams.
引用
收藏
页码:1093 / 1109
页数:17
相关论文
共 46 条
[1]  
ADAR E, 2007, IEEE DATA ENG B, V30, P23
[2]  
AIELLO W., 2002, HDB MASSIVE DATA SET
[3]  
[Anonymous], 1996, GRAPH ISOMORPHISM PR
[4]  
[Anonymous], 2006, P 22 INT C DAT ENG, DOI DOI 10.1109/ICDE.2006.37
[5]  
[Anonymous], P IEEE C COMP VIS PA
[6]  
[Anonymous], P 2002 IEEE INT C DA
[7]  
ASUR S, 2009, P ACM SIGKDD
[8]   SPACE/TIME TRADE/OFFS IN HASH CODING WITH ALLOWABLE ERRORS [J].
BLOOM, BH .
COMMUNICATIONS OF THE ACM, 1970, 13 (07) :422-&
[9]  
CHEN C, 2007, P 33 INT C VER LARG
[10]  
Cheng J., 2007, P ACM SIGMOD