Parallel Subgraph Matching on Massive Graphs

被引：0

作者：

Suo, Bo ^{[1
]}

Li, Zhanhuai ^{[1
]}

Pan, Wei ^{[1
]}

机构：

[1] Northwestern Polytech Univ, Sch Comp Sci & Engn, Xian 710129, Peoples R China

来源：

2016 9TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2016) | 2016年

基金：

中国国家自然科学基金;

关键词：

ALGORITHM;

D O I：

暂无

中图分类号：

R318 [生物医学工程];

学科分类号：

0831 ;

摘要：

While numerous applications, such as social networks, protein-protein interaction networks, and bibliographic networks, mainly consist of graph-structured data, massive graphs, of which the scales range from million nodes to billion nodes, are common-place. Searching within these kinds of graphs is urged to be efficient. Unfortunately, since the subgraph isomorphism problem is NP-complete, querying on large graphs is still challenging. Most of existing approaches employ various pruning rules to facilitate the matching process on a single machine. When a data graph is large and dense, auxiliary information, known as index, and intermediate results could easily run out of computational resources. Recently, inspired by the popularity of parallel programming models, such as MapReduce and Pregel, there is a trend to solve the subgraph matching problem upon them. However, caused by the incompleteness of graph data in each cluster machine, parallel solutions of subgraph matching are often proposed in a brute-force way. In this paper, we propose a parallel subgraph matching framework which uses k-hop replication based partitioning approach to distribute the graph data across cluster machines. Benefiting from the proposed framework, the completeness of local searching can be ensured. Hence, previous studies on indexing graph data become usable and valuable for the parallel graph querying problem. For the consideration of efficiency, taking a light-weight neighborhood-based index as an example, we also propose two potential optimization opportunities for reducing intermediate results. We implement the proposed framework on Hadoop/MapReduce. Our experimental results on real-world data sets demonstrate its effectiveness on very large graphs.

引用

页码：1932 / 1937

页数：6

共 26 条

[1]

[Anonymous], DATABASE SYSTEM IMPL

[2]

Banerjee A, 2005, J MACH LEARN RES, V6, P1705

[3] A branch-and-cut algorithm for the equicut problem [J].

Brunetta, L ;

Conforti, M ;

Rinaldi, G .

MATHEMATICAL PROGRAMMING, 1997, 78 (02) :243-263

[4]

BUI T, 1989, ACM IEEE D, P775, DOI 10.1145/74382.74527

[5] SPECTRAL K-WAY RATIO-CUT PARTITIONING AND CLUSTERING [J].

CHAN, PK ;

SCHLAG, MDF ;

ZIEN, JY .

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 1994, 13 (09) :1088-1096

[6] A (sub)graph isomorphism algorithm for matching large graphs [J].

Cordella, LP ;

Foggia, P ;

Sansone, C ;

Vento, M .

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2004, 26 (10) :1367-1372

[7]

Dean J, 2004, USENIX ASSOCIATION PROCEEDINGS OF THE SIXTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION (OSDE '04), P137

[8]

Fiduccia CM., 1988, Papers on Twentyfive years of Electronic Design Automation, P241

[9]

Giugno R., PATT REC 2002 P 16 I, V2, P112

[10]

He H., DAT ENG 2006 ICDE 06, P38

← 1 2 3 →