Distance-Join: Pattern Match Query In a Large Graph Database

被引:10
作者
Zou, Lei [1 ]
Chen, Lei [2 ]
Oezsu, M. Tamer [3 ]
机构
[1] Huazhong Univ Sci & Technol, Wuhan, Hubei, Peoples R China
[2] Hong Kong Univ Sci & Technol, Hong Kong, Hong Kong, Peoples R China
[3] Univ Waterloo, Waterloo, ON, Canada
来源
PROCEEDINGS OF THE VLDB ENDOWMENT | 2009年 / 2卷 / 01期
基金
中国国家自然科学基金; 加拿大自然科学与工程研究理事会;
关键词
28;
D O I
10.14778/1687627.1687727
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
The growing popularity of graph databases has generated interesting data ianageient probleis, such as subgraph search, shortest-path query, reachability verification, and pattern match. Aiong these, a pattern match query is more flexible compared to a subgraph search and more informative compared to a shortest-path or reachability query. In this paper, we address pattern match probleis over a large data graph G. Specifically, given a pattern graph (i.e., query Q), we want to find all inatches (in G) that have the similar connections as those in Q. In order to reduce the search space significantly, we first transform the vertices into points in a vector space via graph embedding techniques, converting a pattern match query into a distance-based multi-way join problem over the converted vector space. We also propose several pruning strategies and a join order selection method to process join processing efficiently. Extensive experiments on both real and synthetic datasets show that our method outperforms existing ones by orders of magnitude.
引用
收藏
页码:886 / 897
页数:12
相关论文
共 28 条
  • [11] He H., 2006, ICDE
  • [12] iDistance:: An adaptive B+-tree based indexing method for nearest neighbor search
    Jagadish, HV
    Ooi, BC
    Tan, KL
    Yu, C
    Zhang, R
    [J]. ACM TRANSACTIONS ON DATABASE SYSTEMS, 2005, 30 (02): : 364 - 397
  • [13] Jiang P. Y. H., 2007, ICDE
  • [14] Jiefeng C., 2009, EDBT
  • [15] Hierarchical encoded path views for path query processing: An optimal model and its performance evaluation
    Jing, N
    Huang, YW
    Rundensteiner, EA
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 1998, 10 (03) : 409 - 432
  • [16] LINIAL N, 1995, COMBINATORICA, V15
  • [17] A road network embedding technique for K-nearest neighbor search in moving object databases
    Shahabi, C
    Kolahdouzan, MR
    Sharifzadeh, M
    [J]. GEOINFORMATICA, 2003, 7 (03) : 255 - 273
  • [18] Shasha D., 2002, PODS
  • [19] Tian Y., 2007, BIOINFORMATICS, V23
  • [20] TALE: A tool for approximate large graph matching
    Tian, Yuanyuan
    Patel, Jignesh M.
    [J]. 2008 IEEE 24TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, VOLS 1-3, 2008, : 963 - +