A Deeper Dive into Pattern-Aware Subgraph Exploration with PEREGRINE

被引:7
作者
Jamshidi K. [1 ]
Vora K. [1 ]
机构
[1] Simon Fraser University, School of Computing Science, British Columbi
来源
Operating Systems Review (ACM) | 2021年 / 55卷 / 01期
关键词
55;
D O I
10.1145/3469379.3469381
中图分类号
学科分类号
摘要
Graph mining workloads aim to extract structural properties of a graph by exploring its subgraph structures. PEREGRINE is a general-purpose graph mining system that provides a generic runtime to efficiently explore subgraph structures of interest and perform various graph mining analyses. It takes a 'pattern-Aware' approach by incorporating a pattern-based programming model along with efficient pattern matching strategies. The programming model enables easier expression of complex graph mining use cases and enables PEREGRINE to extract the semantics of patterns. By analyzing the patterns, PEREGRINE generates efficient exploration plans which it uses to guide its subgraph exploration. In this paper, we present an in-depth view of the patternanalysis techniques powering the matching engine of PEREGRINE. Beyond the theoretical foundations from prior research, we expose opportunities based on how the exploration plans are evaluated, and develop key techniques for computation reuse, enumeration depth reduction, and branch elimination. Our experiments show the importance of patternawareness for scalable and performant graph mining where the presented new techniques speed up the performance by up to two orders of magnitude on top of the benefits achieved from the prior theoretical foundations that generate the initial exploration plans. © 2021 Copyright is held by the owner/author(s).
引用
收藏
页码:1 / 10
页数:9
相关论文
共 55 条
[21]  
Hong S., Depner S., Manhardt T., Van Der Lugt J., Verstraaten M., Chafi H., PGX.D a fast distributed graph processing engine, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Anal-ysis (SC ?15, pp. 1-12, (2015)
[22]  
Padmanabha Iyer A., Liu Z., Jin X., Venkataraman S., Braverman V., Stoica I., ASAP: Fast, Approximate Graph Pattern Mining at Scale, Proceedings of the USENIX Symposium on Operating Systems Design and Implementation (OSDI ?18, pp. 745-761, (2018)
[23]  
Jamshidi K., Mahadasa R., Vora K., Peregrine: A Pattern-Aware Graph Mining System, Proceedings of the Fifteenth European Conference on Computer Systems, pp. 1-16, (2020)
[24]  
Kim H., Lee J., Bhowmick Wook-Shin Han S.S., Lee J., Ko S., Jarrah M.H.A., DUALSIM Parallel Subgraph Enumeration in a Massive Graph on a Single Machine, Proceedings of the ACM International Conference on Management of Data (SIGMOD ?16, pp. 1231-1245, (2016)
[25]  
Kim J., Han W., Lee S., Park K., Yu H., OPT: A New Framework for Overlapped and Parallel Triangulation in Large-scale Graphs, Proceedings of the ACM International Con-ference on Management of Data (SIGMOD ?14, pp. 637-648, (2014)
[26]  
Kim K., Seo I., Han W., Lee J., Hong S., Chafi H., Shin H., Jeong G., TurboFlux: A Fast Continuous Subgraph Matching System for Streaming Graph Data, Proceedings of the ACM International Conference on Management of Data (SIGMOD ?18, pp. 411-426, (2018)
[27]  
Kuhl F.S., Crippen G.M., Donald K., Friesen.A combinatorial algorithm for calculating ligand binding, Journal of Computational Chemistry, 5, 1, pp. 24-34, (1984)
[28]  
Kumar P., Howie Huang H., GraphOne: A Data Store for Real-Time Analytics on Evolving Graphs, Proceedings of the USENIX Conference on File and Storage Technologies (FAST ?19, pp. 249-263, (2019)
[29]  
Lai L., Qin L., Lin X., Zhang Y., Chang L., Yang S., Scalable Distributed Subgraph Enumeration, Proceedings of the VLDB Endowment (PVLDB ?16, pp. 217-228, (2016)
[30]  
Malewicz G., Austern M.H., Bik C.A.J., Dehnert J.C., Horn I., Leiser N., Czajkowski G., Pregel: A System for Large-Scale Graph Processing, Proceedings of the ACM International Conference on Management of Data (SIGMOD ?10, pp. 135-146, (2010)