A Deeper Dive into Pattern-Aware Subgraph Exploration with PEREGRINE

被引:7
作者
Jamshidi K. [1 ]
Vora K. [1 ]
机构
[1] Simon Fraser University, School of Computing Science, British Columbi
来源
Operating Systems Review (ACM) | 2021年 / 55卷 / 01期
关键词
55;
D O I
10.1145/3469379.3469381
中图分类号
学科分类号
摘要
Graph mining workloads aim to extract structural properties of a graph by exploring its subgraph structures. PEREGRINE is a general-purpose graph mining system that provides a generic runtime to efficiently explore subgraph structures of interest and perform various graph mining analyses. It takes a 'pattern-Aware' approach by incorporating a pattern-based programming model along with efficient pattern matching strategies. The programming model enables easier expression of complex graph mining use cases and enables PEREGRINE to extract the semantics of patterns. By analyzing the patterns, PEREGRINE generates efficient exploration plans which it uses to guide its subgraph exploration. In this paper, we present an in-depth view of the patternanalysis techniques powering the matching engine of PEREGRINE. Beyond the theoretical foundations from prior research, we expose opportunities based on how the exploration plans are evaluated, and develop key techniques for computation reuse, enumeration depth reduction, and branch elimination. Our experiments show the importance of patternawareness for scalable and performant graph mining where the presented new techniques speed up the performance by up to two orders of magnitude on top of the benefits achieved from the prior theoretical foundations that generate the initial exploration plans. © 2021 Copyright is held by the owner/author(s).
引用
收藏
页码:1 / 10
页数:9
相关论文
共 55 条
[1]  
Abdelhamid E., Abdelaziz I., Kalnis P., Khayyat Z., Jamour F., ScaleMine: Scalable Parallel Frequent Subgraph Mining in a Single Large Graph, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC ?16, pp. 611-6112, (2016)
[2]  
Ahmed N.K., Neville J., Rossi R.A., Duffield N., Efficient Graphlet Counting for Large Networks, IEEE International Conference on Data Mining (ICDM ?15, pp. 1-10, (2015)
[3]  
Bearman P.S., Moody J., Stovel K., Chains of Affection: The Structure of Adolescent Romantic and Sexual Networks, American Journal of So-ciology, 110, 1, pp. 44-91, (2004)
[4]  
Berlowitz D., Cohen S., Kimelfeld B., Efficient Enumeration of Maximal k-Plexes, Proceed-ings of the ACM International Conference on Manage-ment of Data (SIGMOD ?15, pp. 431-444, (2015)
[5]  
Bi F., Chang L., Lin X., Qin L., Zhang W., Efficient Subgraph Matching by Postponing Cartesian Products, Proceedings of the ACM Interna-Tional Conference on Management of Data (SIGMOD ?16, pp. 1199-1214, (2016)
[6]  
Bringmann B., Nijssen S., What Is Frequent in a Single Graph?, Advances in Knowledge Discovery and Data Mining: 12th Pacific-Asia Confer-ence, 5012, pp. 858-863, (2008)
[7]  
Chen H., Liu M., Zhao Y., Yan Da Yan X., Cheng J., G-Miner: An Efficient Taskoriented Graph Mining System, Proceedings of the European Conference on Computer Systems (EuroSys ?18, pp. 321-3212, (2018)
[8]  
Cheng X., Dale C., Liu J., Statistics and Social Network of YouTube Videos, Quality of Service 2008 IWQoS 2008. 16th International Workshop on, pp. 229-238
[9]  
Chu W., Tsai M., Visual pattern discovery for architecture image classification and product image search, Proceedings of the ACM International Conference on Multimedia Retrieval (ICMR ?12, pp. 1-8, (2012)
[10]  
Danisch M., Balalau O., Sozio M., Listing K-cliques in Sparse Real-World Graphs, Proceedings of the World Wide Web Conference (WWW?18, pp. 589-598, (2018)