Scaling Graph Clustering with Distributed Sketches

被引:1
|
作者
Priest, Benjamin W. [1 ]
Dunton, Alec [2 ]
Sanders, Geoffrey [1 ]
机构
[1] Lawrence Livermore Natl Lab, Ctr Appl Sci Comp, Livermore, CA 94550 USA
[2] Univ Colorado, Dept Appl Math, Boulder, CO 80309 USA
关键词
streaming algorithms; random matrices; graph clustering;
D O I
10.1109/hpec43674.2020.9286252
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
The unsupervised learning of community structure, in particular the partitioning vertices into clusters or communities, is a canonical and well-studied problem in exploratory graph analysis. However, like most graph analyses the introduction of immense scale presents challenges to traditional methods. Spectral clustering in distributed memory, for example, requires hundreds of expensive bulk-synchronous communication rounds to compute an embedding of vertices to a few eigenvectors of a graph associated matrix. Furthermore, the whole computation might need to be repeated if the underlying graph changes some low percentage of edge updates. We present a method inspired by spectral clustering where we instead use matrix sketches derived from random dimension-reducing projections. We show that our method produces embeddings that yield performant clustering results given a fully-dynamic stochastic block model stream using both the fast Johnson-Lindenstrauss and CountSketch transforms. We also discuss the effects of stochastic block model parameters upon the required dimensionality of the subsequent embeddings, and show how random projections could significantly improve the performance of graph clustering in distributed memory.
引用
收藏
页数:7
相关论文
共 50 条
  • [21] DSCAN: Distributed Structural Graph Clustering for Billion-Edge Graphs
    Shiokawa, Hiroaki
    Takahashi, Tomokatsu
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2020, PT I, 2020, 12391 : 38 - 54
  • [22] SGP: A Parallel Computing Framework for Supporting Distributed Structural Graph Clustering
    Xia, Xiufeng
    Fang, Peng
    An, Yunzhe
    Zhu, Rui
    Zong, Chuanyu
    ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2021, PT III, 2022, 13157 : 722 - 736
  • [23] DACA: Distributed adaptive grid decision graph based clustering algorithm
    He, Jing
    Zhou, Jun
    Wang, Haoyu
    Cai, Li
    SOFTWARE-PRACTICE & EXPERIENCE, 2022, 52 (05): : 1199 - 1215
  • [24] A semi-supervised clustering algorithm based on local scaling graph and label propagation
    Hu, Jiani
    2011 INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT), VOLS 1-4, 2012, : 1059 - 1062
  • [25] D-SAG: A Distributed Sort-Based Algorithm for Graph Clustering
    Saeed, Yaman A.
    Ismail, Raed A.
    Al-Haj Baddar, Sherenaz W.
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2021, 46 (09) : 8665 - 8676
  • [26] D-SAG: A Distributed Sort-Based Algorithm for Graph Clustering
    Yaman A. Saeed
    Raed A. Ismail
    Sherenaz W. Al-Haj Baddar
    Arabian Journal for Science and Engineering, 2021, 46 : 8665 - 8676
  • [27] Graph clustering
    Schaeffer, Satu Elisa
    COMPUTER SCIENCE REVIEW, 2007, 1 (01) : 27 - 64
  • [28] DDA SCALING GRAPH
    LEAKE, RJ
    ALTHAUS, HL
    IEEE TRANSACTIONS ON COMPUTERS, 1968, C 17 (01) : 81 - &
  • [29] Graph Matching-Based Distributed Clustering and Backbone Formation Algorithms for Sensor Networks
    Dagdeviren, Orhan
    Erciyes, Kayhan
    COMPUTER JOURNAL, 2010, 53 (10): : 1553 - 1575
  • [30] An efficient clustering and load balancing of distributed cloud data centers using graph theory
    Devi, R. Kanniga
    Murugaboopathi, G.
    INTERNATIONAL JOURNAL OF COMMUNICATION SYSTEMS, 2019, 32 (05)