Scaling Graph Clustering with Distributed Sketches

被引:1
|
作者
Priest, Benjamin W. [1 ]
Dunton, Alec [2 ]
Sanders, Geoffrey [1 ]
机构
[1] Lawrence Livermore Natl Lab, Ctr Appl Sci Comp, Livermore, CA 94550 USA
[2] Univ Colorado, Dept Appl Math, Boulder, CO 80309 USA
关键词
streaming algorithms; random matrices; graph clustering;
D O I
10.1109/hpec43674.2020.9286252
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
The unsupervised learning of community structure, in particular the partitioning vertices into clusters or communities, is a canonical and well-studied problem in exploratory graph analysis. However, like most graph analyses the introduction of immense scale presents challenges to traditional methods. Spectral clustering in distributed memory, for example, requires hundreds of expensive bulk-synchronous communication rounds to compute an embedding of vertices to a few eigenvectors of a graph associated matrix. Furthermore, the whole computation might need to be repeated if the underlying graph changes some low percentage of edge updates. We present a method inspired by spectral clustering where we instead use matrix sketches derived from random dimension-reducing projections. We show that our method produces embeddings that yield performant clustering results given a fully-dynamic stochastic block model stream using both the fast Johnson-Lindenstrauss and CountSketch transforms. We also discuss the effects of stochastic block model parameters upon the required dimensionality of the subsequent embeddings, and show how random projections could significantly improve the performance of graph clustering in distributed memory.
引用
收藏
页数:7
相关论文
共 50 条
  • [1] COMPRESSIVE GRAPH CLUSTERING FROM RANDOM SKETCHES
    Chi, Yuejie
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5466 - 5469
  • [2] Distributed Graph Clustering and Sparsification
    Sun, He
    Zanetti, Luca
    ACM TRANSACTIONS ON PARALLEL COMPUTING, 2019, 6 (03)
  • [3] Distributed structural clustering on large graph
    Rong, Chuitian
    Zhou, Jinyu
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (22):
  • [4] Distributed Graph Clustering by Load Balancing
    Sun, He
    Zanetti, Luca
    PROCEEDINGS OF THE 29TH ACM SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES (SPAA'17), 2017, : 163 - 171
  • [5] Graph sketches
    Abello, J
    Finocchi, I
    Korn, J
    IEEE SYMPOSIUM ON INFORMATION VISUALIZATION 2001, PROCEEDINGS, 2001, : 67 - 70
  • [6] Scaling and Quality of Modularity Optimization Methods for Graph Clustering
    Ghosh, Sayan
    Halappanavar, Mahantesh
    Tumeo, Antonino
    Kalyanaraman, Ananth
    2019 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2019,
  • [7] Distributed Exact Structural Clustering on Large Graph
    Zhou, Jinyu
    Rong, Chuitian
    Liu, Ding
    Chai, Zhengyi
    2022 IEEE 28TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, ICPADS, 2022, : 778 - 785
  • [8] Towards Graph Clustering for Distributed Computing Environments
    Szufel, Przemyslaw
    MODELLING AND MINING NETWORKS, WAW 2024, 2024, 14671 : 146 - 158
  • [9] Distributed Graph Clustering for Application in Wireless Networks
    Yu, Chia-Hao
    Qin, Shaomeng
    Alava, Mikko
    Tirkkonen, Olav
    SELF-ORGANIZING SYSTEMS, 2011, 6557 : 92 - +
  • [10] A Divide and Conquer Framework for Distributed Graph Clustering
    Yang, Wenzhuo
    Xu, Huan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 37, 2015, 37 : 504 - 513