Scaling and Quality of Modularity Optimization Methods for Graph Clustering

被引:15
|
作者
Ghosh, Sayan [1 ]
Halappanavar, Mahantesh [1 ]
Tumeo, Antonino [1 ]
Kalyanaraman, Ananth [2 ]
机构
[1] Pacific Northwest Natl Lab, Richland, WA 99352 USA
[2] Washington State Univ, Pullman, WA 99164 USA
关键词
D O I
10.1109/hpec.2019.8916299
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
Real-world graphs exhibit structures known as "communities" or "clusters" consisting of a group of vertices with relatively high connectivity between them, as compared to the rest of the vertices in the network. Graph clustering or community detection is a fundamental graph operation used to analyze real-world graphs occurring in the areas of computational biology, cybersecurity, electrical grids, etc. Similar to other graph algorithms, owing to irregular memory accesses and inherently sequential nature, current algorithms for community detection are challenging to parallelize. However, in order to analyze large networks, it is important to develop scalable parallel implementations of graph clustering that are capable of exploiting the architectural features of modern supercomputers. In response to the 2019 Streaming Graph Challenge, we present quality and performance analysis of our distributed-memory community detection using Vite, which is our distributed memory implementation of the popular Louvain method, on the ALCF Theta supercomputer. Clustering methods such as Louvain that rely on modularity maximization are known to suffer from the resolution limit problem, preventing identification of clusters of certain sizes. Hence, we also include quality analysis of our shared-memory implementation of the Fast-tracking Resistance method, in comparison with Louvain on the challenge datasets. Furthermore, we introduce an edge-balanced graph distribution for our distributed memory implementation, that significantly reduces communication, offering up to 80% improvement in the overall execution time. In addition to performance/quality analysis, we also include details on the power/energy consumption, and memory traffic of the distributed-memory clustering implementation using real-world graphs with over a billion edges.
引用
收藏
页数:6
相关论文
共 50 条
  • [41] Defining quality metrics for graph clustering evaluation
    Biswas, Anupam
    Biswas, Bhaskar
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 71 : 1 - 17
  • [42] Coupling Modularity Optimization and Spectral Clustering of Water Supply Network Partition Algorithm
    Yang Z.
    Zhou Y.
    Hu Z.
    Zeng W.
    Zhou Y.
    Li X.
    Feng L.
    Tongji Daxue Xuebao/Journal of Tongji University, 2021, 49 (11): : 1614 - 1620
  • [43] Scaling Multi-Objective Optimization for Clustering Malware
    MacAskill, Noah
    Wilkins, Zachary
    Zincir-Heywood, Nur
    2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021), 2021,
  • [44] Clustering of Japanese stock returns by recursive modularity optimization for efficient portfolio diversification
    Isogai, Takashi
    JOURNAL OF COMPLEX NETWORKS, 2014, 2 (04) : 557 - 584
  • [45] Jargon and Graph Modularity on Twitter
    Dowling, Chase P.
    Corley, Courtney D.
    Farber, Robert M.
    Reynolds, William N.
    2013 IEEE INTERNATIONAL CONFERENCE ON INTELLIGENCE AND SECURITY INFORMATICS: BIG DATA, EMERGENT THREATS, AND DECISION-MAKING IN SECURITY INFORMATICS, 2013, : 381 - 383
  • [46] Graph Modularity and Randomness Measures
    Vergara, Victor M.
    Yu, Qingbao
    Calhoun, Vince D.
    2018 IEEE SOUTHWEST SYMPOSIUM ON IMAGE ANALYSIS AND INTERPRETATION (SSIAI), 2018, : 33 - 36
  • [47] AN ALGEBRAIC ANALYSIS OF THE GRAPH MODULARITY
    Fasino, Dario
    Tudisco, Francesco
    SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2014, 35 (03) : 997 - 1018
  • [48] Clustering via hypergraph modularity
    Kaminski, Bogumil
    Poulin, Valerie
    Pralat, Pawel
    Szufel, Przemyslaw
    Theberge, Francois
    PLOS ONE, 2019, 14 (11):
  • [49] Natural clustering: the modularity approach
    Angelini, L.
    Marinazzo, D.
    Pellicoro, M.
    Stramaglia, S.
    JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2007,
  • [50] Clustering Methods for Agent Distribution Optimization
    Kubalik, Jiri
    Tichy, Pavel
    Sindelar, Radek
    Staron, Raymond J.
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2010, 40 (01): : 78 - 86