Group Centrality Maximization for Large-scale Graphs

Cited by: 0
Authors
Angriman, Eugenio [1]
van der Grinten, Alexander [1]
Bojchevski, Aleksandar [2]
Zuegner, Daniel [2]
Guennemann, Stephan [2]
Meyerhenke, Henning [1]
Affiliations
[1] Humboldt Univ, Dept Comp Sci, Berlin, Germany
[2] Tech Univ Munich, Dept Informat, Munich, Germany
Source
2020 PROCEEDINGS OF THE SYMPOSIUM ON ALGORITHM ENGINEERING AND EXPERIMENTS, ALENEX | 2020
Keywords
Large-scale graph analysis; group centrality measure; greedy approximation; DATABASE;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Subject classification code
081202 ;
Abstract
The study of vertex centrality measures is a key aspect of network analysis. Naturally, such centrality measures have been generalized to groups of vertices; for popular measures it was shown that the problem of finding the most central group is NP-hard. As a result, approximation algorithms to maximize group centralities were introduced recently. Despite a nearly-linear running time, approximation algorithms for group betweenness and (to a lesser extent) group closeness are rather slow on large networks due to high constant overheads. That is why we introduce GED-Walk centrality, a new submodular group centrality measure inspired by Katz centrality. In contrast to closeness and betweenness, it considers walks of any length rather than shortest paths, with shorter walks having a higher contribution. We define algorithms that (i) efficiently approximate the GED-Walk score of a given group and (ii) efficiently approximate the (proved to be NP-hard) problem of finding a group with highest GED-Walk score. Experiments on several real-world datasets show that scores obtained by GED-Walk improve performance on common graph mining tasks such as collective classification and graph-level classification. An evaluation of empirical running times demonstrates that maximizing GED-Walk (in approximation) is two orders of magnitude faster compared to group betweenness approximation and for group sizes <= 100 one to two orders faster than group closeness approximation. For graphs with tens of millions of edges, approximate GED-Walk maximization typically needs less than one minute. Furthermore, our experiments suggest that the maximization algorithms scale linearly with the size of the input graph and the size of the group.
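The idea in the abstract (walks of any length, shorter walks weighted more heavily, and a greedy search for a high-scoring group) can be illustrated with a small sketch. The abstract gives no formula, so the following is an assumption: the score of a group S is taken to be the sum over walk lengths i of alpha^i times the number of walks of length i that touch at least one vertex of S, truncated at a hypothetical `max_len`. The function names, the parameters `alpha` and `max_len`, and the plain greedy loop are illustrative only; this is not the authors' implementation, which the abstract describes as far more efficient.

```python
def count_walks(adj, length, removed=frozenset()):
    """Count walks with exactly `length` edges that avoid all vertices in `removed`."""
    # walks[v] = number of admissible walks of the current length ending at v
    walks = {v: 0 if v in removed else 1 for v in adj}
    for _ in range(length):
        nxt = {v: 0 for v in adj}
        for u, nbrs in adj.items():
            if walks[u] == 0:
                continue
            for v in nbrs:
                if v not in removed:
                    nxt[v] += walks[u]
        walks = nxt
    return sum(walks.values())


def ged_walk_score(adj, group, alpha=0.2, max_len=6):
    """Assumed score: sum_i alpha^i * (#walks of length i touching the group)."""
    group = frozenset(group)
    score = 0.0
    for i in range(1, max_len + 1):
        # Walks touching the group = all walks minus walks avoiding the group.
        crossing = count_walks(adj, i) - count_walks(adj, i, group)
        score += alpha ** i * crossing
    return score


def greedy_group(adj, k, alpha=0.2, max_len=6):
    """Plain greedy: repeatedly add the vertex that yields the highest score."""
    group = []
    for _ in range(k):
        best = max(
            (v for v in adj if v not in group),
            key=lambda v: ged_walk_score(adj, group + [v], alpha, max_len),
        )
        group.append(best)
    return group


# Star graph: vertex 0 is the hub, vertices 1..4 are leaves.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
hub_first = greedy_group(star, 1)
```

On the star graph every walk with at least one edge passes through the hub, so the greedy loop picks vertex 0 first. This naive version recomputes every score from scratch and is only suitable for tiny graphs.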
Pages: 56-69
Page count: 14