Overlapping community detection by constrained personalized PageRank

被引:41
作者
Gao, Yang [1 ]
Yu, Xiangzhan [1 ]
Zhang, Hongli [1 ]
机构
[1] Harbin Inst Technol, Fac Comp, Harbin 150001, Peoples R China
关键词
Local community detection; Constrained personalized PageRank; Random walk; Redundant diffusion; ALGORITHM; LINK;
D O I
10.1016/j.eswa.2021.114682
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Given a network, local community detection (a.k.a. graph clustering) methods aim at finding communities around the selected initial nodes (also referred to as seeds, starting nodes or core nodes). Methods in this kind successfully address the efficiency problem confronted by global clustering methods. And techniques, such as personalized PageRank and heat kernel diffusion, for ranking the proximity score of vertices nearby with respect to the corresponding starting nodes are developed. However, most of the random-walk based metrics allow a walker to diffuse without any constraint, and the walker can easily run into irrelevant communities. As a result, the corresponding community could include irrelevant high-quality communities (communities with good fitness score) nearby, we refer to the case that a walker goes into irrelevant communities and causes inaccurate expansion of a community as redundant diffusion. In this work, we develop a constrained personalized PageRank method for community expansion to reduce the problem of redundant diffusion. In the mechanism, a walker moves with lower probability to neighbor nodes already in the existing communities, and a walker tends to walk out of the community if the walker walks into an irrelevant community. Extensive experiments on synthetic and large real-world networks demonstrate that the proposed method outperforms approaches in the state of the art by a large margin in accuracy and efficiency.
引用
收藏
页数:12
相关论文
共 39 条
[1]   A Separability Framework for Analyzing Community Structure [J].
Abrahao, Bruno ;
Soundarajan, Sucheta ;
Hopcroft, John ;
Kleinberg, Robert .
ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2014, 8 (01) :101-129
[2]   Link communities reveal multiscale complexity in networks [J].
Ahn, Yong-Yeol ;
Bagrow, James P. ;
Lehmann, Sune .
NATURE, 2010, 466 (7307) :761-U11
[3]  
Airoldi EM, 2008, J MACH LEARN RES, V9, P1981
[4]  
Andersen R, 2006, ANN IEEE SYMP FOUND, P475
[5]  
[Anonymous], 2012, P 18 ACM SIGKDD INT, DOI [DOI 10.1145/2339530.2339630, 10.1145/2339530.2339630]
[6]  
[Anonymous], 2021, EXPERT SYST APPL, V173
[7]   Bookmark-Coloring Algorithm for Personalized PageRank Computing [J].
Berkhin, Pavel .
INTERNET MATHEMATICS, 2006, 3 (01) :41-62
[8]   Fast unfolding of communities in large networks [J].
Blondel, Vincent D. ;
Guillaume, Jean-Loup ;
Lambiotte, Renaud ;
Lefebvre, Etienne .
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2008,
[9]   Comparing community structure identification -: art. no. P09008 [J].
Danon, L ;
Díaz-Guilera, A ;
Duch, J ;
Arenas, A .
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2005, :219-228
[10]   A robust two-stage algorithm for local community detection [J].
Ding, Xiaoyu ;
Zhang, Jianpei ;
Yang, Jing .
KNOWLEDGE-BASED SYSTEMS, 2018, 152 :188-199