Efficient Detection of Overlapping Communities Using Asymmetric Triangle Cuts

被引:24
作者
Rezvani, Mojtaba [1 ]
Liang, Weifa [1 ]
Liu, Chengfei [2 ]
Yu, Jeffrey Xu [3 ]
机构
[1] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT 2601, Australia
[2] Swinburne Univ Technol, Dept Comp Sci & Software Engn, Hawthorn, Vic 3122, Australia
[3] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Hong Kong, Peoples R China
基金
澳大利亚研究理事会;
关键词
Overlapping community detection; fitness metrics for overlapping communities; social networks; community detection algorithms;
D O I
10.1109/TKDE.2018.2815554
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Real social networks contain many communities, where members within each community are densely connected with each other, while they are sparsely connected with the members outside of the community. Since each member can join multiple communities simultaneously, communities in social networks are usually overlapping with each other. How to efficiently and effectively identify overlapping communities in a large social network becomes a fundamental problem in the big data era. Most existing studies on community finding focused on non-overlapping communities based on several well-known community fitness metrics. However, recent investigations have shown that these fitness metrics may suffer free rider and separation effects where the overlapping region of two communities always belongs to the denser one, rather to both of them. In this paper, we study the overlapping community detection problem in social networks that not only takes the quality of the found overlapping communities but also incorporate both free rider and separation effects on the found communities into consideration. Specifically, in this paper, we first propose a novel community fitness metric - triangle based fitness metric, for overlapping community detection that can minimize the free rider and separation effects on found overlapping communities, and show that the problem is NP-hard. We then propose an efficient yet scalable algorithm for the problem that can deliver a feasible solution. We finally validate the effectiveness of the proposed fitness metric and evaluate the performance of the proposed algorithm, through conducting extensive experiments on real-world datasets with over 100 million vertices and edges. Experimental results demonstrate that the proposed algorithm is very promising.
引用
收藏
页码:2093 / 2105
页数:13
相关论文
共 36 条
[11]  
Feng Luo, 2008, Web Intelligence and Agent Systems, V6, P387, DOI 10.3233/WIA-2O08-O147
[12]   Resolution limit in community detection [J].
Fortunato, Santo ;
Barthelemy, Marc .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2007, 104 (01) :36-41
[13]   Community detection in graphs [J].
Fortunato, Santo .
PHYSICS REPORTS-REVIEW SECTION OF PHYSICS LETTERS, 2010, 486 (3-5) :75-174
[14]   Efficient discovery of overlapping communities in massive networks [J].
Gopalan, Prem K. ;
Blei, David M. .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2013, 110 (36) :14534-14539
[15]  
Huang X, 2015, PROC VLDB ENDOW, V9, P276
[16]   Querying K-Truss Community in Large and Dynamic Graphs [J].
Huang, Xin ;
Cheng, Hong ;
Qin, Lu ;
Tian, Wentao ;
Yu, Jeffrey Xu .
SIGMOD'14: PROCEEDINGS OF THE 2014 ACM SIGMOD INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2014, :1311-1322
[17]   On clusterings: Good, bad and spectral [J].
Kannan, R ;
Vempala, S ;
Vetta, A .
JOURNAL OF THE ACM, 2004, 51 (03) :497-515
[18]   Detecting the overlapping and hierarchical community structure in complex networks [J].
Lancichinetti, Andrea ;
Fortunato, Santo ;
Kertesz, Janos .
NEW JOURNAL OF PHYSICS, 2009, 11
[20]   ANALYSIS OF APPROXIMATIONS FOR MAXIMIZING SUBMODULAR SET FUNCTIONS .1. [J].
NEMHAUSER, GL ;
WOLSEY, LA ;
FISHER, ML .
MATHEMATICAL PROGRAMMING, 1978, 14 (03) :265-294