Efficient Detection of Overlapping Communities Using Asymmetric Triangle Cuts

被引:24
作者
Rezvani, Mojtaba [1 ]
Liang, Weifa [1 ]
Liu, Chengfei [2 ]
Yu, Jeffrey Xu [3 ]
机构
[1] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT 2601, Australia
[2] Swinburne Univ Technol, Dept Comp Sci & Software Engn, Hawthorn, Vic 3122, Australia
[3] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Hong Kong, Hong Kong, Peoples R China
基金
澳大利亚研究理事会;
关键词
Overlapping community detection; fitness metrics for overlapping communities; social networks; community detection algorithms;
D O I
10.1109/TKDE.2018.2815554
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Real social networks contain many communities, where members within each community are densely connected with each other, while they are sparsely connected with the members outside of the community. Since each member can join multiple communities simultaneously, communities in social networks are usually overlapping with each other. How to efficiently and effectively identify overlapping communities in a large social network becomes a fundamental problem in the big data era. Most existing studies on community finding focused on non-overlapping communities based on several well-known community fitness metrics. However, recent investigations have shown that these fitness metrics may suffer free rider and separation effects where the overlapping region of two communities always belongs to the denser one, rather to both of them. In this paper, we study the overlapping community detection problem in social networks that not only takes the quality of the found overlapping communities but also incorporate both free rider and separation effects on the found communities into consideration. Specifically, in this paper, we first propose a novel community fitness metric - triangle based fitness metric, for overlapping community detection that can minimize the free rider and separation effects on found overlapping communities, and show that the problem is NP-hard. We then propose an efficient yet scalable algorithm for the problem that can deliver a feasible solution. We finally validate the effectiveness of the proposed fitness metric and evaluate the performance of the proposed algorithm, through conducting extensive experiments on real-world datasets with over 100 million vertices and edges. Experimental results demonstrate that the proposed algorithm is very promising.
引用
收藏
页码:2093 / 2105
页数:13
相关论文
共 36 条
[1]  
Aggarwal C. C., 2004, US Patent, Patent No. 6,714,975
[2]  
[Anonymous], 2007, P 16 INT C WORLD WID
[3]  
[Anonymous], [No title captured]
[4]  
[Anonymous], 2013, P 6 ACM INT C WEB SE, DOI [DOI 10.1145/2433396.2433471, 10.1145/2433396.2433471]
[5]  
[Anonymous], 2012, Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, DOI [10.1145/2339530.2339628, DOI 10.1145/2339530.2339628]
[6]   FOCS: Fast Overlapped Community Search [J].
Bandyopadhyay, Sanghamitra ;
Chowdhary, Garisha ;
Sengupta, Debarka .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2015, 27 (11) :2974-2985
[7]   Higher-order organization of complex networks [J].
Benson, Austin R. ;
Gleich, David F. ;
Leskovec, Jure .
SCIENCE, 2016, 353 (6295) :163-166
[8]  
Cohen J., 2008, National Security Agency Technical Report, V16, P1
[9]  
Coscia M., 2012, PROC 18 ACM SIGKDD, P615, DOI [10.1145/2339530.2339630, DOI 10.1145/2339530.2339630]
[10]   Overlapping community detection using neighborhood ratio matrix [J].
Eustace, Justine ;
Wang, Xingyuan ;
Cui, Yaozu .
PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2015, 421 :510-521