Efficiently detecting overlapping communities using seeding and semi-supervised learning

被引:0
作者
Changxing Shang
Shengzhong Feng
Zhongying Zhao
Jianping Fan
机构
[1] Chinese Academy of Sciences,Shenzhen Institutes of Advanced Technology
[2] University of Chinese Academy of Sciences,undefined
来源
International Journal of Machine Learning and Cybernetics | 2017年 / 8卷
关键词
Graph clustering; Social networks; Overlapping community detection;
D O I
暂无
中图分类号
学科分类号
摘要
A common scheme for discovering overlapping communities in a network is to use a seeding process followed by an expansion process. Most seeding methods are either too complex to scale to large networks or too simple to select high-quality seeds. Additionally, the non-principled functions used by most expansion methods lead to poor performances when applied to diverse networks. This paper proposes a new method that transforms a network into a corpus. Each edge is treated as a document, and all the network nodes are treated as terms of the corpus. We propose an effective seeding method that selects seeds as a training set, and a principled expansion method based on semi-supervised learning that classifies the edges. We compared our new algorithm with four other community detection algorithms on a wide range of synthetic and empirical networks. Our experimental results show that the new algorithm significantly improved the clustering performance in most cases. Furthermore, the time complexity of the new algorithm is linear with respect to the number of edges, which means that the technique can be scaled to large networks.
引用
收藏
页码:455 / 468
页数:13
相关论文
共 70 条
[1]  
Girvan M(2002)Community structure in social and biological networks Proc Natl Acad Sci 99 7821-undefined
[2]  
Newman ME(2009)Detecting the overlapping and hierarchical community structure in complex networks New J Phys 11 033015-undefined
[3]  
Lancichinetti A(2011)Seeding for pervasively overlapping communities Phys Rev E 83 066107-undefined
[4]  
Fortunato S(2009)Community detection algorithms: a comparative analysis Phys Rev E 80 056117-undefined
[5]  
Kertész J(2007)Mixture models and exploratory analysis in networks Proc Natl Acad Sci 104 9564-undefined
[6]  
Lee C(2009)Detect overlapping and hierarchical community structure in networks Phys A Stat Mech Appl 388 1706-undefined
[7]  
Reid F(1973)Algorithm 457: finding all cliques of an undirected graph Commun ACM 16 575-undefined
[8]  
McDaid A(2005)Finding communities by clustering a graph into overlapping subgraphs IADIS AC 5 97-undefined
[9]  
Hurley N(2004)Understanding inverse document frequency: on theoretical arguments for IDF J Doc 60 503-undefined
[10]  
Lancichinetti A(2006)Semi-supervised learning literature survey Comput Sci Univ Wis Madison 2 3-undefined