Community discovery by propagating local and global information based on the MapReduce model

被引:67
作者
Guo, Kun [1 ,3 ]
Guo, Wenzhong [1 ,3 ]
Chen, Yuzhong [1 ,3 ]
Qiu, Qirong [2 ]
Zhang, Qishan [2 ]
机构
[1] Fuzhou Univ, Coll Math & Comp Sci, Fuzhou 350002, Peoples R China
[2] Fuzhou Univ, Sch Management, Fuzhou 350002, Peoples R China
[3] Fujian Prov Key Lab Network Comp & Intelligent In, Fuzhou, Peoples R China
基金
中国国家自然科学基金;
关键词
Affinity propagation; Community discovery; MapReduce model; Social network; COMPLEX NETWORKS;
D O I
10.1016/j.ins.2015.06.032
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Discovering communities in large-scale social networks efficiently and accurately is one of the challenges in social network data mining. We propose a clustering algorithm to discover social network communities based on the propagation of local and global information. Three strategies, namely, localizing propagation of affinity messages, relaxing self-exemplar constraints, and hierarchical processing, are employed in the algorithm to achieve reasonable time and space complexities in social networks. The local and global information is represented by the k-path edge centrality incorporated in the similarity calculation. The standalone algorithm is extended to provide parallel implementations based on the MapReduce model to accelerate processing in large-scale networks. Two well-known parallel computation frameworks, Hadoop and Spark, are adopted to implement the parallel algorithm. Experiments performed on artificial and real social network datasets show that the proposed algorithms can achieve near-linear time and space complexities with comparative clustering accuracy. (C) 2015 Elsevier Inc. All rights reserved.
引用
收藏
页码:73 / 93
页数:21
相关论文
共 47 条
  • [1] Link communities reveal multiscale complexity in networks
    Ahn, Yong-Yeol
    Bagrow, James P.
    Lehmann, Sune
    [J]. NATURE, 2010, 466 (7307) : 761 - U11
  • [2] [Anonymous], 2012, Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, DOI DOI 10.1145/2213836.2213934
  • [3] Apache, 2014, AP HAD NEXTGEN MAPR
  • [4] Apache, 2014, GIR OP SOURC IMPL PR
  • [5] Fast unfolding of communities in large networks
    Blondel, Vincent D.
    Guillaume, Jean-Loup
    Lambiotte, Renaud
    Lefebvre, Etienne
    [J]. JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2008,
  • [6] Bu Y., 2010, P VLDB 10 SING, P24
  • [7] Clauset A, 2004, PHYS REV E, V70, DOI 10.1103/PhysRevE.70.066111
  • [8] Comparing community structure identification -: art. no. P09008
    Danon, L
    Díaz-Guilera, A
    Duch, J
    Arenas, A
    [J]. JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT, 2005, : 219 - 228
  • [9] De Meo P., 2011, Proceedings of the 2011 11th International Conference on Intelligent Systems Design and Applications (ISDA), P88, DOI 10.1109/ISDA.2011.6121636
  • [10] Enhancing community detection using a network weighting strategy
    De Meo, Pasquale
    Ferrara, Emilio
    Fiumara, Giacomo
    Provetti, Alessandro
    [J]. INFORMATION SCIENCES, 2013, 222 : 648 - 668