GeoSimMR: A mapreduce algorithm for detecting communities based on distance and interest in social networks

Cited by: 0
Authors
Al Aghbari Z. [1]
Bahutair M. [2]
Kamel I. [2]
Affiliations
[1] Department of Computer Science, University of Sharjah
[2] Department of Electrical and Computer Engineering, University of Sharjah
Source
Data Science Journal | 2019 / Vol. 18 / Issue 1
Keywords
Communities; Geodesic location; Interest similarity; MapReduce; Social networks
DOI
10.5334/dsj-2019-013
Abstract
Social network analysis has received considerable attention in the recent literature. Many papers have proposed new techniques for mining social networks to help study this huge amount of data. However, to the best of our knowledge, none of them considered the semantic meaning of the nodes' interests while clustering the network. In this work, we propose a new algorithm, namely GeoSim, for clustering the users of any social network site into communities based on the semantic meaning of the nodes' interests as well as their relationships with each other. Moreover, this paper proposes a parallel version of the GeoSim algorithm that uses the MapReduce model to run on multiple machines simultaneously and produce results faster. The two versions of the algorithm (centralized and parallel) are examined thoroughly to test their performance. The experiments show that both versions of the GeoSim algorithm achieve high community detection accuracy and scale linearly with the size of the cluster. © 2019 The Author(s).