An unsupervised method for social network spammer detection based on user information interests

Cited by: 20
Authors
Koggalahewa, Darshika [1]
Xu, Yue [1]
Foo, Ernest [2]
Affiliations
[1] Queensland Univ Technol, Sch Comp Sci, Brisbane, Qld, Australia
[2] Griffith Univ, Sch Informat & Commun Technol, Brisbane, Qld, Australia
Keywords
Spam detection; Peer acceptance; Information interest; Unsupervised learning; Classification; CREDIBILITY; TRUST; MODEL
DOI
10.1186/s40537-021-00552-5
Chinese Library Classification number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
Online Social Networks (OSNs) are a popular platform for communication and collaboration, and spammers are highly active in them. Uncovering spammers has become one of the most challenging problems in OSNs. Classification-based supervised approaches are the most commonly used method for detecting spammers, but they suffer from the limitations of "data labelling", "spam drift", "imbalanced datasets" and "data fabrication", all of which affect a classifier's detection accuracy. An unsupervised approach does not require labelled datasets. We aim to address the limitations of data labelling and spam drift through an unsupervised approach. We present a purely unsupervised approach for spammer detection that uses the peer acceptance of a user in a social network to distinguish spammers from genuine users. The peer acceptance between two users is calculated from the interests they share across multiple topics. The main contribution of this paper is the introduction of a purely unsupervised spammer detection approach based on users' peer acceptance, which does not require labelled training datasets. While it does not surpass the accuracy of supervised classification-based approaches, our approach is a successful alternative to traditional classifiers for spam detection, achieving an accuracy of 96.9%.
Pages: 35