An unsupervised method for social network spammer detection based on user information interests

Cited by: 20

Authors:

Koggalahewa, Darshika [1]

Xu, Yue [1]

Foo, Ernest [2]

Affiliations:

[1] Queensland Univ Technol, Sch Comp Sci, Brisbane, Qld, Australia

[2] Griffith Univ, Sch Informat & Commun Technol, Brisbane, Qld, Australia

Source:

JOURNAL OF BIG DATA | 2022 / Volume 9 / Issue 01

Keywords:

Spam detection; Peer acceptance; Information interest; Unsupervised learning; Classification; CREDIBILITY; TRUST; MODEL;

DOI:

10.1186/s40537-021-00552-5

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202 ;

Abstract:

Online Social Networks (OSNs) are a popular platform for communication and collaboration. Spammers are highly active in OSNs. Uncovering spammers has become one of the most challenging problems in OSNs. Classification-based supervised approaches are the most commonly used method for detecting spammers. Classification-based systems suffer from the limitations of "data labelling", "spam drift", "imbalanced datasets" and "data fabrication". These limitations affect the accuracy of a classifier's detection. An unsupervised approach does not require labelled datasets. We aim to address the limitations of data labelling and spam drift through an unsupervised approach. We present a pure unsupervised approach for spammer detection based on the peer acceptance of a user in a social network to distinguish spammers from genuine users. The peer acceptance of a user to another user is calculated based on common shared interests over multiple shared topics between the two users. The main contribution of this paper is the introduction of a pure unsupervised spammer detection approach based on users' peer acceptance. Our approach does not require labelled training datasets. While it does not better the accuracy of supervised classification-based approaches, our approach has become a successful alternative to traditional classifiers for spam detection by achieving an accuracy of 96.9%.
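The abstract does not state the formula used for peer acceptance, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each user is represented as a mapping from topic to term weights, and it scores a pair of users by the mean cosine similarity of their term weights over the topics they share. The function names, the symmetric scoring, and the toy profiles are all hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def peer_acceptance(u, v):
    """Hypothetical score: mean interest similarity over topics both users share.

    This symmetric definition is an assumption made for illustration;
    the paper may define peer acceptance differently.
    """
    shared_topics = set(u) & set(v)
    if not shared_topics:
        return 0.0
    return sum(cosine(u[t], v[t]) for t in shared_topics) / len(shared_topics)


# Toy profiles: topic -> {term: weight}. All values are made up.
alice = {"sports": {"football": 2, "league": 1}, "tech": {"python": 3}}
bob = {"sports": {"football": 1, "league": 1}, "tech": {"python": 1, "rust": 1}}
promo = {"sports": {"discount": 5, "click": 4}, "tech": {"free": 3, "click": 2}}

print(peer_acceptance(alice, bob))    # high: similar interests within shared topics
print(peer_acceptance(alice, promo))  # zero: same topics, no common terms
```

Under these assumptions, an account that posts on popular topics without sharing the interests of its peers receives a low score from them, which is the kind of signal an unsupervised detector could threshold without labelled data.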

Pages: 35

