Near real-time Twitter spam detection with machine learning techniques

Authors
Sun N. [1]
Lin G. [1]
Qiu J. [1]
Rimba P. [2]
Affiliations
[1] School of Information Technology, Deakin University, Geelong
[2] Data61, CSIRO, Melbourne
Keywords
classification; machine learning; social network security; spam detection
DOI
10.1080/1206212X.2020.1751387
Abstract
The popularity of social media networks such as Twitter has led to a growing volume of spamming activity. Researchers have employed various machine learning methods to detect Twitter spam. However, most existing research is limited to theoretical study, and few works apply detection techniques in real-time scenarios. In this paper, we bridge the gap by proposing a near real-time Twitter spam detection system, which provides near real-time tweet data acquisition, lightweight feature extraction from a specific Twitter account, detection model training, and online visualization of detection results. In this system, account-based and content-based features are extracted to facilitate spam detection. The models applied in our Twitter spam detection system are trained on 1.5 million public tweets using nine mainstream algorithms. In addition, to reduce the training time spent on massive data and lower the cost of model updating, a parallel computing technique is introduced to train and update the models in this system. Empirical results verify that the models achieve satisfactory performance on our datasets. Furthermore, we implement a near real-time Twitter spam detection system that can better protect users from spam. This system also acts as a tweet collection tool, allowing researchers to test the performance of trained classifiers in realistic scenarios. © 2020 Informa UK Limited, trading as Taylor & Francis Group.
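The abstract names account-based and content-based features but does not list them, so the following is only a minimal sketch of what lightweight extraction of both kinds might look like. The function name `extract_features`, the tweet dictionary layout, and every individual feature are assumptions made for illustration, not the feature set used in the paper.

```python
import re
from datetime import datetime, timezone

# Simple patterns for content-based counts (illustrative, not exhaustive).
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")


def extract_features(tweet, now):
    """Return a flat feature dict for one tweet.

    `tweet` is a hypothetical dict with a "text" string and a "user" dict
    holding "created_at" (datetime), "followers" and "followings" (ints).
    """
    user = tweet["user"]
    text = tweet["text"]
    return {
        # Account-based features (assumed examples).
        "account_age_days": (now - user["created_at"]).days,
        "followers": user["followers"],
        "followings": user["followings"],
        # Content-based features (assumed examples).
        "num_urls": len(URL_RE.findall(text)),
        "num_hashtags": len(HASHTAG_RE.findall(text)),
        "num_mentions": len(MENTION_RE.findall(text)),
        "num_digits": sum(ch.isdigit() for ch in text),
        "num_chars": len(text),
    }
```

Each feature here is a count or a simple difference, so extraction stays cheap enough to run per tweet as data arrives; the resulting dictionary could then be turned into a numeric vector for whichever classifier is being trained.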
Pages: 338-348
Number of pages: 10