A hybrid classification method for Twitter spam detection based on differential evolution and random forest

Cited by: 30
Authors
Bazzaz Abkenar, Sepideh [1]
Mahdipour, Ebrahim [1]
Jameii, Seyed Mahdi [2]
Haghi Kashani, Mostafa [2]
Affiliations
[1] Islamic Azad Univ, Sci & Res Branch, Dept Comp Engn, Tehran, Iran
[2] Islamic Azad Univ, Shahr E Qods Branch, Dept Comp Engn, Tehran, Iran
Keywords
imbalanced dataset; machine learning; social networks; spam; Twitter;
DOI
10.1002/cpe.6381
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
Social networking services are online platforms that are distributed across different computers over long distances. Twitter is the most popular microblogging site that allows users to share their opinions and real-world events. Due to its popularity and ease of use, Twitter has also attracted spammers, which makes spam detection a critical problem. To provide a spam-free environment, it is necessary to identify and filter spam tweets as well as their owners. A hybrid method based on the Synthetic Minority Over-sampling TEchnique (SMOTE) and Differential Evolution (DE) is presented to enhance the spam detection rate on real Twitter datasets. SMOTE is applied to tackle the imbalanced class distribution of the datasets, while DE is used to tune the hyperparameters of a Random Forest (RF) classifier. Based on the evaluation results and compared with related work, the presented method significantly enhances classification performance on imbalanced datasets. The optimized RF achieves an F1-score of 98.97% and an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.999, which demonstrates the high efficiency of the proposed method.
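The two building blocks named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation: the function names, the DE/rand/1/bin variant, the parameter values, and `toy_objective` are all assumptions made for illustration. In particular, `toy_objective` is a hypothetical stand-in for the quantity one would actually minimize, such as one minus the cross-validated F1-score of an RF trained with the candidate hyperparameters.

```python
import random

def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Rank the other minority samples by squared distance to x.
        others = [p for p in minority if p is not x]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment x -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(x, neighbour)])
    return synthetic

def differential_evolution(objective, bounds, pop_size=15, f=0.8, cr=0.9,
                           generations=60, rng=None):
    """Minimise objective over box bounds with a DE/rand/1/bin scheme."""
    rng = rng or random.Random(0)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated coordinate
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < cr or j == j_rand:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    v = min(max(v, lo), hi)  # clip back into the bounds
                else:
                    v = pop[i][j]
                trial.append(v)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

def toy_objective(params):
    """Hypothetical stand-in for a cross-validated RF error surface."""
    n_trees, max_depth = params
    return ((n_trees - 200) / 100) ** 2 + ((max_depth - 12) / 10) ** 2

# Oversample a tiny minority class, then tune two assumed hyperparameters.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6)
bounds = [(10, 500), (2, 30)]  # assumed ranges: tree count, maximum depth
best_params, best_score = differential_evolution(toy_objective, bounds)
```

In a real pipeline the oversampling would be applied to the training split only, and continuous DE candidates would be rounded to integers before being passed to the classifier.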
Pages: 20
Related papers
50 in total
  • [21] A Hybrid Intrusion Detection Model Using EGA-PSO and Improved Random Forest Method
    Balyan, Amit Kumar
    Ahuja, Sachin
    Lilhore, Umesh Kumar
    Sharma, Sanjeev Kumar
    Manoharan, Poongodi
    Algarni, Abeer D.
    Elmannai, Hela
    Raahemifar, Kaamran
    SENSORS, 2022, 22 (16)
  • [22] An automatically recursive feature elimination method based on threshold decision in random forest classification
    Chen, Chao
    Liang, Jintao
    Sun, Weiwei
    Yang, Gang
    Meng, Xiangchao
    GEO-SPATIAL INFORMATION SCIENCE, 2024,
  • [23] A Novel Framework for Drug Synergy Prediction using Differential Evolution based Multinomial Random Forest
    Kaur, Jaspreet
    Singh, Dilbag
    Kaur, Manjit
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2019, 10 (05) : 601 - 608
  • [24] Multi-objective differential evolution based random forest for e-health applications
    Kaur, Manjit
    Gianey, Hemant Kumar
    Singh, Dilbag
    Sabharwal, Munish
    MODERN PHYSICS LETTERS B, 2019, 33 (05):
  • [25] A novel framework for drug synergy prediction using differential evolution based multinomial random forest
    Kaur J.
    Singh D.
    Kaur M.
    Intl. J. Adv. Comput. Sci. Appl., 2019, 10 (5) : 601 - 608
  • [26] Bio-Inspired Algorithm Based Undersampling Approach and Ensemble Learning for Twitter Spam Detection
    Kiruthika Devi, K.
    Sathish Kumar, G. A.
    INTERNATIONAL JOURNAL OF UNCERTAINTY FUZZINESS AND KNOWLEDGE-BASED SYSTEMS, 2024, 32 (01) : 1 - 20
  • [27] Intelligent Vision-Based Malware Detection and Classification Using Deep Random Forest Paradigm
    Roseline, S. Abijah
    Geetha, S.
    Kadry, Seifedine
    Nam, Yunyoung
    IEEE ACCESS, 2020, 8 : 206303 - 206324
  • [28] Random Forest based Fake Job Detection
    Akiti, Spandhana Reddy
    Bathini, Akash
    Kanapla, Sateesh Kumar
    2024 4TH INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND SOCIAL NETWORKING, ICPCSN 2024, 2024, : 427 - 430
  • [29] Contrast Pattern-Based Classification for Bot Detection on Twitter
    Loyola-Gonzalez, Octavio
    Monroy, Raul
    Rodriguez, Jorge
    Lopez-Cuevas, Armando
    Israel Mata-Sanchez, Javier
    IEEE ACCESS, 2019, 7 : 45800 - 45817
  • [30] Social-spam Profile Detection based on Content Classification and User Behavior
    Thi-Hong Vuong
    Van-Hien Tran
    Minh-Duc Nguyen
    Cam-Van Thi Nguyen
    Thanh-Huyen Pham
    Mai-Vu Tran
    2016 EIGHTH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE), 2016, : 264 - 267