A hybrid classification method for Twitter spam detection based on differential evolution and random forest

Cited by: 31
Authors
Bazzaz Abkenar, Sepideh [1]
Mahdipour, Ebrahim [1]
Jameii, Seyed Mahdi [2]
Haghi Kashani, Mostafa [2]
Affiliations
[1] Islamic Azad Univ, Sci & Res Branch, Dept Comp Engn, Tehran, Iran
[2] Islamic Azad Univ, Shahr E Qods Branch, Dept Comp Engn, Tehran, Iran
Keywords
imbalanced dataset; machine learning; social networks; spam; Twitter
DOI
10.1002/cpe.6381
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Social networking services are online platforms that are distributed across different computers over long distances. Twitter is the most popular microblogging site that allows users to share their opinions and real-world events. Its popularity and ease of use have also attracted spammers, which makes spam detection a critical problem. Providing a spam-free environment requires identifying and filtering both spam tweets and the accounts that post them. This work presents a hybrid method, based on the Synthetic Minority Over-sampling TEchnique (SMOTE) and Differential Evolution (DE), to improve the spam detection rate on real Twitter datasets. SMOTE is applied to address the imbalanced class distribution of the datasets, while DE is used to tune the hyperparameters of a Random Forest (RF) classifier. According to the evaluation results, the presented method significantly improves classification performance on imbalanced datasets compared with related work. The optimized RF achieves an F1-score of 98.97% and an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.999, which demonstrates the efficiency of the proposed method.
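The two building blocks named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The `smote` function shows the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours, and `differential_evolution` is a plain DE/rand/1/bin loop over a box of hyperparameter bounds. The function names, the toy data, the DE settings, and in particular `surrogate_error` are assumptions made for illustration: in a real pipeline the objective would be something like the cross-validated error of a Random Forest trained with the candidate hyperparameters, with integer-valued parameters rounded before training.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Return n_new synthetic minority points (SMOTE-style interpolation)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen base point
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def differential_evolution(objective, bounds, pop_size=12, f=0.8, cr=0.9,
                           generations=60, rng=None):
    """Minimise objective over box bounds with a DE/rand/1/bin loop."""
    rng = rng or random.Random(0)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # three distinct population members, all different from i
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy one-to-one selection
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def surrogate_error(params):
    """Hypothetical stand-in for a Random Forest validation error.

    Assumption for illustration only: a smooth bowl whose minimum sits at
    200 trees and depth 12. It is not derived from the paper.
    """
    n_trees, max_depth = params
    return ((n_trees - 200) / 200) ** 2 + ((max_depth - 12) / 12) ** 2


# Toy usage: oversample four minority points, then search the bounds
# (number of trees, maximum depth) for the lowest surrogate error.
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
extra_points = smote(minority, n_new=6)
best_params, best_error = differential_evolution(
    surrogate_error, [(10, 500), (2, 30)]
)
print(extra_points[:2], best_params, best_error)
```

Oversampling would normally be applied to the training split only, so that synthetic points do not leak into the data used to score each DE candidate.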
Pages: 20