Hybrid feature selection approach to identify optimal features of profile metadata to detect social bots in Twitter

Cited by: 0
Authors
Eiman Alothali
Kadhim Hayawi
Hany Alashwal
Affiliations
[1] United Arab Emirates University, College of Information Technology
[2] Zayed University, College of Technological Innovation
Source
Social Network Analysis and Mining | 2021 / Vol. 11
Keywords
Bot detection; Feature selection; Supervised learning; Twitter
DOI
Not available
Abstract
Over the last few years, social bots in social networks have become more sophisticated in design, adapting their features to evade detection systems. Their ability to mimic human users stems from advances in artificial intelligence and chatbots, which allow these bots to learn and adjust very quickly. Finding the optimal features needed to detect them therefore remains an area for further investigation. In this paper, we propose a hybrid feature selection (FS) method that evaluates profile metadata features to find these optimal features, which are then assessed using random forest, naïve Bayes, support vector machine, and neural network classifiers. We found that cross-validation attribute evaluation performed best among the FS methods compared. Our results show that the random forest classifier with six optimal features achieved the best area under the curve, 94.3%, while maintaining 89% overall accuracy and, for the bot class, 83.8% precision and 83.3% recall. We also found that using only four features (favorites_count, verified, statuses_count, and average_tweets_per_day) still achieves good bot detection performance (84.1% precision, 81.2% recall).
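The metrics quoted in the abstract (accuracy, bot-class precision and recall, and area under the ROC curve) can be illustrated with a small sketch. This is not the authors' evaluation code: the function names `bot_class_metrics` and `roc_auc`, the label convention (1 = bot, 0 = human), the 0.5 decision threshold, and the toy labels and scores are all assumptions made here for illustration.

```python
from typing import Dict, Sequence


def bot_class_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy plus precision and recall for the bot class (label 1, an assumed convention)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # bots flagged as bots
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # humans flagged as bots
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # bots that were missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # humans left alone
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(pairs),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random bot outscores a random human (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = bot_class_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise formulation of AUC is quadratic in the number of accounts, which is fine for a sketch; a rank-based computation would be the usual choice on a full dataset.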