Learning textual features for Twitter spam detection: A systematic literature review

Cited by: 11
Authors
Abkenar, Sepideh Bazzaz [1]
Kashani, Mostafa Haghi [2]
Akbari, Mohammad [3, 4]
Mahdipour, Ebrahim [1]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Sci & Res Branch, Tehran, Iran
[2] Islamic Azad Univ, Dept Comp Engn, Shahr Eqods Branch, Tehran, Iran
[3] Amirkabir Univ Technol, Dept Comp Sci, Tehran, Iran
[4] Inst Res Fundamental Sci IPM, Sch Comp Sci, Tehran, Iran
Keywords
Spam; Twitter; Machine learning; Social networks; Systematic literature review; TWEETS; FRAMEWORK; ACCOUNTS;
DOI
10.1016/j.eswa.2023.120366
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Background: With the rise of Internet access and mobile devices around the globe, more people are using social networks for collaboration and for receiving real-time information. Twitter, a microblogging site that is becoming a critical source of communication, has also attracted spammers who seek to distract users. Researchers have introduced a variety of defense techniques to detect spam and combat spammers' activities, and these have greatly enhanced spam detection performance. Objective: The purpose of this paper is to identify, taxonomically classify, and compare current Twitter spam detection approaches in a systematic way. Method: This study presents a comprehensive Systematic Literature Review (SLR) of spam detection on Twitter, covering the 70 most relevant papers published between 2010 and October 2022. The analysis reveals that most existing Twitter spam detection techniques are based on textual content and messages (tweets) and rely on Machine Learning (ML)-based algorithms. These approaches use various classification and clustering algorithms, and they differ mainly in their feature selection methods. Hence, we propose a classification based on different feature selection analyses, namely content analysis, user analysis, tweet analysis, network analysis, and hybrid analysis. Results: Various parameters are identified for investigating Twitter spam detection approaches, and each paper was examined to determine its research methodology and to present comparative studies of current approaches. Conclusion: This paper demonstrates that existing Twitter spam detection approaches face several open issues, including scalability, streaming data analysis, and processing. The most obvious unresolved issues are spam drift and non-English tweets.
Pages: 27