A systematic literature review on spam content detection and classification

Cited by: 33
Authors
Kaddoura, Sanaa [1]
Chandrasekaran, Ganesh [2]
Popescu, Daniela Elena [3]
Duraisamy, Jude Hemanth [4]
Affiliations
[1] Zayed Univ, Abu Dhabi, U Arab Emirates
[2] Sri Eshwar Coll Engn, Elect & Commun Engn, Coimbatore, Tamil Nadu, India
[3] Univ Oradea, Fac Elect Engn & Informat Technol, Oradea, Romania
[4] Karunya Inst Technol & Sci, Elect & Commun Engn, Coimbatore, Tamil Nadu, India
Keywords
Spam content; Machine learning; Deep learning; Natural language processing; Social media analysis; Classification; Text mining; Data mining; Opinion spam; Framework; Network
DOI
10.7717/peerj-cs.830
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The volume of spam content on social media is increasing rapidly, which makes spam detection vital. Spam grows as people make extensive use of social media and related channels, e.g., Facebook, Twitter, YouTube, and e-mail. The time people spend on social media is also growing quickly, especially during the pandemic. Users receive many text messages through social media and often cannot recognize the spam among them. Spam messages may contain malicious links, apps, fake accounts, fake news, reviews, rumors, etc. Detecting and controlling spam text is therefore essential to improving social media security. This paper presents a detailed survey of the latest developments in spam text detection and classification in social media. It discusses the techniques used for spam detection and classification, including machine learning, deep learning, and text-based approaches. We also present the challenges encountered in identifying spam, along with its control mechanisms and the datasets used in existing work on spam detection.
Pages: 28