Multimodal hate speech detection: a novel deep learning framework for multilingual text and images

Authors
Saddozai, Furqan Khan [1]
Badri, Sahar K. [2]
Alghazzawi, Daniyal [2]
Khattak, Asad [3]
Asghar, Muhammad Zubair [1]
Affiliations
[1] Gomal Research Institute of Computing, Faculty of Computing, Gomal University, KP, D.I.Khan
[2] Information Systems Department, Faculty of Computing and Information Technology, King Abdul Aziz University, Jeddah
[3] College of Technological Innovation, Zayed University, Abu Dhabi Campus, Abu Dhabi
Keywords
BiLSTM; Deep learning; EfficientNetB1; Hate speech; Image; Multilingual; Multimodal; Urdu-English
DOI
10.7717/peerj-cs.2801
Abstract
The rapid proliferation of social media platforms has facilitated the expression of opinions but also enabled the spread of hate speech. Detecting multimodal hate speech in low-resource multilingual contexts poses significant challenges. This study presents a deep learning framework that integrates bidirectional long short-term memory (BiLSTM) and EfficientNetB1 to classify hate speech in Urdu-English tweets, leveraging both text and image modalities. We introduce multimodal multilingual hate speech (MMHS11K), a manually annotated dataset comprising 11,000 multimodal tweets. Using an early fusion strategy, text and image features were combined for classification. Experimental results demonstrate that the BiLSTM+EfficientNetB1 model outperforms unimodal and baseline multimodal approaches, achieving an F1-score of 81.2% for Urdu tweets and 75.5% for English tweets. This research addresses critical gaps in multilingual and multimodal hate speech detection, offering a foundation for future advancements. © 2025 Saddozai et al.
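The early fusion step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature vectors, weights, and function names below are hypothetical stand-ins, and a single logistic unit replaces whatever classification head the paper actually uses. The sketch only shows the general idea of concatenating per-modality features before classifying.

```python
import math


def early_fusion(text_features, image_features):
    """Early fusion: join the per-modality feature vectors into one vector."""
    return list(text_features) + list(image_features)


def predict_hate_probability(fused, weights, bias):
    """Score the fused vector with one logistic unit (assumed stand-in head)."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical values: in the paper's setting these would come from a
# BiLSTM text encoder and an EfficientNetB1 image encoder respectively.
text_features = [0.2, -0.1, 0.4]
image_features = [0.7, 0.3]

fused = early_fusion(text_features, image_features)

# Hypothetical weights; a real model would learn these during training.
weights = [0.5, -0.25, 1.0, 0.8, -0.6]
bias = -0.1

probability = predict_hate_probability(fused, weights, bias)
print(f"fused length: {len(fused)}, hate probability: {probability:.3f}")
```

Because fusion happens before classification, the classifier sees both modalities at once and can weight text and image evidence jointly, in contrast to late fusion, where separate per-modality predictions are merged afterwards.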