Multi-class hate speech detection in the Norwegian language using FAST-RNN and multilingual fine-tuned transformers

Cited by: 12
Authors
Hashmi, Ehtesham [1]
Yayilgan, Sule Yildirim [1]
Affiliations
[1] Norwegian Univ Sci & Technol NTNU, Dept Informat Secur & Commun Technol IIK, Teknol Vegen 22, N-2815 Gjovik, Innlandet, Norway
Keywords
Hate speech; Norwegian language; Natural language processing; Deep Learning; Transformers; Interpretability modeling;
DOI
10.1007/s40747-024-01392-5
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The growth of social networks has given individuals with prejudiced views a platform to spread hate speech and target others based on their gender, ethnicity, religion, or sexual orientation. While positive interactions within diverse communities can considerably enhance confidence, negative comments can damage people's reputations and well-being. This trend underscores the need for more diligent monitoring and robust policies on these platforms to protect individuals from discriminatory and harmful behavior. Hate speech is often characterized as an intentional act of aggression directed at a specific group, typically meant to harm or marginalize its members based on aspects of their identity. Most hate speech research has been conducted in high-resource languages such as English, Spanish, and French. However, low-resource languages, including European languages such as Irish, Norwegian, Portuguese, Polish, and Slovak as well as many South Asian languages, present challenges because limited linguistic resources make information extraction labor-intensive. In this study, we present deep neural networks with FastText word embeddings and regularization methods for multi-class hate speech detection in the Norwegian language, along with multilingual transformer-based models with hyperparameter tuning and generative configuration. FastText embeddings stacked with Bidirectional LSTM and GRU layers, the FAST-RNN model, outperformed the other deep learning models. Finally, we compare our results with the state of the art and perform interpretability modeling using Local Interpretable Model-Agnostic Explanations (LIME) to better understand the model's decision-making mechanisms.
Pages: 4535-4556
Page count: 22