Multi-class hate speech detection in the Norwegian language using FAST-RNN and multilingual fine-tuned transformers

Cited by: 12

Authors:

Hashmi, Ehtesham [1]

Yayilgan, Sule Yildirim [1]

Affiliations:

[1] Norwegian Univ Sci & Technol NTNU, Dept Informat Secur & Commun Technol IIK, Teknol Vegen 22, N-2815 Gjovik, Innlandet, Norway

Source:

COMPLEX & INTELLIGENT SYSTEMS | 2024 / Vol. 10 / No. 03

Keywords:

Hate speech; Norwegian language; Natural language processing; Deep Learning; Transformers; Interpretability modeling;

DOI:

10.1007/s40747-024-01392-5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The growth of social networks has provided a platform for individuals with prejudiced views, allowing them to spread hate speech and target others based on their gender, ethnicity, religion, or sexual orientation. While positive interactions within diverse communities can considerably enhance confidence, it is critical to recognize that negative comments can hurt people's reputations and well-being. This trend underscores the need for more diligent monitoring and robust policies on these platforms to protect individuals from such discriminatory and harmful behavior. Hate speech is often characterized as an intentional act of aggression directed at a specific group, typically meant to harm or marginalize its members based on certain aspects of their identity. Most research on hate speech has been conducted in resource-rich languages such as English, Spanish, and French. However, low-resource languages, including European languages such as Irish, Norwegian, Portuguese, Polish, and Slovak, as well as many South Asian languages, present challenges due to limited linguistic resources, making information extraction labor-intensive. In this study, we present deep neural networks with FastText word embeddings using regularization methods for multi-class hate speech detection in the Norwegian language, along with the implementation of multilingual transformer-based models with hyperparameter tuning and generative configuration. FastText outperformed other deep learning models when stacked with Bidirectional LSTM and GRU, resulting in the FAST-RNN model. In the concluding phase, we compare our results with the state of the art and perform interpretability modeling using Local Interpretable Model-Agnostic Explanations (LIME) to achieve a more comprehensive understanding of the model's decision-making mechanisms.

Pages: 4535-4556

Page count: 22

