Detection of Hate Speech, Racism and Misogyny in Digital Social Networks: Colombian Case Study

Times cited: 0
Authors
Moreno-Sandoval, Luis Gabriel [1]
Pomares-Quimbaya, Alexandra [1]
Barbosa-Sierra, Sergio Andres [1]
Pantoja-Rojas, Liliana Maria [2]
Affiliations
[1] Pontificia Univ Javeriana, Engn Fac, Bogota 110231, Colombia
[2] Univ Distrital Francisco Jose Caldas, Engn Fac, Bogota 111611, Colombia
Keywords
large language models; digital social networks; hate speech detection; sentiment analysis; social network analysis; subjectivity analysis; text classification; Twitter; classification
DOI
10.3390/bdcc8090113
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The growing popularity of social networking platforms worldwide has substantially increased the presence of offensive language on them. To date, most systems developed to mitigate this problem focus primarily on English content. However, the issue is a global concern that also affects other languages, such as Spanish. This article addresses the task of identifying hate speech, racism, and misogyny in Spanish on social networks within the Colombian context, and introduces a gold standard dataset developed specifically for this purpose. The experiment compares the performance of deep learning TLM models, such as BERT, RoBERTa, XLM, and BETO, adjusted to the Colombian slang domain, and then compares the best TLM model against a GPT model, with a significant impact on prediction accuracy in this task. Finally, the study provides a detailed account of the components used in the system, including the architecture of the models and the selection of functions. The best results show that the BERT model achieves an accuracy of 83.6% for hate speech detection, while the GPT model achieves an accuracy of 90.8% for racism detection and 90.4% for misogyny detection.
Pages: 25