A Large Language Model Approach to Detect Hate Speech in Political Discourse Using Multiple Language Corpora

Cited by: 2
Authors
de Oliveira, Aillkeen Bezerra [1]
Baptista, Claudio de Souza [1]
Firmino, Anderson Almeida [1]
de Paiva, Anselmo Cardoso [2]
Affiliations
[1] Univ Fed Campina Grande, Campina Grande, Paraiba, Brazil
[2] Univ Fed Maranhao, Sao Luis, Maranhao, Brazil
Source
39th Annual ACM Symposium on Applied Computing, SAC 2024 | 2024
Keywords
Hate Speech; Large Language Model; Cross-Lingual Learning; Machine Learning; Natural Language Processing;
DOI
10.1145/3605098.3635964
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
In this era of unprecedented digital connectivity and interactions, the issue of hate speech has become a focal point in societal discussions. The rise of digital communication platforms has fundamentally transformed how hate speech spreads. Online social media and messaging apps have enabled the rapid dissemination of hate speech, exacerbated by the internet's anonymity. Computational technology has emerged as a valuable tool for identifying and mitigating hate speech on social media. In this work, we employed five distinct corpora representing the English, Italian, Filipino, German, and Turkish languages. We propose employing a Large Language Model (GPT-3) enhanced with Cross-Lingual Learning to improve hate speech detection in English and Italian. Our investigation employs JL/CL+, a strategy that combines Joint Learning (JL) and Cascade Learning (CL). Even using data with lexical disparities, our findings demonstrate substantial success, yielding an F1-score of 96.58% for English and 92.05% for Italian.
Pages: 1461-1468
Number of pages: 8