A Large Language Model Approach to Detect Hate Speech in Political Discourse Using Multiple Language Corpora

Cited by: 2

Authors:

de Oliveira, Aillkeen Bezerra [1]

Baptista, Claudio de Souza [1]

Firmino, Anderson Almeida [1]

de Paiva, Anselmo Cardoso [2]

Affiliations:

[1] Univ Fed Campina Grande, Campina Grande, Paraiba, Brazil

[2] Univ Fed Maranhao, Sao Luis, Maranhao, Brazil

Source:

39th Annual ACM Symposium on Applied Computing, SAC 2024 | 2024

Keywords:

Hate Speech; Large Language Model; Cross-Lingual Learning; Machine Learning; Natural Language Processing;

DOI:

10.1145/3605098.3635964

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification codes:

081203 ; 0835 ;

Abstract:

In this era of unprecedented digital connectivity and interactions, the issue of hate speech has become a focal point in societal discussions. The rise of digital communication platforms has fundamentally transformed how hate speech spreads. Hate speech has spread rapidly through online social media and messaging apps, a problem exacerbated by the internet's anonymity. Computational technology has emerged as a valuable tool for identifying and mitigating hate speech on social media. In this work, we employed five distinct corpora representing the English, Italian, Filipino, German, and Turkish languages. We propose employing a Large Language Model (GPT-3) enhanced with Cross-Lingual Learning to improve hate speech detection in English and Italian. Our investigation uses a strategy called JL/CL+, which combines two approaches: Joint Learning (JL) and Cascade Learning (CL). Even using data with lexical disparities, our findings demonstrate substantial success, yielding an F1-score of 96.58% for English and 92.05% for Italian.


Pages: 1461-1468

Page count: 8
