Handling Bias in Toxic Speech Detection: A Survey

Cited by: 18
Authors
Garg, Tanmay [1]
Masud, Sarah [1]
Suresh, Tharun [1]
Chakraborty, Tanmoy [1]
Affiliations
[1] IIIT Delhi, Delhi, India
Keywords
Toxic speech; hate speech; social networks; unintended bias; bias mitigation; bias shift; COGNITION;
DOI
10.1145/3580494
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Detecting online toxicity has always been a challenge due to its inherent subjectivity. Factors such as the context, geography, socio-political climate, and background of the producers and consumers of the posts play a crucial role in determining whether content should be flagged as toxic. Adoption of automated toxicity detection models in production can thus sideline the very groups they aim to help. This has piqued researchers' interest in examining unintended biases and their mitigation. Due to the nascent and multi-faceted nature of the work, the literature as a whole is chaotic in its terminologies, techniques, and findings. In this article, we put together a systematic study of the limitations and challenges of existing methods for mitigating bias in toxicity detection. We look closely at proposed methods for evaluating and mitigating bias in toxic speech detection. To examine the limitations of existing methods, we also conduct a case study to introduce the concept of bias shift due to knowledge-based bias mitigation. The survey concludes with an overview of the critical challenges, research gaps, and future directions. While reducing toxicity on online platforms continues to be an active area of research, a systematic study of various biases and their mitigation strategies will help the research community produce robust and fair models.
Pages: 32