Handling Bias in Toxic Speech Detection: A Survey

Cited by: 18
Authors
Garg, Tanmay [1]
Masud, Sarah [1]
Suresh, Tharun [1]
Chakraborty, Tanmoy [1]
Affiliations
[1] IIIT Delhi, Delhi, India
Keywords
Toxic speech; hate speech; social networks; unintended bias; bias mitigation; bias shift; COGNITION
DOI
10.1145/3580494
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Detecting online toxicity has always been a challenge due to its inherent subjectivity. Factors such as the context, geography, socio-political climate, and background of the producers and consumers of the posts play a crucial role in determining whether content should be flagged as toxic. Adopting automated toxicity detection models in production can thus sideline the very groups they aim to help. This risk has piqued researchers' interest in examining unintended biases and their mitigation. Because the work is nascent and multi-faceted, the literature as a whole is chaotic in its terminologies, techniques, and findings. In this article, we put together a systematic study of the limitations and challenges of existing methods for mitigating bias in toxicity detection. We look closely at proposed methods for evaluating and mitigating bias in toxic speech detection. To examine the limitations of existing methods, we also conduct a case study to introduce the concept of bias shift due to knowledge-based bias mitigation. The survey concludes with an overview of the critical challenges, research gaps, and future directions. While reducing toxicity on online platforms continues to be an active area of research, a systematic study of various biases and their mitigation strategies will help the research community produce robust and fair models.
Pages: 32