Handling Bias in Toxic Speech Detection: A Survey

Cited by: 18
Authors
Garg, Tanmay [1]
Masud, Sarah [1]
Suresh, Tharun [1]
Chakraborty, Tanmoy [1]
Affiliations
[1] IIIT Delhi, Delhi, India
Keywords
Toxic speech; hate speech; social networks; unintended bias; bias mitigation; bias shift; cognition
DOI
10.1145/3580494
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Detecting online toxicity has always been a challenge due to its inherent subjectivity. Factors such as the context, geography, socio-political climate, and background of the producers and consumers of the posts play a crucial role in determining whether content should be flagged as toxic. Adopting automated toxicity detection models in production can thus sideline the very groups they aim to help. This has piqued researchers' interest in examining unintended biases and their mitigation. Because the work is nascent and multi-faceted, the literature is inconsistent in its terminologies, techniques, and findings. In this article, we put together a systematic study of the limitations and challenges of existing methods for mitigating bias in toxicity detection. We look closely at proposed methods for evaluating and mitigating bias in toxic speech detection. To examine the limitations of existing methods, we also conduct a case study to introduce the concept of bias shift due to knowledge-based bias mitigation. The survey concludes with an overview of the critical challenges, research gaps, and future directions. While reducing toxicity on online platforms continues to be an active area of research, a systematic study of various biases and their mitigation strategies will help the research community produce robust and fair models.
Pages: 32