Uncovering the Root of Hate Speech: A Dataset for Identifying Hate Instigating Speech

Cited by: 0
Authors
Park, Hyoungjun [1]
Shim, Ho Sung [1]
Lee, Kyuhan [1]
Affiliations
[1] Korea Univ Business Sch, Dept Informat Syst, Seoul, South Korea
Source
FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023 | 2023
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
While many prior studies have applied computational approaches, such as machine learning, to detect and moderate hate speech, only scant attention has been paid to the task of identifying the underlying cause of hate speech. In this study, we introduce the concept of hate instigating speech, which refers to a specific type of textual post on online platforms that stimulates or provokes others to engage in hate speech. The identification of hate instigating speech carries substantial practical implications for effective hate speech moderation. By focusing on the roots of hate speech, i.e., hate instigating speech, rather than on individual instances of hate speech, it becomes possible to significantly reduce the volume of content that requires review for moderation. Additionally, targeting hate instigating speech enables early prevention of the spread and propagation of hate speech, further enhancing the effectiveness of moderation efforts. However, several challenges hinder researchers from addressing the identification of hate instigating speech. First, there is a lack of comprehensive datasets specifically annotated for hate instigation, making it difficult to train and evaluate computational models effectively. Second, the subtle and nuanced nature of hate instigating speech (e.g., seemingly non-offensive texts that serve as catalysts for hate speech) makes it difficult to apply off-the-shelf machine learning models to the problem. To address these challenges, we have developed and released a multilingual dataset specifically designed for the task of identifying hate instigating speech. The dataset covers both English and Korean, allowing for an examination of hate instigating speech across different linguistic contexts. We have applied existing machine learning models to our dataset, and the results demonstrate that the extant models alone are insufficient for effectively detecting hate instigating speech. This finding highlights the need for further attention from the academic community to address this specific challenge. We expect our study and dataset to inspire researchers to explore innovative methods that can enhance the accuracy of hate instigating speech detection, ultimately contributing to more effective moderation and prevention of hate speech propagation online.
Pages: 6236-6245
Number of pages: 10