Using Twitter to Detect Hate Crimes and Their Motivations: The HateMotiv Corpus

Cited by: 7
Authors
Alnazzawi, Noha [1]
Affiliations
[1] Royal Commission for Jubail and Yanbu, Yanbu Industrial College, Department of Computer Science and Engineering, Yanbu Industrial City 41912, Saudi Arabia
Keywords
text mining; corpus construction; annotation guidelines; hate crime motivation
DOI
10.3390/data7060069
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
With the rapidly increasing use of social media platforms, much of our lives is spent online. Despite the advantages of social media, hate, cyberbullying, harassment, and trolling are common online. Many extremists use social media platforms to communicate messages of hatred and spread violence, which may have serious psychological consequences and even contribute to real-world violence. The aim of this research was therefore to build the HateMotiv corpus, a freely available dataset annotated for types of hate crimes and the motivations behind them. The dataset was developed using Twitter as an example of a social media platform and could provide the research community with a novel and reliable resource. It is distinctive because of its topic-specific nature and its detailed annotation. The corpus was annotated by two expert annotators following unified guidelines; inter-annotator agreement, measured as F-score, was as high as 0.66 for hate crime type labels and 0.71 for motivation labels.
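The agreement figures in the abstract are F-scores computed between two annotators. As a minimal sketch of how such a score can be obtained, the Python below treats one annotator's labels as the reference and the other's as the prediction, then takes the harmonic mean of precision and recall. The function name `agreement_f1`, the `(tweet_id, label)` representation, and the label names are all hypothetical; the record does not state the exact matching criteria used for HateMotiv, so this should not be read as the author's procedure.

```python
def agreement_f1(annotator_a, annotator_b):
    """Pairwise F-score between two annotators' sets of annotations.

    Each annotation is a hashable item such as (tweet_id, label).
    Annotator A is treated as the reference; because F1 is symmetric,
    swapping the two annotators gives the same score.
    """
    a, b = set(annotator_a), set(annotator_b)
    if not a or not b:
        return 0.0
    matched = len(a & b)           # annotations both annotators agree on
    precision = matched / len(b)   # share of B's annotations found in A
    recall = matched / len(a)      # share of A's annotations found in B
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations (illustrative labels, not the corpus label set).
ann_a = {(1, "type_x"), (2, "type_y"), (3, "type_z")}
ann_b = {(1, "type_x"), (2, "type_z"), (3, "type_z"), (4, "type_y")}

score = agreement_f1(ann_a, ann_b)
print(f"F-score agreement: {score:.2f}")
```

Here two of the annotations match exactly, so precision is 2/4 and recall is 2/3. Exact matching is the strictest choice; a relaxed scheme (for example, partial span overlap) would generally yield higher agreement.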
Pages: 10