Insult Detection in the Turkish Language Through Different Machine Learning Algorithms

Times cited: 0
Authors
Ozgen, Kerem [1]
Rada, Lavdie [1]
Affiliations
[1] Bahcesehir Univ, Fac Engn & Nat Sci, Istanbul, Turkiye
Source
2023 31ST SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE, SIU | 2023
Keywords
insult detection; natural language processing; Turkish language; machine learning; offensive speech; hate speech; profane language
DOI
10.1109/SIU59756.2023.10223909
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
In this paper, we propose using cases from the Turkish Court of Cassation (Yargitay) to build a dataset for insult detection and compare machine learning models trained on this dataset. We gathered studies in the literature that compile Court of Cassation cases and generated training and test sets for evaluating machine learning algorithms on insult detection. Although machine learning cannot understand the legal context, cultural background, or the nature of insults and non-insults, it can help identify insults when given proper training data created by experts. To the best of the authors' knowledge, this is the first study to use machine learning to automatically distinguish between insult and non-insult cases within the Turkish justice system. Our research, though still in its early stages, represents a significant contribution to the field, as it addresses a gap in the existing literature and provides a machine learning approach to improving the efficiency and accuracy of legal decision-making.
Pages: 4