Cyberbullying detection: Utilizing social media features

Cited by: 45
Authors
Bozyigit, Alican [1]
Utku, Semih [2]
Nasibov, Efendi [1]
Affiliations
[1] Dokuz Eylul Univ, Dept Comp Sci, Izmir, Turkey
[2] Dokuz Eylul Univ, Dept Comp Engn, Izmir, Turkey
Keywords
Cyberbullying detection; Social media analysis; Text mining
DOI
10.1016/j.eswa.2021.115001
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Cyberbullying has become a major problem worldwide as the use of social networks has increased. Many studies therefore try to detect cyberbullying content automatically, and most of them treat the problem with opinion mining approaches that focus on the text alone. This study aims to show the importance of social media attributes in cyberbullying detection. First, a balanced dataset of 5000 labeled content items with many social media features was prepared. Next, the relationship between social media features and cyberbullying was analyzed using the chi-square test. According to the test results, some features (e.g., the number of followers of the sender) are strongly related to online bullying events. For instance, users who have more followers on social networks are disinclined to post online bullying content. Machine learning algorithms were then evaluated on two variants of the prepared dataset. The first variant includes only textual features, whereas the second combines the selected social media features with the textual features. Every machine learning algorithm tested achieved better prediction performance on the variant containing social media features. These results motivate further research on social media characteristics in cyberbullying.
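
The chi-square analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the follower buckets, and the counts in `toy_table` are assumptions made up for illustration and are not taken from the paper's dataset. The sketch computes Pearson's chi-square statistic for a contingency table that crosses one social media feature with the bullying label, then compares it with the standard critical value for 1 degree of freedom at a 0.05 significance level.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the feature and the label were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def degrees_of_freedom(table):
    """Degrees of freedom of an r x c table: (r - 1) * (c - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Made-up counts for illustration only (NOT from the paper).
# Rows: hypothetical follower bucket of the sender (few, many).
# Columns: label (bullying, not bullying).
toy_table = [[30, 20],
             [20, 30]]

stat = chi_square_statistic(toy_table)
dof = degrees_of_freedom(toy_table)

# Standard chi-square critical value for 1 degree of freedom, alpha = 0.05.
CRITICAL_1DOF_05 = 3.841
feature_is_related = stat > CRITICAL_1DOF_05

print(stat, dof, feature_is_related)
```

A statistic above the critical value suggests that the feature and the label are not independent. In practice a statistics library would normally be used to obtain an exact p-value rather than a fixed threshold.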
Pages: 12