Automatic Cyberbullying Detection: a Mexican case in High School and Higher Education students

Cited by: 4
Authors
Arce-Ruelas, K. I. [1]
Alvarez-Xochihua, O. [2]
Pelegrin, L. [2]
Cardoza-Avendano, L. [1]
Gonzalez-Fraga, J. A. [2]
Affiliations
[1] Univ Autonoma Baja California, Fac Ingn Arquitectura & Diseno, Mexicali, Baja California, Mexico
[2] Univ Autonoma Baja California, Fac Ciencias, Mexicali, Baja California, Mexico
Keywords
Cyberbullying; Radio frequency; Blogs; Videos; IEEE transactions; Deep learning; Computational modeling; Bullying; Machine learning; Social networks
DOI
10.1109/TLA.2022.9693561
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Social interaction among young students has partially or entirely shifted to mobile-based communication, specifically through social networks. This new communication environment allows more immediate, diverse, and massive interaction, making academic and recreational activities faster and more effective. However, it has also promoted the form of social harassment known as bullying, exponentially increasing its scope and diversifying the types and forms of aggression. Machine learning and natural language processing techniques have been used to create models that detect bullying situations among students, using data corpora drawn mainly from public social networks. In general, however, these data sources are not representative of the social networks students commonly use, yielding classification models that do not account for the vocabulary of this social group. This article describes the methodology used to create a representative data corpus of interactions between Mexican high school and university students, and presents a comparative analysis of the characteristics that influence the quality of a corpus's content in this domain. In addition, the performance achieved by various machine learning models in identifying bullying situations is presented. The best result is reported for the Naive Bayes classifier (F1-score of 0.862), which outperformed deep learning models such as recurrent (F1-score of 0.845) and convolutional (F1-score of 0.807) neural networks.
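As an illustration of the kind of classifier and metric named in the abstract, the following is a minimal sketch of a multinomial Naive Bayes text classifier with Laplace smoothing, together with an F1-score computation. It is not the authors' implementation: the class and function names are hypothetical, and the short English messages and labels are invented toy data, not drawn from the corpus described above.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesSketch:
    """Multinomial Naive Bayes with Laplace (add-alpha) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        n_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            total = sum(self.word_counts[c].values())
            denom = total + self.alpha * len(self.vocab)
            # Start from the log prior, then add smoothed log likelihoods.
            score = math.log(self.doc_counts[c] / n_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # words never seen in training are ignored
                    score += math.log((self.word_counts[c][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = c, score
        return best_label


def f1_score(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented toy data: 1 = bullying, 0 = not bullying.
train_texts = [
    "you are stupid and ugly",
    "nobody likes you loser",
    "you are such a loser",
    "see you at the library",
    "great job on the exam",
    "thanks for the notes",
]
train_labels = [1, 1, 1, 0, 0, 0]

model = NaiveBayesSketch().fit(train_texts, train_labels)
test_texts = ["you are a loser", "thanks for the great notes"]
test_labels = [1, 0]
predictions = [model.predict(t) for t in test_texts]
print(predictions, f1_score(test_labels, predictions))
```

The F1-score is used here because it balances precision and recall, which matters when the bullying class is much rarer than the non-bullying class; scores on a toy set like this one say nothing about real-world performance.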
Pages: 770-779
Page count: 10