Detecting Arabic Cyberbullying Tweets Using Machine Learning

Cited by: 22
Authors
Alduailaj, Alanoud Mohammed [1]
Belghith, Aymen [1]
Affiliations
[1] Saudi Elect Univ, Coll Comp & Informat, Abu Bakr St,POB 93499, Riyadh 11673, Saudi Arabia
Source
MACHINE LEARNING AND KNOWLEDGE EXTRACTION | 2023 / Vol. 5 / Issue 01
Keywords
cyberbullying; classification; detection; machine learning (ML); Support Vector Machine (SVM); Arabic social media;
DOI
10.3390/make5010003
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The advancement of technology has paved the way for a new type of bullying, which often leads to negative stigma in the social setting. Cyberbullying is a cybercrime wherein one individual becomes the target of harassment and hatred. It has recently become more prevalent due to a rise in the usage of social media platforms, and, in some severe situations, it has even led to victims' suicides. In the literature, several cyberbullying detection methods are proposed, but they are mainly focused on word-based data and user account attributes. Furthermore, most of them are related to the English language. Meanwhile, only a few papers have studied cyberbullying detection in Arabic social media platforms. This paper, therefore, aims to use machine learning in the Arabic language for automatic cyberbullying detection. The proposed mechanism identifies cyberbullying using the Support Vector Machine (SVM) classifier algorithm by using a real dataset obtained from YouTube and Twitter to train and test the classifier. Moreover, we include the Farasa tool to overcome text limitations and improve the detection of bullying attacks.
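The pipeline the abstract describes (normalize Arabic text, turn it into features, train an SVM) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the six toy sentences are hand-written and not taken from the paper's YouTube/Twitter dataset, `normalize` is a hypothetical stand-in for the Farasa step (it only strips diacritics and unifies alef forms), and the classifier is a plain linear SVM trained by hinge-loss subgradient descent rather than the authors' actual configuration.

```python
import re

# Toy, hand-written examples for illustration only (NOT the paper's dataset).
# Label 1 = bullying, -1 = not bullying.
TRAIN = [
    ("أنت غبي جدا", 1),
    ("أنت فاشل وغبي", 1),
    ("اسكت يا فاشل", 1),
    ("أنت رائع جدا", -1),
    ("شكرا لك يا صديقي", -1),
    ("يوم جميل ورائع", -1),
]

DIACRITICS = re.compile(r"[\u064B-\u0652]")          # Arabic tashkeel marks
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")  # alef with madda/hamza

def normalize(text):
    """Hypothetical stand-in for the Farasa step: strip diacritics,
    unify alef variants to bare alef, and split on whitespace."""
    text = DIACRITICS.sub("", text)
    text = ALEF_VARIANTS.sub("\u0627", text)
    return text.split()

def build_vocab(texts):
    """Map each distinct token to a feature index."""
    vocab = {}
    for text in texts:
        for tok in normalize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def featurize(text, vocab):
    """Bag-of-words count vector; unknown tokens are ignored."""
    vec = [0.0] * len(vocab)
    for tok in normalize(text):
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    return vec

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1
            for j in range(len(w)):
                grad = lam * w[j] - (yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b

def fit(examples):
    """Build the vocabulary and train on (text, label) pairs."""
    texts = [t for t, _ in examples]
    labels = [y for _, y in examples]
    vocab = build_vocab(texts)
    X = [featurize(t, vocab) for t in texts]
    w, b = train_linear_svm(X, labels)
    return vocab, w, b

def predict(text, vocab, w, b):
    """Return 1 (bullying) or -1 (not bullying) from the sign of the score."""
    x = featurize(text, vocab)
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1

if __name__ == "__main__":
    vocab, w, b = fit(TRAIN)
    for text, label in TRAIN:
        print(label, predict(text, vocab, w, b), text)
```

In practice one would likely replace the hand-rolled trainer with a library SVM and the whitespace tokenizer with a real Arabic segmenter; the sketch only shows how the three stages fit together.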
Pages: 29-42
Page count: 14