Detecting Arabic Offensive Language in Microblogs Using Domain-Specific Word Embeddings and Deep Learning

Cited by: 4
Authors
Aljuhani, Khulood O. [1]
Alyoubi, Khaled H. [1]
Alotaibi, Fahd S. [1]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol, Informat Syst Dept, Jeddah, Saudi Arabia
Source
TEHNICKI GLASNIK-TECHNICAL JOURNAL | 2022 / Vol. 16 / No. 3
Keywords
Arabic Natural Language Processing; Arabic Tweets; Offensive Language Detection; Offensive Language; Word Embeddings
DOI
10.31803/tg-20220305120018
Chinese Library Classification number
T [Industrial Technology]
Discipline classification code
08
Abstract
In recent years, social media networks have emerged as key platforms for opinion expression, communication, and content distribution. However, users often take advantage of perceived anonymity on these platforms to share offensive or hateful content. Offensive language has therefore become a significant problem as online communication and social media have grown in popularity, and it has attracted considerable attention to methods for detecting offensive content and preventing its spread on online social networks. This paper aims to develop an effective Arabic offensive language detection model by employing deep learning together with semantic and contextual features. It proposes a deep learning approach that uses a bidirectional long short-term memory (BiLSTM) model and domain-specific word embeddings extracted from an Arabic offensive dataset. The detection approach was evaluated on an Arabic dataset collected from Twitter. The best result, an accuracy of 0.93 (93%), was obtained with the BiLSTM model trained on a combination of domain-specific and domain-agnostic word embeddings.
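The abstract does not say how the domain-specific and domain-agnostic embeddings are combined. One common option is per-token concatenation, sketched below under that assumption. The function names, the toy tokens, and the vector sizes are all hypothetical and are not taken from the paper; the resulting matrix is the kind of input that could initialise the embedding layer of a BiLSTM classifier.

```python
from typing import Dict, List

Vector = List[float]


def combine_embeddings(
    token: str,
    domain_vecs: Dict[str, Vector],
    general_vecs: Dict[str, Vector],
    domain_dim: int,
    general_dim: int,
) -> Vector:
    """Concatenate the two vectors for a token (assumed combination strategy).

    A token missing from either lookup table gets a zero vector for that part.
    """
    d = domain_vecs.get(token, [0.0] * domain_dim)
    g = general_vecs.get(token, [0.0] * general_dim)
    return d + g  # list concatenation: length is domain_dim + general_dim


def build_embedding_matrix(
    vocab: List[str],
    domain_vecs: Dict[str, Vector],
    general_vecs: Dict[str, Vector],
    domain_dim: int,
    general_dim: int,
) -> List[Vector]:
    """Return one combined row per vocabulary entry, in vocabulary order."""
    return [
        combine_embeddings(t, domain_vecs, general_vecs, domain_dim, general_dim)
        for t in vocab
    ]


# Toy data (hypothetical): 2-d domain-specific and 3-d general-purpose vectors.
domain_vecs = {"tok_a": [0.1, 0.2], "tok_b": [0.3, 0.4]}
general_vecs = {"tok_a": [1.0, 2.0, 3.0], "tok_c": [4.0, 5.0, 6.0]}
vocab = ["tok_a", "tok_b", "tok_c"]

matrix = build_embedding_matrix(vocab, domain_vecs, general_vecs, 2, 3)
```

Concatenation keeps both signals intact and lets the downstream model learn how to weight them; averaging or a learned projection would be alternatives, and the paper may use a different scheme.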
Pages: 394-400
Page count: 7