Evaluating transfer learning approach for detecting Arabic anti-refugee/migrant speech on social media

Cited by: 6
Authors
Mohdeb, Djamila [1]
Laifa, Meriem [1, 2]
Zerargui, Fayssal [1]
Benzaoui, Omar [1]
Affiliations
[1] Univ Bordj Bou Arreridj, Bordj Bou Arreridj, Algeria
[2] Lab Informat & Its Applicat Msila LIAM, Msila, Algeria
Keywords
Hate speech; Anti-migrant speech; Algerian dialectal Arabic; African migrants; Transfer learning; Arabic natural language processing
DOI
10.1108/AJIM-10-2021-0293
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Purpose: This study investigates eight research questions concerning the analysis and detection of dialectal Arabic hate speech targeting African refugees and illegal migrants in the Algerian YouTube space.
Design/methodology/approach: Transfer learning, which currently represents the state of the art in natural language processing tasks, was used to classify and detect hate speech in Algerian dialectal Arabic. In addition, a descriptive analysis was conducted to answer the analytical research questions, which aim to measure and evaluate the presence of anti-refugee/migrant discourse on the YouTube platform.
Findings: Data analysis revealed a gradual, modest increase in the number of hateful anti-refugee/migrant comments on YouTube from 2014, a sharp rise in 2017, and a sharp decline in the following years through 2021. Furthermore, the results of classifying hate content with multilingual and monolingual pre-trained language transformers show that the monolingual transformer AraBERT performs well in comparison with the monodialectal transformer DziriBERT and the cross-lingual transformers mBERT and XLM-R.
Originality/value: Automatic hate speech detection in languages other than English is a challenging task that the literature has tried to address with various machine learning approaches. Although the recent approach of cross-lingual transfer learning offers a promising solution, applying it to the Arabic language, and to dialectal Arabic in particular, makes the problem even more challenging. Our results shed new light on the actual ability of the transfer learning approach to deal with low-resource languages that differ widely from high-resource languages as well as from other Latin-based, low-resource languages.
Pages: 1070-1088
Number of pages: 19