Enhancing Moroccan Dialect Sentiment Analysis Through Optimized Preprocessing and Transfer Learning Techniques

Cited by: 0
Authors
Matrane, Yassir [1]
Benabbou, Faouzia [1]
Ellaky, Zineb [1]
Affiliation
[1] Hassan II Univ Casablanca, Fac Sci Ben MSick, Lab Informat Technol & Modeling, Casablanca 20000, Morocco
Source
IEEE Access | 2024 / Vol. 12
Keywords
Feature extraction; Sentiment analysis; Accuracy; Natural language processing; Linguistics; Transfer learning; Analytical models; Text categorization; Complexity theory; Tuning; Arabic dialect; Moroccan dialect; preprocessing; deep learning; fine-tuning; ARABIC TEXT CLASSIFICATION; STEMMER;
DOI
10.1109/ACCESS.2024.3514934
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This work investigates the challenges of sentiment analysis for the Moroccan Arabic dialect (MD), where the lack of dialect-specific preprocessing methods complicates natural language processing (NLP) tasks and degrades sentiment classification performance. The research evaluates preprocessing techniques, including stemming and feature extraction, under two transfer learning approaches: feature extraction with deep learning models, and fine-tuning of pre-trained models. Experiments were conducted on four MD datasets to assess combinations of stemmers, feature extractors, and architectures. In the feature extraction approach, omitting stemming and using the QARiB feature extractor with a BiGRU model yielded the highest accuracy on the FB and MAC datasets, reaching 90.45% and 75.50%, respectively. In the fine-tuning approach, DarijaBERT performed best on the FB dataset with an accuracy of 93.37% and an F1-score of 88.55%, while QARiB and AraBERT performed comparably well on the MAC and MSAC datasets. The results suggest that excluding base-form reduction methods, such as stemming and lemmatization, during fine-tuning improves sentiment analysis performance in MD, highlighting the limitations of Modern Standard Arabic techniques for MD processing. The study offers insights for improving NLP applications in Arabic dialects, particularly sentiment analysis, by optimizing model performance without relying on standard preprocessing methods.
Pages: 187756-187777
Page count: 22