Enhancing Moroccan Dialect Sentiment Analysis Through Optimized Preprocessing and Transfer Learning Techniques

Authors
Matrane, Yassir [1]
Benabbou, Faouzia [1]
Ellaky, Zineb [1]
Affiliations
[1] Hassan II Univ Casablanca, Fac Sci Ben MSick, Lab Informat Technol & Modeling, Casablanca 20000, Morocco
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Feature extraction; Sentiment analysis; Accuracy; Natural language processing; Linguistics; Transfer learning; Analytical models; Text categorization; Complexity theory; Tuning; Arabic dialect; Moroccan dialect; preprocessing; deep learning; fine-tuning; ARABIC TEXT CLASSIFICATION; STEMMER;
DOI
10.1109/ACCESS.2024.3514934
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This work investigates the challenges of sentiment analysis for Moroccan Arabic dialect (MD), where the lack of dialect-specific preprocessing methods complicates natural language processing tasks and affects sentiment classification performance. The research evaluates various preprocessing techniques, including stemming and feature extraction, using two main transfer learning approaches: feature extraction with deep learning models and fine-tuning pre-trained models. Experiments were conducted on four MD datasets to assess combinations of stemmers, feature extractors, and architectures. In the feature extraction approach, omitting stemming and employing the QARiB feature extractor with a BiGRU model yielded the highest accuracy on the FB and MAC datasets, reaching 90.45% and 75.50%, respectively. In the fine-tuning approach, DarijaBERT excelled on the FB dataset with an accuracy of 93.37% and an F1-score of 88.55%, while QARiB and AraBERT performed comparably well on the MAC and MSAC datasets. Results suggest that excluding base form reduction methods, such as stemming and lemmatization, during fine-tuning enhances sentiment analysis performance in MD, highlighting the limitations of Modern Standard Arabic techniques for MD processing. This study provides valuable insights for improving natural language processing (NLP) applications in Arabic dialects, particularly in sentiment analysis, by optimizing model performance without relying on standard preprocessing methods.
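The abstract reports both an accuracy and an F1-score for the same model, and the two differ by several points. As a minimal sketch of how such a gap can arise, the following computes both metrics on a small, invented set of labels. The macro averaging, the function names, and the toy data are assumptions made for illustration; the abstract does not state which F1 variant the authors used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_for_label(y_true, y_pred, label):
    """F1-score of a single class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores (assumed averaging)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Invented, imbalanced toy data: 8 positive and 2 negative examples.
y_true = ["pos"] * 8 + ["neg"] * 2
# The hypothetical classifier misses one of the two negative examples.
y_pred = ["pos"] * 8 + ["pos", "neg"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.4f}, macro F1 = {f1:.4f}")
```

On imbalanced data a single error on the minority class barely moves accuracy but lowers that class's F1 sharply, which is one plausible reason the two figures in the abstract are not equal.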
Pages: 187756 - 187777
Page count: 22
Related papers
50 in total
  • [1] Enhancing Sentiment Analysis on Social Media with Novel Preprocessing Techniques
    Eljil, Khouloud Safi
    Nait-Abdesselam, Farid
    Hamouda, Essia
    Hamdi, Mohamed
    JOURNAL OF ADVANCES IN INFORMATION TECHNOLOGY, 2023, 14 (06) : 1206 - 1213
  • [2] Improving Sentiment Analysis Performance on Imbalanced Moroccan Dialect Datasets Using Resample and Feature Extraction Techniques
    Nassr, Zineb
    Benabbou, Faouzia
    Sael, Nawal
    Hamim, Touria
    INFORMATION, 2025, 16 (01)
  • [3] Sentiment Analysis of Saudi Dialect Using Deep Learning Techniques
    Alahmary, Rahma M.
    Al-Dossari, Hmood Z.
    Emam, Ahmed Z.
    2019 INTERNATIONAL CONFERENCE ON ELECTRONICS, INFORMATION, AND COMMUNICATION (ICEIC), 2019, : 562 - 567
  • [4] A Review on Text Sentiment Analysis With Machine Learning and Deep Learning Techniques
    Mamani-Coaquira, Yonatan
    Villanueva, Edwin
    IEEE ACCESS, 2024, 12 : 193115 - 193130
  • [5] Enhancing deep learning sentiment analysis with ensemble techniques in social applications
    Araque, Oscar
    Corcuera-Platas, Ignacio
    Sanchez-Rada, J. Fernando
    Iglesias, Carlos A.
    EXPERT SYSTEMS WITH APPLICATIONS, 2017, 77 : 236 - 246
  • [6] Enhancing Ovarian Tumor Dataset Analysis Through Data Mining Preprocessing Techniques
    Shetty, Roopashri
    Geetha, M.
    Dinesh Acharya, U.
    Shyamala, G.
    IEEE ACCESS, 2024, 12 : 122300 - 122312
  • [7] The effect of preprocessing techniques on Twitter Sentiment Analysis
    Krouska, Akrivi
    Troussas, Christos
    Virvou, Maria
    2016 7TH INTERNATIONAL CONFERENCE ON INFORMATION, INTELLIGENCE, SYSTEMS & APPLICATIONS (IISA), 2016,
  • [8] Sentiment Analysis on Moroccan Dialect of Arabic Combining NLP and ML Methods
    Ladrham, Khalil
    Gueddah, Hicham
    ARABIC LANGUAGE PROCESSING: FROM THEORY TO PRACTICE, ICALP 2023, PT I, 2025, 2339 : 3 - 16
  • [10] Sentiment analysis dataset in Moroccan dialect: bridging the gap between Arabic and Latin scripted dialect
    Jbel, Mouad
    Jabrane, Mourad
    Hafidi, Imad
    Metrane, Abdulmutallib
    LANGUAGE RESOURCES AND EVALUATION, 2024, : 1401 - 1430