AraXLNet: pre-trained language model for sentiment analysis of Arabic

Cited by: 0
Authors
Alhanouf Alduailej
Abdulrahman Alothaim
Affiliation
[1] King Saud University, Department of Information Systems, College of Computer and Information Sciences
Source
Journal of Big Data, 2022, 9 (01)
Keywords
Sentiment analysis; Language models; NLP; XLNet; AraXLNet; Text mining
DOI
Not available
CLC number
Subject classification code
Abstract
Arabic is a complex language with few resources, which makes accurate text classification tasks such as sentiment analysis challenging. The main goal of sentiment analysis is to determine whether the overall orientation of a given text is positive, negative, or neutral. Recently, language models have shown strong results in improving the accuracy of text classification in English. These models are pre-trained on a large dataset and then fine-tuned on downstream tasks. In particular, XLNet has achieved state-of-the-art results on diverse natural language processing (NLP) tasks in English. In this paper, we hypothesize that comparable success can be achieved in Arabic. The paper aims to support this hypothesis by producing the first XLNet-based language model for Arabic, called AraXLNet, and demonstrating its use in Arabic sentiment analysis to improve prediction accuracy. The results show that the proposed model, AraXLNet, with the Farasa segmenter achieved accuracies of 94.78%, 93.01%, and 85.77% on the Arabic sentiment analysis task across multiple benchmark datasets. These results outperform AraBERT, which obtained 84.65%, 92.13%, and 85.05% on the same datasets, respectively. The improvement held across the benchmark datasets, offering a promising advance for Arabic text classification tasks.
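The comparison in the abstract can be made concrete by computing the per-dataset gap between the two models. The sketch below uses only the six accuracy figures quoted in the abstract; the function name `accuracy_gains` and the "dataset 1..3" labels are placeholders introduced here, since the abstract does not name the benchmark datasets.

```python
# Accuracies (%) as reported in the abstract, in the order given there.
# The abstract does not name the datasets, so they are indexed 1..3 here.
ARAXLNET_FARASA = [94.78, 93.01, 85.77]
ARABERT = [84.65, 92.13, 85.05]


def accuracy_gains(new, baseline):
    """Return per-dataset differences in percentage points, rounded to 2 places."""
    return [round(n - b, 2) for n, b in zip(new, baseline)]


gains = accuracy_gains(ARAXLNET_FARASA, ARABERT)
for i, gain in enumerate(gains, start=1):
    print(f"dataset {i}: +{gain:.2f} percentage points")
```

On these figures the gap is about ten percentage points on the first dataset and under one point on the other two.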
Related papers
  • [1] AraXLNet: pre-trained language model for sentiment analysis of Arabic
    Alduailej, Alhanouf
    Alothaim, Abdulrahman
    JOURNAL OF BIG DATA, 2022, 9 (01)
  • [2] Leveraging Pre-trained Language Model for Speech Sentiment Analysis
    Shon, Suwon
    Brusco, Pablo
    Pan, Jing
    Han, Kyu J.
    Watanabe, Shinji
    INTERSPEECH 2021, 2021, : 3420 - 3424
  • [3] A Comparative Study of Pre-trained Word Embeddings for Arabic Sentiment Analysis
    Zouidine, Mohamed
    Khalil, Mohammed
    2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 1243 - 1248
  • [4] Incorporating emoji sentiment information into a pre-trained language model for Chinese and English sentiment analysis
    Huang, Jiaming
    Li, Xianyong
    Li, Qizhi
    Du, Yajun
    Fan, Yongquan
    Chen, Xiaoliang
    Huang, Dong
    Wang, Shumin
    Li, Xianyong
    INTELLIGENT DATA ANALYSIS, 2024, 28 (06) : 1601 - 1625
  • [5] Aspect Based Sentiment Analysis by Pre-trained Language Representations
    Liang Tianxin
    Yang Xiaoping
    Zhou Xibo
    Wang Bingqian
    2019 IEEE INTL CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, BIG DATA & CLOUD COMPUTING, SUSTAINABLE COMPUTING & COMMUNICATIONS, SOCIAL COMPUTING & NETWORKING (ISPA/BDCLOUD/SOCIALCOM/SUSTAINCOM 2019), 2019, : 1262 - 1265
  • [6] TwitterBERT: Framework for Twitter Sentiment Analysis Based on Pre-trained Language Model Representations
    Azzouza, Noureddine
    Akli-Astouati, Karima
    Ibrahim, Roliana
    EMERGING TRENDS IN INTELLIGENT COMPUTING AND INFORMATICS: DATA SCIENCE, INTELLIGENT INFORMATION SYSTEMS AND SMART COMPUTING, 2020, 1073 : 428 - 437
  • [7] Pre-Trained Language Model Ensemble for Arabic Fake News Detection
    Al-Zahrani, Lama
    Al-Yahya, Maha
    MATHEMATICS, 2024, 12 (18)
  • [8] Comparing Pre-Trained Language Model for Arabic Hate Speech Detection
    Daouadi, Kheir Eddine
    Boualleg, Yaakoub
    Guehairia, Oussama
    COMPUTACION Y SISTEMAS, 2024, 28 (02) : 681 - 693
  • [9] Enhancing Turkish Sentiment Analysis Using Pre-Trained Language Models
    Koksal, Omer
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [10] Incorporating Dynamic Semantics into Pre-Trained Language Model for Aspect-based Sentiment Analysis
    Zhang, Kai
    Zhang, Kun
    Zhang, Mengdi
    Zhao, Hongke
    Liu, Qi
    Wu, Wei
    Chen, Enhong
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3599 - 3610