BERT-Based Sentiment Analysis for Low-Resourced Languages: A Case Study of Urdu Language

Cited by: 4
Authors
Ashraf, Muhammad Rehan [1, 2]
Jana, Yasmeen [2]
Umer, Qasim [2, 3]
Jaffar, M. Arfan [1 ]
Chung, Sungwook [4 ]
Ramay, Waheed Yousuf [5 ]
Affiliations
[1] Super Univ, Dept Comp Sci, Lahore 54000, Pakistan
[2] COMSATS Univ Islamabad, Dept Comp Sci, Vehari 61000, Pakistan
[3] Hanyang Univ, Dept Comp Sci, Seoul 04763, South Korea
[4] Changwon Natl Univ, Dept Comp Engn, Chang Won 51140, South Korea
[5] Air Univ, Dept Comp Sci, Multan 60000, Pakistan
Source
IEEE ACCESS | 2023 / Vol. 11
Keywords
Sentiment analysis; Support vector machines; Social networking (online); Sports; Blogs; Encoding; Natural language processing; Linguistics; Urdu; BERT; classification; sentiment analysis; ROMAN URDU; CLASSIFICATION; MACHINE;
DOI
10.1109/ACCESS.2023.3322101
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Sentiment analysis plays an important role in research by providing insight into public opinion. However, most sentiment analysis studies focus on English, leaving a research gap for low-resourced and regional languages such as Persian, Pashto, and Urdu. Moreover, computational linguists face the challenge of developing lexical resources for these languages. This paper therefore presents a deep learning-based approach for Urdu Text Sentiment Analysis (USA-BERT) that leverages Bidirectional Encoder Representations from Transformers (BERT), and introduces an Urdu Dataset for Sentiment Analysis-23 (UDSA-23). USA-BERT first preprocesses the Urdu reviews with the BERT tokenizer. Second, it creates BERT embeddings for each Urdu review. Third, given the BERT embeddings, it fine-tunes a deep learning classifier (BERT). Finally, it applies the Pareto principle to two datasets, the state-of-the-art UCSA-21 and UDSA-23, to assess USA-BERT. The assessment results show that USA-BERT significantly outperforms existing methods, improving accuracy and F-measure by up to 26.09% and 25.87%, respectively.
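The evaluation step described in the abstract can be illustrated with a minimal sketch. It assumes that "Pareto principle" refers to the common 80/20 train/test split, which the abstract does not state explicitly, and it computes accuracy and F-measure for a binary labeling. The function names (`pareto_split`, `accuracy`, `f_measure`) and the `"pos"` label are hypothetical, chosen for illustration only; this is not the authors' code and it does not include the BERT model itself.

```python
import random


def pareto_split(samples, train_fraction=0.8, seed=0):
    """Shuffle and split samples into train/test parts.

    Assumption: the 80/20 ratio is one reading of the "Pareto principle";
    the source abstract does not give the exact ratio.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f_measure(y_true, y_pred, positive="pos"):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy stand-ins for labeled reviews; real data would be Urdu text.
    reviews = [(f"review-{i}", "pos" if i % 2 else "neg") for i in range(10)]
    train, test = pareto_split(reviews)
    print(len(train), len(test))

    gold = ["pos", "neg", "pos", "neg"]
    pred = ["pos", "pos", "pos", "neg"]
    print(accuracy(gold, pred), f_measure(gold, pred))
```

In practice the predictions would come from the fine-tuned classifier applied to the held-out 20% of each dataset; the toy labels here only show how the two reported metrics are derived.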
Pages: 110245 - 110259
Page count: 15