Natural Language Processing and Sentiment Analysis on Bangla Social Media Comments on Russia-Ukraine War Using Transformers

Cited by: 8
Authors
Hasan, Mahmud [1]
Islam, Labiba [1]
Jahan, Ismat [1]
Meem, Sabrina Mannan [1]
Rahman, Rashedur M. [1]
Affiliations
[1] North South Univ, Dept Elect & Comp Engn, Dhaka 1229, Bangladesh
Keywords
Natural language processing; sentiment analysis; transformers; Russia-Ukraine war
DOI
10.1142/S2196888823500021
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The Bangla language ranks seventh in the list of most spoken languages, with 265 million native and non-native speakers around the world, and is the second most spoken Indo-Aryan language after Hindi. However, research on tasks such as sentiment analysis (SA) in Bangla has grown slowly compared to SA in English, because there are not enough high-quality, publicly available datasets for training language models on text classification tasks in Bangla. In this paper, we propose an annotated Bangla dataset for sentiment analysis on the ongoing Ukraine-Russia war. The dataset was developed by collecting Bangla comments from various videos of three prominent YouTube TV news channels of Bangladesh covering the ongoing conflict. A total of 10,861 Bangla comments were collected and labeled with three sentiment polarities, namely Neutral, Pro-Ukraine (Positive), and Pro-Russia (Negative). A benchmark classifier was developed by experimenting with several transformer-based language models, all pre-trained on an unlabeled Bangla corpus. The models were fine-tuned using our procured dataset. Hyperparameter optimization was performed on all five transformer language models: BanglaBERT, XLM-RoBERTa-base, XLM-RoBERTa-large, Distil-mBERT, and mBERT. Each model was evaluated and analyzed using several evaluation metrics: F1 score, accuracy, and AIC (Akaike Information Criterion). The best-performing model achieved an accuracy of 86% with an F1 score of 0.82. Based on accuracy, F1 score, and AIC, BanglaBERT outperforms the baseline and all the other transformer-based classifiers.
Pages: 329-356
Number of pages: 28