Natural Language Processing and Sentiment Analysis on Bangla Social Media Comments on Russia-Ukraine War Using Transformers

Cited by: 8

Authors:

Hasan, Mahmud [1]

Islam, Labiba [1]

Jahan, Ismat [1]

Meem, Sabrina Mannan [1]

Rahman, Rashedur M. [1]

Affiliation:

[1] North South Univ, Dept Elect & Comp Engn, Dhaka 1229, Bangladesh

Source:

VIETNAM JOURNAL OF COMPUTER SCIENCE | 2023, Vol. 10, No. 03

Keywords:

Natural language processing; sentiment analysis; transformers; Russia-Ukraine war

DOI:

10.1142/S2196888823500021

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Bangla ranks seventh among the world's most spoken languages, with 265 million native and non-native speakers, and is the second most spoken Indo-Aryan language after Hindi. However, research on tasks such as sentiment analysis (SA) in Bangla has grown slowly compared to SA in English, because there are not enough high-quality, publicly available datasets for training language models on Bangla text classification tasks. In this paper, we propose an annotated Bangla dataset for sentiment analysis on the ongoing Ukraine-Russia war. The dataset was developed by collecting Bangla comments from videos posted by three prominent Bangladeshi TV news channels on YouTube that report on the ongoing conflict. A total of 10,861 Bangla comments were collected and labeled with one of three sentiment polarities: Neutral, Pro-Ukraine (Positive), or Pro-Russia (Negative). A benchmark classifier was developed by experimenting with several transformer-based language models, all pre-trained on an unlabeled Bangla corpus. The models were fine-tuned on our dataset. Hyperparameter optimization was performed on all five transformer language models: BanglaBERT, XLM-RoBERTa-base, XLM-RoBERTa-large, Distil-mBERT, and mBERT. Each model was evaluated using several metrics: F1 score, accuracy, and AIC (Akaike Information Criterion). The best-performing model achieved an accuracy of 86% with an F1 score of 0.82. Based on accuracy, F1 score, and AIC, BanglaBERT outperforms the baseline and all the other transformer-based classifiers.
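The three evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the toy labels and probabilities, the parameter count `num_params`, the choice of macro-averaged F1, and the use of the standard definition AIC = 2k - 2 ln L (with ln L taken as the log-likelihood of the true classes under the model's predicted probabilities) are all assumptions made for illustration. The abstract does not state how the paper computes F1 averaging or AIC.

```python
import math

# The three sentiment classes named in the abstract.
LABELS = ["Neutral", "Pro-Ukraine", "Pro-Russia"]


def accuracy(y_true, y_pred):
    """Fraction of comments whose predicted label matches the gold label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (macro averaging is an assumption)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def aic(y_true, probs, num_params, labels=LABELS):
    """Standard AIC = 2k - 2 ln L, with ln L the summed log-probability
    assigned to each true class. Lower is better."""
    log_lik = sum(
        math.log(max(p[labels.index(t)], 1e-12)) for t, p in zip(y_true, probs)
    )
    return 2 * num_params - 2 * log_lik


# Hypothetical toy data: gold labels and per-class predicted probabilities
# in the order given by LABELS.
y_true = ["Neutral", "Pro-Ukraine", "Pro-Russia", "Neutral"]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.2, 0.3],
    [0.6, 0.3, 0.1],
]
# Predicted label = class with the highest probability.
y_pred = [LABELS[max(range(len(LABELS)), key=p.__getitem__)] for p in probs]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
aic_value = aic(y_true, probs, num_params=10)  # num_params is a placeholder
print(acc, f1, aic_value)
```

Because AIC penalizes parameter count, a comparison between models of very different sizes (for example a base and a large variant) depends heavily on how `num_params` is defined, which is why the value here is only a placeholder.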


Pages: 329-356

Page count: 28
