Sentiment analysis of COP9-related tweets: a comparative study of pre-trained models and traditional techniques

Cited by: 2

Authors:

Elmitwalli, Sherif [1]

Mehegan, John [1]

Affiliations:

[1] Univ Bath, Dept Hlth, Tobacco Control Res Grp, Bath, England

Source:

FRONTIERS IN BIG DATA | 2024 / Volume 7

Keywords:

sentiment analysis; lexicon-based; Bi-LSTM; BERT; GPT-3; COP9; LLMs; SOCIAL MEDIA; TWITTER

DOI:

10.3389/fdata.2024.1357926

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Introduction: Sentiment analysis has become a crucial area of research in natural language processing in recent years. The study aims to compare the performance of various sentiment analysis techniques, including lexicon-based, machine learning, Bi-LSTM, BERT, and GPT-3 approaches, using two commonly used datasets, IMDB reviews and Sentiment140. The objective is to identify the best-performing technique for an exemplar dataset: tweets associated with the WHO Framework Convention on Tobacco Control Ninth Conference of the Parties in 2021 (COP9).

Methods: A two-stage evaluation was conducted. In the first stage, various techniques were compared on standard sentiment analysis datasets using standard evaluation metrics such as accuracy, F1-score, and precision. In the second stage, the best-performing techniques from the first stage were applied to partially annotated COP9 conference-related tweets.

Results: In the first stage, BERT achieved the highest F1-scores (0.9380 for IMDB and 0.8114 for Sentiment140), followed by GPT-3 (0.9119 and 0.7913) and Bi-LSTM (0.8971 and 0.7778). In the second stage, GPT-3 performed best on the partially annotated COP9 conference-related tweets, with an F1-score of 0.8812.

Discussion: The study demonstrates the effectiveness of pre-trained models like BERT and GPT-3 for sentiment analysis tasks, outperforming traditional techniques on standard datasets. Moreover, the better performance of GPT-3 on the partially annotated COP9 tweets highlights its ability to generalize well to domain-specific data with limited annotations. This gives researchers and practitioners a viable option of using pre-trained models for sentiment analysis in scenarios with limited or no annotated data across different domains.
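
Note: The abstract names accuracy, precision, and F1-score as its evaluation metrics but does not say how they were computed (for example, which averaging scheme was used). As a minimal sketch, assuming a binary positive/negative labeling, the metrics could be derived from predictions as below. The function name `binary_metrics` and the toy labels are hypothetical illustrations, not data or code from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Assumption: binary labels; the study's own averaging scheme is not
    stated in the abstract, so this is only an illustrative sketch.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up toy labels (1 = positive sentiment, 0 = negative sentiment).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

F1 is the harmonic mean of precision and recall, so it only rewards a model that does well on both, which is why it is often reported alongside accuracy on imbalanced tweet data.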

Pages: 18
