Sentiment analysis of COP9-related tweets: a comparative study of pre-trained models and traditional techniques

Cited by: 2

Authors:

Elmitwalli, Sherif [1]

Mehegan, John [1]

Affiliations:

[1] Univ Bath, Dept Hlth, Tobacco Control Res Grp, Bath, England

Source:

FRONTIERS IN BIG DATA | 2024 / Volume 7

Keywords:

sentiment analysis; lexicon-based; Bi-LSTM; BERT; GPT-3; COP9; LLMs; social media; Twitter

DOI:

10.3389/fdata.2024.1357926

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Introduction: Sentiment analysis has become a crucial area of research in natural language processing in recent years. The study aims to compare the performance of various sentiment analysis techniques, including lexicon-based, machine learning, Bi-LSTM, BERT, and GPT-3 approaches, using two commonly used datasets, IMDB reviews and Sentiment140. The objective is to identify the best-performing technique for an exemplar dataset, tweets associated with the WHO Framework Convention on Tobacco Control Ninth Conference of the Parties in 2021 (COP9).

Methods: A two-stage evaluation was conducted. In the first stage, various techniques were compared on standard sentiment analysis datasets using standard evaluation metrics such as accuracy, F1-score, and precision. In the second stage, the best-performing techniques from the first stage were applied to partially annotated COP9 conference-related tweets.

Results: In the first stage, BERT achieved the highest F1-scores (0.9380 for IMDB and 0.8114 for Sentiment140), followed by GPT-3 (0.9119 and 0.7913) and Bi-LSTM (0.8971 and 0.7778). In the second stage, GPT-3 performed the best for sentiment analysis on partially annotated COP9 conference-related tweets, with an F1-score of 0.8812.

Discussion: The study demonstrates the effectiveness of pre-trained models like BERT and GPT-3 for sentiment analysis tasks, outperforming traditional techniques on standard datasets. Moreover, the better performance of GPT-3 on the partially annotated COP9 tweets highlights its ability to generalize well to domain-specific data with limited annotations. This provides researchers and practitioners with a viable option of using pre-trained models for sentiment analysis in scenarios with limited or no annotated data across different domains.
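
The evaluation metrics named in the abstract (accuracy, precision, F1-score) can be illustrated with a minimal sketch. This is not the authors' code: the function name `binary_metrics`, the binary positive/negative label encoding, and the toy labels are all assumptions made for illustration, and the record does not state whether the study used binary, macro, or weighted averaging.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as the positive sentiment label.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (1 = positive, 0 = negative); NOT data from the study.
gold = [1, 0, 1, 1, 0, 1, 0, 0]
pred = [1, 0, 0, 1, 1, 1, 0, 0]
scores = binary_metrics(gold, pred)
print(scores)
```

Because F1 combines precision and recall, it is less sensitive than accuracy to class imbalance, which may be one reason the abstract reports F1-scores as its headline figures.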


Pages: 18

