Sentiment analysis of COP9-related tweets: a comparative study of pre-trained models and traditional techniques
Cited by: 2
Authors:
Elmitwalli, Sherif [1]
Mehegan, John [1]
Affiliation:
[1] Univ Bath, Dept Hlth, Tobacco Control Res Grp, Bath, England
Source:
FRONTIERS IN BIG DATA | 2024 | Volume 7
Keywords:
sentiment analysis; lexicon-based; Bi-LSTM; BERT; GPT-3; COP9; LLMS; SOCIAL MEDIA; TWITTER
DOI:
10.3389/fdata.2024.1357926
Chinese Library Classification (CLC) number:
TP [Automation Technology, Computer Technology]
Discipline classification code:
0812
Abstract:
Introduction: Sentiment analysis has become a crucial area of research in natural language processing in recent years. The study aims to compare the performance of various sentiment analysis techniques, including lexicon-based, machine learning, Bi-LSTM, BERT, and GPT-3 approaches, using two commonly used datasets, IMDB reviews and Sentiment140. The objective is to identify the best-performing technique for an exemplar dataset: tweets associated with the WHO Framework Convention on Tobacco Control Ninth Conference of the Parties in 2021 (COP9).

Methods: A two-stage evaluation was conducted. In the first stage, the techniques were compared on the standard sentiment analysis datasets using standard evaluation metrics such as accuracy, F1-score, and precision. In the second stage, the best-performing techniques from the first stage were applied to partially annotated COP9 conference-related tweets.

Results: In the first stage, BERT achieved the highest F1-scores (0.9380 for IMDB and 0.8114 for Sentiment140), followed by GPT-3 (0.9119 and 0.7913) and Bi-LSTM (0.8971 and 0.7778). In the second stage, GPT-3 performed best on the partially annotated COP9 conference-related tweets, with an F1-score of 0.8812.

Discussion: The study demonstrates the effectiveness of pre-trained models such as BERT and GPT-3 for sentiment analysis, as they outperformed traditional techniques on the standard datasets. Moreover, the better performance of GPT-3 on the partially annotated COP9 tweets highlights its ability to generalize to domain-specific data with limited annotations. This gives researchers and practitioners a viable option of using pre-trained models for sentiment analysis in scenarios with limited or no annotated data across different domains.
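For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, precision, recall, and F1-score can be computed for a binary sentiment task. It is not the authors' code: the function name `binary_scores`, the `Scores` container, and the toy labels are hypothetical, and the abstract does not state whether the reported F1-scores are binary, macro-averaged, or weighted.

```python
from dataclasses import dataclass


@dataclass
class Scores:
    accuracy: float
    precision: float
    recall: float
    f1: float


def binary_scores(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Scores(accuracy, precision, recall, f1)


# Toy labels (1 = positive, 0 = negative), invented for illustration only.
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_scores(gold, pred)
print(scores)
```

In a two-stage comparison like the one described, the same scoring function would be applied to each technique's predictions on each dataset so that the resulting numbers are directly comparable.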
Pages: 18