Sentiment analysis of COP9-related tweets: a comparative study of pre-trained models and traditional techniques

Cited by: 2
Authors
Elmitwalli, Sherif [1]
Mehegan, John [1]
Affiliations
[1] Univ Bath, Dept Hlth, Tobacco Control Res Grp, Bath, England
Source
FRONTIERS IN BIG DATA | 2024 / Volume 7
Keywords
sentiment analysis; lexicon-based; Bi-LSTM; BERT; GPT-3; COP9; LLMS; SOCIAL MEDIA; TWITTER;
DOI
10.3389/fdata.2024.1357926
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Introduction: Sentiment analysis has become a crucial area of research in natural language processing in recent years. The study aims to compare the performance of various sentiment analysis techniques, including lexicon-based, machine learning, Bi-LSTM, BERT, and GPT-3 approaches, using two commonly used datasets, IMDB reviews and Sentiment140. The objective is to identify the best-performing technique for an exemplar dataset, tweets associated with the WHO Framework Convention on Tobacco Control Ninth Conference of the Parties in 2021 (COP9).

Methods: A two-stage evaluation was conducted. In the first stage, various techniques were compared on standard sentiment analysis datasets using standard evaluation metrics such as accuracy, F1-score, and precision. In the second stage, the best-performing techniques from the first stage were applied to partially annotated COP9 conference-related tweets.

Results: In the first stage, BERT achieved the highest F1-scores (0.9380 for IMDB and 0.8114 for Sentiment140), followed by GPT-3 (0.9119 and 0.7913) and Bi-LSTM (0.8971 and 0.7778). In the second stage, GPT-3 performed the best for sentiment analysis on partially annotated COP9 conference-related tweets, with an F1-score of 0.8812.

Discussion: The study demonstrates the effectiveness of pre-trained models like BERT and GPT-3 for sentiment analysis tasks, outperforming traditional techniques on standard datasets. Moreover, the better performance of GPT-3 on the partially annotated COP9 tweets highlights its ability to generalize well to domain-specific data with limited annotations. This provides researchers and practitioners with a viable option of using pre-trained models for sentiment analysis in scenarios with limited or no annotated data across different domains.
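The evaluation metrics named in the abstract (accuracy, precision, and F1-score) can be illustrated with a minimal sketch for binary sentiment labels. This is not the authors' evaluation code: the function name `binary_metrics` and the toy labels are hypothetical, and the encoding 1 = positive, 0 = negative is an assumption made only for this example.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumes 1 = positive sentiment and 0 = negative sentiment
    (an illustrative convention, not taken from the paper).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Because F1 combines precision and recall, it is less sensitive to class imbalance than accuracy alone, which may be one reason the abstract reports F1-scores as the headline figure.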
Pages: 18