Sentiment analysis of COP9-related tweets: a comparative study of pre-trained models and traditional techniques

Cited by: 2
Authors
Elmitwalli, Sherif [1]
Mehegan, John [1]
Affiliations
[1] Univ Bath, Dept Hlth, Tobacco Control Res Grp, Bath, England
Source
FRONTIERS IN BIG DATA | 2024 / Vol. 7
Keywords
sentiment analysis; lexicon-based; Bi-LSTM; BERT; GPT-3; COP9; LLMs; social media; Twitter
DOI
10.3389/fdata.2024.1357926
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Introduction: Sentiment analysis has become a crucial area of research in natural language processing in recent years. The study aims to compare the performance of various sentiment analysis techniques, including lexicon-based, machine learning, Bi-LSTM, BERT, and GPT-3 approaches, using two commonly used datasets, IMDB reviews and Sentiment140. The objective is to identify the best-performing technique for an exemplar dataset, tweets associated with the WHO Framework Convention on Tobacco Control Ninth Conference of the Parties in 2021 (COP9).
Methods: A two-stage evaluation was conducted. In the first stage, various techniques were compared on standard sentiment analysis datasets using standard evaluation metrics such as accuracy, F1-score, and precision. In the second stage, the best-performing techniques from the first stage were applied to partially annotated COP9 conference-related tweets.
Results: In the first stage, BERT achieved the highest F1-scores (0.9380 for IMDB and 0.8114 for Sentiment140), followed by GPT-3 (0.9119 and 0.7913) and Bi-LSTM (0.8971 and 0.7778). In the second stage, GPT-3 performed the best for sentiment analysis on partially annotated COP9 conference-related tweets, with an F1-score of 0.8812.
Discussion: The study demonstrates the effectiveness of pre-trained models like BERT and GPT-3 for sentiment analysis tasks, outperforming traditional techniques on standard datasets. Moreover, the better performance of GPT-3 on the partially annotated COP9 tweets highlights its ability to generalize well to domain-specific data with limited annotations. This provides researchers and practitioners with a viable option of using pre-trained models for sentiment analysis in scenarios with limited or no annotated data across different domains.
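The evaluation metrics named in the abstract (accuracy, precision, F1-score) can be illustrated with a minimal sketch for binary sentiment labels. This is not the authors' code: the function name `binary_metrics`, the label encoding (1 = positive, 0 = negative), and the toy labels are assumptions made for illustration, and the record does not say whether the reported F1-scores are binary, macro, or weighted averages.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; assumes y_true and y_pred
    have the same non-zero length.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels (assumed): 1 = positive tweet, 0 = negative tweet.
    gold = [1, 1, 0, 0, 1]
    predicted = [1, 0, 0, 1, 1]
    print(binary_metrics(gold, predicted))
```

Under this sketch, comparing techniques as in the first stage would amount to calling `binary_metrics` once per model on the same held-out labels and ranking by the `f1` value.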
Pages: 18
Related papers
22 in total
  • [1] A Comparative Study of Pre-trained Word Embeddings for Arabic Sentiment Analysis
    Zouidine, Mohamed
    Khalil, Mohammed
    2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 1243 - 1248
  • [2] A Comparative Study of Different Pre-trained Language Models for Sentiment Analysis of Human-Computer Negotiation Dialogue
    Dong, Jing
    Luo, Xudong
    Zhu, Junlin
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT IV, KSEM 2024, 2024, 14887 : 301 - 317
  • [3] A Comparative Study on Pre-Trained Models Based on BERT
    Zhang, Minghua
    2024 6TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING, ICNLP 2024, 2024, : 326 - 330
  • [4] Pre-trained Word Embeddings for Arabic Aspect-Based Sentiment Analysis of Airline Tweets
    Ashi, Mohammed Matuq
    Siddiqui, Muazzam Ahmed
    Nadeem, Farrukh
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON ADVANCED INTELLIGENT SYSTEMS AND INFORMATICS 2018, 2019, 845 : 241 - 251
  • [5] Aspect Based Sentiment Analysis using French Pre-Trained Models
    Essebbar, Abderrahman
    Kane, Bamba
    Guinaudeau, Ophelie
    Chiesa, Valeria
    Quenel, Ilhem
    Chau, Stephane
    ICAART: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 1, 2021, : 519 - 525
  • [6] Efficient utilization of pre-trained models: A review of sentiment analysis via prompt learning
    Bu, Kun
    Liu, Yuanchao
    Ju, Xiaolong
    KNOWLEDGE-BASED SYSTEMS, 2024, 283
  • [7] A Study of Vietnamese Sentiment Classification with Ensemble Pre-trained Language Models
    Thin, Dang Van
    Hao, Duong Ngoc
    Nguyen, Ngan Luu-Thuy
    VIETNAM JOURNAL OF COMPUTER SCIENCE, 2024, 11 (01) : 137 - 165
  • [8] Evaluating Pre-trained Word Embeddings and Neural Network Architectures for Sentiment Analysis in Spanish Financial Tweets
    Antonio Garcia-Diaz, Jose
    Apolinario-Arzube, Oscar
    Valencia-Garcia, Rafael
    ADVANCES IN COMPUTATIONAL INTELLIGENCE, MICAI 2020, PT II, 2020, 12469 : 167 - 178
  • [9] Accuracy of a pre-trained sentiment analysis (SA) classification model on tweets related to emergency response and early recovery assessment: the case of 2019 Albanian earthquake
    Contreras, Diana
    Wilkinson, Sean
    Alterman, Evangeline
    Hervas, Javier
    NATURAL HAZARDS, 2022, 113 (01) : 403 - 421
  • [10] Explainable Pre-Trained Language Models for Sentiment Analysis in Low-Resourced Languages
    Mabokela, Koena Ronny
    Primus, Mpho
    Celik, Turgay
    BIG DATA AND COGNITIVE COMPUTING, 2024, 8 (11)