ADAPTATION OF DOMAIN-SPECIFIC TRANSFORMER MODELS WITH TEXT OVERSAMPLING FOR SENTIMENT ANALYSIS OF SOCIAL MEDIA POSTS ON COVID-19 VACCINE

Cited by: 1
Authors
Bansal, Anmol [1]
Choudhry, Arjun [1]
Sharma, Anubhav [1]
Susan, Seba [1]
Affiliations
[1] Delhi Technol Univ, New Delhi, India
Source
COMPUTER SCIENCE-AGH | 2023 / Vol. 24 / No. 02
Keywords
Covid-19; vaccine; transformer; Twitter; BERTweet; CT-BERT; BERT; XLNet; RoBERTa; text oversampling; LMOTE; class imbalance; small sample data set
DOI
10.7494/csci.2023.24.2.4761
Chinese Library Classification
TP301 [Theory, Methods];
Discipline Classification Code
081202;
Abstract
Covid-19 has spread across the world, and several vaccines have been developed to counter its surge. To identify the correct sentiments that are associated with the vaccines from social media posts, we fine-tune various state-of-the-art pre-trained transformer models on tweets that are associated with Covid-19 vaccines. Specifically, we use the recently introduced state-of-the-art RoBERTa, XLNet, and BERT pre-trained transformer models, and the domain-specific CT-BERT and BERTweet transformer models that have been pre-trained on Covid-19 tweets. We further explore the option of text augmentation by oversampling using the language model-based oversampling technique (LMOTE) to improve the accuracies of these models, specifically for small sample data sets where there is an imbalanced class distribution among the positive, negative, and neutral sentiment classes. Our results summarize our findings on the suitability of text oversampling for imbalanced small-sample data sets that are used to fine-tune state-of-the-art pre-trained transformer models as well as the utility of domain-specific transformer models for the classification task.
Pages: 167 - 186
Number of pages: 20
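
Illustrative sketch (not taken from the paper): the abstract describes oversampling minority sentiment classes before fine-tuning. The snippet below shows the general idea of class balancing using plain random oversampling with replacement. This is a deliberately simpler stand-in and is not LMOTE, which generates new text with a language model. The function name random_oversample and the toy tweets are hypothetical.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples until every class matches the largest one.

    Stand-in for illustration only: LMOTE synthesizes new text, whereas this
    function merely resamples existing examples with replacement.
    """
    rng = random.Random(seed)

    # Group the texts by their sentiment label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data with an imbalanced three-class distribution.
texts = [
    "got my first dose today",
    "vaccine site opens at nine",
    "second dose scheduled",
    "clinic moved to the gym",
    "so grateful for the vaccine",
    "the rollout has been great",
    "worried about side effects",
]
labels = ["neutral"] * 4 + ["positive"] * 2 + ["negative"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Resampling of this kind should be applied to the training split only, so that duplicated examples do not leak into validation or test data.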