Analysis of longitudinal social media for monitoring symptoms during a pandemic

Cited by: 0
Authors
Lin, Shixu [1]
Garay, Lucas [1]
Hua, Yining [2, 3, 4]
Guo, Zhijiang [5]
Li, Wanxin [1]
Li, Minghui [1]
Zhang, Yujie [1]
Xu, Xiaolin [1]
Yang, Jie [1, 6, 7]
Affiliations
[1] Zhejiang Univ, Sch Med, Sch Publ Hlth, Hangzhou 310058, Peoples R China
[2] Harvard TH Chan Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[3] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA
[4] Brigham & Womens Hosp, Div Gen Internal Med, Boston, MA 02115 USA
[5] Univ Cambridge, Dept Comp Sci & Technol, Cambridge, England
[6] Brigham & Womens Hosp, Dept Med, Boston, MA 02115 USA
[7] Harvard Med Sch, Boston, MA 02115 USA
Keywords
Natural language processing; Deep learning; Social media; Public health; COVID-19; Symptom surveillance; ASSOCIATION; INFECTION
DOI
10.1016/j.jbi.2025.104778
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification code
081203; 0835
Abstract
Objective: Current studies that leverage social media data for disease monitoring face challenges such as noisy colloquial language and insufficient tracking of users' disease progression in longitudinal data settings. This study aims to develop a pipeline for collecting, cleaning, and analyzing large-scale longitudinal social media data for disease monitoring, with a focus on the COVID-19 pandemic.
Materials and methods: The pipeline begins by screening COVID-19 cases from tweets posted between February 1, 2020, and April 30, 2022. For each patient, longitudinal data are collected from two months before to three months after the self-report. Symptoms are extracted using Named Entity Recognition (NER), followed by denoising with a combination of a Graph Convolutional Network (GCN) and a Bidirectional Encoder Representations from Transformers (BERT) model to retain only User-experienced Symptom Mentions (USM). Subsequently, symptoms are mapped to standardized medical concepts using the Unified Medical Language System (UMLS). Finally, the study conducts symptom pattern analysis and visualization to illustrate temporal changes in symptom prevalence and co-occurrence.
Results: This study identified 191,096 self-reported COVID-19-positive cases from COVID-19-related tweets and retrospectively collected 811,398,280 historical tweets, of which 2,120,964 contained symptom information. After denoising, 39% (832,287) of symptom-sharing tweets reflected user-experienced mentions. The trained USM model achieved an average F1 score of 0.927. Further analysis revealed a higher prevalence of upper respiratory tract symptoms during the Omicron period than during the Delta and Wild-type periods. Additionally, there was a pronounced co-occurrence of lower respiratory tract and nervous system symptoms in the Wild-type strain and Delta variant.
Conclusion: This study established a robust framework for analyzing longitudinal social media data to monitor symptoms during a pandemic. By integrating denoising of user-experienced symptom mentions, our findings reveal the duration of different symptoms over time and by variant within a cohort of nearly 200,000 patients, providing critical insights into symptom trends that are often difficult to capture through traditional data sources.
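The longitudinal collection step described in the abstract (two months before and three months after each self-report) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how months are counted or whether the boundaries are inclusive, so the sketch assumes whole calendar months with the day clamped to the end of shorter months, and inclusive bounds. The names `shift_months` and `tweets_in_window` are hypothetical.

```python
from calendar import monthrange
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months (assumption: clamp the day
    to the last day of the target month when it would overflow)."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(d.day, monthrange(year, month)[1])
    return date(year, month, day)


def tweets_in_window(tweets, report_date, before=2, after=3):
    """Keep (date, text) pairs that fall inside the observation window
    around a self-report date (assumption: both bounds inclusive)."""
    start = shift_months(report_date, -before)
    end = shift_months(report_date, after)
    return [(d, text) for d, text in tweets if start <= d <= end]


# Hypothetical example data, not taken from the study.
sample = [
    (date(2020, 12, 31), "too early"),
    (date(2021, 1, 15), "mild cough"),
    (date(2021, 3, 15), "tested positive"),
    (date(2021, 6, 15), "still tired"),
    (date(2021, 6, 16), "too late"),
]
kept = tweets_in_window(sample, report_date=date(2021, 3, 15))
```

With a self-report on March 15, 2021, the assumed window runs from January 15, 2021, through June 15, 2021, so the first and last sample tweets are dropped.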
Pages: 11