Analysis of longitudinal social media for monitoring symptoms during a pandemic

Authors
Lin, Shixu [1]
Garay, Lucas [1]
Hua, Yining [2,3,4]
Guo, Zhijiang [5]
Li, Wanxin [1]
Li, Minghui [1]
Zhang, Yujie [1]
Xu, Xiaolin [1]
Yang, Jie [1,6,7]
Affiliations
[1] Zhejiang Univ, Sch Med, Sch Publ Hlth, Hangzhou 310058, Peoples R China
[2] Harvard TH Chan Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[3] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA
[4] Brigham & Womens Hosp, Div Gen Internal Med, Boston, MA 02115 USA
[5] Univ Cambridge, Dept Comp Sci & Technol, Cambridge, England
[6] Brigham & Womens Hosp, Dept Med, Boston, MA 02115 USA
[7] Harvard Med Sch, Boston, MA 02115 USA
Keywords
Natural language processing; Deep learning; Social media; Public health; COVID-19; Symptom surveillance; ASSOCIATION; INFECTION
DOI
10.1016/j.jbi.2025.104778
Chinese Library Classification
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Objective: Studies that use social media data for disease monitoring face challenges such as noisy colloquial language and insufficient tracking of users' disease progression in longitudinal data. This study aims to develop a pipeline for collecting, cleaning, and analyzing large-scale longitudinal social media data for disease monitoring, with a focus on the COVID-19 pandemic.
Materials and methods: The pipeline begins by screening COVID-19 cases from tweets spanning February 1, 2020, to April 30, 2022. Longitudinal data are collected for each patient, covering two months before and three months after self-reporting. Symptoms are extracted using Named Entity Recognition (NER), followed by denoising with a combination of a Graph Convolutional Network (GCN) and a Bidirectional Encoder Representations from Transformers (BERT) model to retain only User-experienced Symptom Mentions (USM). Symptoms are then mapped to standardized medical concepts using the Unified Medical Language System (UMLS). Finally, symptom pattern analysis and visualization illustrate temporal changes in symptom prevalence and co-occurrence.
Results: This study identified 191,096 self-reported COVID-19-positive cases from COVID-19-related tweets and retrospectively collected 811,398,280 historical tweets, of which 2,120,964 contained symptom information. After denoising, 39% (832,287) of symptom-sharing tweets reflected user-experienced mentions. The trained USM model achieved an average F1 score of 0.927. Further analysis revealed a higher prevalence of upper respiratory tract symptoms during the Omicron period than during the Delta and Wild-type periods. In addition, lower respiratory tract and nervous system symptoms co-occurred markedly in the Wild-type and Delta periods.
Conclusion: This study established a robust framework for analyzing longitudinal social media data to monitor symptoms during a pandemic. By integrating denoising of user-experienced symptom mentions, the findings reveal the duration of different symptoms over time and by variant within a cohort of nearly 200,000 patients, providing critical insights into symptom trends that are often difficult to capture through traditional data sources.
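The last two steps of the pipeline described in the abstract (restricting each user's posts to a window around the self-report date, then counting symptom prevalence and co-occurrence) can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function names, the input layout, and the approximation of "two months before and three months after" as 60 and 90 days are all assumptions, and the symptom sets are assumed to be already denoised and mapped to standardized concepts.

```python
from collections import Counter
from datetime import date, timedelta
from itertools import combinations

# Assumption: the abstract's "two months before / three months after"
# is approximated here as 60 and 90 days.
DAYS_BEFORE = 60
DAYS_AFTER = 90


def in_window(post_date, report_date,
              days_before=DAYS_BEFORE, days_after=DAYS_AFTER):
    """Return True if post_date lies inside the window around report_date."""
    start = report_date - timedelta(days=days_before)
    end = report_date + timedelta(days=days_after)
    return start <= post_date <= end


def symptom_stats(posts, report_dates):
    """Count per-user symptom prevalence and pairwise co-occurrence.

    posts: iterable of (user_id, post_date, set_of_symptom_concepts)
    report_dates: dict mapping user_id to self-report date
    (both layouts are hypothetical, chosen for illustration).
    """
    per_user = {}
    for user_id, post_date, symptoms in posts:
        report = report_dates.get(user_id)
        # Skip users without a self-report date and posts outside the window.
        if report is None or not in_window(post_date, report):
            continue
        per_user.setdefault(user_id, set()).update(symptoms)

    prevalence = Counter()  # number of users reporting each symptom
    pairs = Counter()       # number of users reporting both symptoms of a pair
    for symptoms in per_user.values():
        prevalence.update(symptoms)
        pairs.update(combinations(sorted(symptoms), 2))
    return prevalence, pairs


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    report_dates = {"u1": date(2021, 1, 15), "u2": date(2021, 6, 1)}
    posts = [
        ("u1", date(2021, 1, 10), {"cough", "fever"}),
        ("u1", date(2021, 2, 1), {"fatigue"}),
        ("u1", date(2020, 9, 1), {"headache"}),  # outside the window
        ("u2", date(2021, 6, 3), {"cough", "fatigue"}),
    ]
    prevalence, pairs = symptom_stats(posts, report_dates)
    print(prevalence.most_common())
    print(pairs.most_common())
```

Counting each symptom once per user, rather than once per tweet, keeps frequent posters from dominating the totals; a per-tweet or per-week count would be an equally plausible design, depending on the analysis.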
Pages: 11