Analysis of longitudinal social media for monitoring symptoms during a pandemic

Times cited: 0
Authors
Lin, Shixu [1]
Garay, Lucas [1]
Hua, Yining [2, 3, 4]
Guo, Zhijiang [5]
Li, Wanxin [1]
Li, Minghui [1]
Zhang, Yujie [1]
Xu, Xiaolin [1]
Yang, Jie [1, 6, 7]
Affiliations
[1] Zhejiang Univ, Sch Med, Sch Publ Hlth, Hangzhou 310058, Peoples R China
[2] Harvard TH Chan Sch Publ Hlth, Dept Epidemiol, Boston, MA 02115 USA
[3] Harvard Med Sch, Dept Biomed Informat, Boston, MA 02115 USA
[4] Brigham & Womens Hosp, Div Gen Internal Med, Boston, MA 02115 USA
[5] Univ Cambridge, Dept Comp Sci & Technol, Cambridge, England
[6] Brigham & Womens Hosp, Dept Med, Boston, MA 02115 USA
[7] Harvard Med Sch, Boston, MA 02115 USA
Keywords
Natural language processing; Deep learning; Social media; Public health; COVID-19; Symptom surveillance; ASSOCIATION; INFECTION
DOI
10.1016/j.jbi.2025.104778
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Objective: Current studies that use social media data for disease monitoring face challenges such as noisy colloquial language and insufficient tracking of user disease progression in longitudinal data settings. This study aims to develop a pipeline for collecting, cleaning, and analyzing large-scale longitudinal social media data for disease monitoring, with a focus on the COVID-19 pandemic.
Materials and methods: The pipeline begins by screening COVID-19 cases from tweets posted between February 1, 2020, and April 30, 2022. Longitudinal data are collected for each patient, covering two months before and three months after self-reporting. Symptoms are extracted using Named Entity Recognition (NER), followed by denoising with a combination of a Graph Convolutional Network (GCN) and a Bidirectional Encoder Representations from Transformers (BERT) model to retain only User-experienced Symptom Mentions (USM). Symptoms are then mapped to standardized medical concepts using the Unified Medical Language System (UMLS). Finally, symptom pattern analysis and visualization illustrate temporal changes in symptom prevalence and co-occurrence.
Results: This study identified 191,096 self-reported COVID-19-positive cases from COVID-19-related tweets and retrospectively collected 811,398,280 historical tweets, of which 2,120,964 contained symptom information. After denoising, 39% (832,287) of symptom-sharing tweets reflected user-experienced mentions. The trained USM model achieved an average F1 score of 0.927. Further analysis revealed a higher prevalence of upper respiratory tract symptoms during the Omicron period than during the Delta and Wild-type periods. Additionally, there was a pronounced co-occurrence of lower respiratory tract and nervous system symptoms in the Wild-type strain and Delta variant.
Conclusion: This study established a robust framework for analyzing longitudinal social media data to monitor symptoms during a pandemic. By integrating denoising of user-experienced symptom mentions, the findings reveal the duration of different symptoms over time and by variant within a cohort of nearly 200,000 patients, providing critical insights into symptom trends that are often difficult to capture through traditional data sources.
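The final analysis step described in the abstract (per-user observation window, then symptom prevalence and co-occurrence) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the toy data, the approximation of a month as 30 days, and the choice to count co-occurrence at the user level are all assumptions made for illustration. Symptom labels stand in for concepts that have already been extracted, denoised, and mapped to UMLS.

```python
from collections import Counter
from datetime import date, timedelta
from itertools import combinations

# Assumption: "two months before" and "three months after" are approximated
# as 60 and 90 days; the study may use calendar months instead.
DAYS_BEFORE = 60
DAYS_AFTER = 90


def in_window(tweet_date, report_date):
    """True if the tweet falls inside the observation window around self-report."""
    start = report_date - timedelta(days=DAYS_BEFORE)
    end = report_date + timedelta(days=DAYS_AFTER)
    return start <= tweet_date <= end


def symptom_stats(users):
    """Compute user-level prevalence and pairwise co-occurrence counts.

    users maps a user id to (report_date, [(tweet_date, {symptom, ...}), ...]).
    """
    prevalence = Counter()
    cooccurrence = Counter()
    for report_date, tweets in users.values():
        seen = set()
        for tweet_date, symptoms in tweets:
            if in_window(tweet_date, report_date):
                seen |= set(symptoms)
        prevalence.update(seen)  # each user counts once per symptom
        cooccurrence.update(combinations(sorted(seen), 2))
    n_users = len(users)
    return {s: c / n_users for s, c in prevalence.items()}, cooccurrence


# Hypothetical toy cohort, not data from the study.
users = {
    "u1": (date(2021, 1, 10), [
        (date(2021, 1, 8), {"cough", "fever"}),
        (date(2021, 6, 1), {"headache"}),  # outside the window, ignored
    ]),
    "u2": (date(2022, 1, 5), [
        (date(2022, 1, 6), {"cough"}),
        (date(2022, 1, 20), {"sore throat"}),
    ]),
}

prevalence, pairs = symptom_stats(users)
print(prevalence)
print(pairs)
```

Counting each user once per symptom keeps frequent posters from dominating the prevalence estimate; a tweet-level count would be an equally plausible choice under different assumptions.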
Pages: 11