Trend and Co-occurrence Network of COVID-19 Symptoms From Large-Scale Social Media Data: Infoveillance Study

Cited by: 6
Authors
Wu, Jiageng [1, 2, 3]
Wang, Lumin [1, 2, 3]
Hua, Yining [4, 5]
Li, Minghui [1, 2, 3]
Zhou, Li [4, 5]
Bates, David W. [4, 5]
Yang, Jie [1, 2, 3]
Affiliations
[1] Zhejiang Univ, Sch Publ Hlth, Sch Med, 866 Yuhangtang Rd, Hangzhou 310058, Peoples R China
[2] Zhejiang Univ, Affiliated Hosp 2, Sch Med, 866 Yuhangtang Rd, Hangzhou 310058, Peoples R China
[3] Key Lab Intelligent Prevent Med Zhejiang Prov, Hangzhou, Peoples R China
[4] Harvard Med Sch, Dept Biomed Informat, Boston, MA USA
[5] Brigham & Womens Hosp, Div Gen Internal Med & Primary Care, Boston, MA USA
Funding
National Institutes of Health (USA);
Keywords
social media; network analysis; public health; data mining; COVID-19;
DOI
10.2196/45419
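Chinese Library Classification number
R19 [Health care organizations and services (health services administration)];
Discipline classification code
Abstract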
Background: For an emerging pandemic such as COVID-19, symptom statistics based on hospital data may be biased or delayed because a high proportion of asymptomatic and mild infections are never recorded in hospitals. The difficulty of accessing large-scale clinical data also prevents many researchers from conducting timely research.
Objective: Given the wide coverage and timeliness of social media, this study aimed to present an efficient workflow for tracking and visualizing the dynamic characteristics and co-occurrence of COVID-19 symptoms from large-scale, long-term social media data.
Methods: This retrospective study included 471,553,966 COVID-19-related tweets posted from February 1, 2020, to April 30, 2022. We curated a hierarchical symptom lexicon for social media containing 10 affected organs/systems, 257 symptoms, and 1808 synonyms. The dynamic characteristics of COVID-19 symptoms over time were analyzed from the perspectives of weekly new cases, overall distribution, and temporal prevalence of reported symptoms. Symptom evolution between virus strains (Delta and Omicron) was investigated by comparing symptom prevalence during their dominant periods. A symptom co-occurrence network was built and visualized to investigate relationships among symptoms and affected body systems.
Results: This study identified 201 COVID-19 symptoms and grouped them into 10 affected body systems. The weekly number of self-reported symptoms correlated significantly with the number of new COVID-19 infections (Pearson correlation coefficient=0.8528; P<.001). We also observed a 1-week leading trend between them (Pearson correlation coefficient=0.8802; P<.001). The frequency of symptoms changed as the pandemic progressed, from typical respiratory symptoms in the early stage to more musculoskeletal and nervous system symptoms in the later stages. Symptoms also differed between the Delta and Omicron periods: the Omicron period had fewer severe symptoms (coma and dyspnea), more flu-like symptoms (throat pain and nasal congestion), and fewer typical COVID-19 symptoms (anosmia and altered taste) than the Delta period (all P<.001). Network analysis revealed co-occurrences among symptoms and systems corresponding to specific disease progressions, including palpitations (cardiovascular) with dyspnea (respiratory), and alopecia (musculoskeletal) with impotence (reproductive).
Conclusions: This study identified more and milder COVID-19 symptoms than clinical research has reported, and it characterized the dynamic evolution of symptoms based on more than 400 million tweets over 27 months. The symptom network revealed potential comorbidity risks and prognostic disease progression. These findings demonstrate that combining social media with a well-designed workflow can depict a holistic picture of pandemic symptoms to complement clinical studies.
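The abstract's lexicon matching, co-occurrence counting, and lagged correlation can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The toy `LEXICON`, the plain substring matching, and the function names `extract_symptoms`, `cooccurrence_counts`, and `lagged_pearson` are all assumptions made for illustration, and the lag direction (symptom reports preceding case counts) is one reading of the "1-week leading trend".

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical toy lexicon mapping synonym -> canonical symptom.
# The study's lexicon (257 symptoms, 1808 synonyms) is not reproduced here.
LEXICON = {
    "short of breath": "dyspnea",
    "dyspnea": "dyspnea",
    "sore throat": "throat pain",
    "throat pain": "throat pain",
    "stuffy nose": "nasal congestion",
    "lost my sense of smell": "anosmia",
}


def extract_symptoms(text):
    """Return canonical symptoms whose synonyms occur in the text.

    Assumption: case-insensitive substring matching, which is far
    cruder than a real symptom-extraction step.
    """
    lowered = text.lower()
    return {symptom for synonym, symptom in LEXICON.items() if synonym in lowered}


def cooccurrence_counts(texts):
    """Count how often each unordered symptom pair shares a text.

    The resulting counts can serve as edge weights of a co-occurrence network.
    """
    counts = Counter()
    for text in texts:
        symptoms = sorted(extract_symptoms(text))
        counts.update(combinations(symptoms, 2))
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def lagged_pearson(symptom_counts, case_counts, lead_weeks):
    """Correlate symptom counts in week t with case counts in week t + lead_weeks."""
    if lead_weeks == 0:
        return pearson(symptom_counts, case_counts)
    return pearson(symptom_counts[:-lead_weeks], case_counts[lead_weeks:])


if __name__ == "__main__":
    tweets = [
        "Sore throat and stuffy nose today",
        "Short of breath, sore throat",
        "Feeling fine",
    ]
    print(cooccurrence_counts(tweets))
    # Made-up weekly series in which cases trail symptom reports by one week.
    print(lagged_pearson([1, 3, 2, 5, 4, 6], [7, 1, 3, 2, 5, 4], lead_weeks=1))
```

Comparing `lagged_pearson` at `lead_weeks=0` and `lead_weeks=1` mirrors the abstract's comparison of the same-week and 1-week-lead correlations, though the figures reported there come from the study's own data and methods.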
Pages: 16
Related papers
50 in total
  • [21] Insights from a Large-Scale Discussion on COVID-19 in Collective Intelligence
    Haqbeen, Jawad
    Ito, Takayuki
    Sahab, Sofia
    Sato, Takumi
    Okuhara, Shun
    Hofiani, Murtaza
    2020 IEEE/WIC/ACM INTERNATIONAL JOINT CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2020), 2020 : 546 - 553
  • [22] Multilevel Deep-Aggregated Boosted Network to Recognize COVID-19 Infection from Large-Scale Heterogeneous Radiographic Data
    Owais, Muhammad
    Lee, Young Won
    Mahmood, Tahir
    Haider, Adnan
    Sultan, Haseeb
    Park, Kang Ryoung
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2021, 25 (06) : 1881 - 1891
  • [23] Physical Distancing and Social Media Use in Emerging Adults and Adults During the COVID-19 Pandemic: Large-scale Cross-sectional and Longitudinal Survey Study
    Woudenberg, Thabo van
    Buijzen, Moniek
    Hendrikx, Roy
    Weert, Julia van
    Putte, Bas van den
    Kroese, Floor
    Bouman, Martine
    Bruin, Marijn de
    Lambooij, Mattijs
    JMIR INFODEMIOLOGY, 2022, 2 (02):
  • [24] Social network analysis of Twitter data from Pakistan during COVID-19
    Batool, Syeda Hina
    Ahmed, Wasim
    Mahmood, Khalid
    Sharif, Ashral
    INFORMATION DISCOVERY AND DELIVERY, 2022, 50 (04) : 353 - 364
  • [25] The Impact of Large-scale Social Restrictions on the Incidence of COVID-19 : A Case Study of Four Provinces in Indonesia
    Suraya, Izza
    Nurmansyah, Mochamad Iqbal
    Rachmawati, Emma
    Al Aufa, Badra
    Koire, Ibrahim Isa
    KESMAS-NATIONAL PUBLIC HEALTH JOURNAL, 2020, 15 (02) : 49 - 53
  • [26] Using Co-occurrence Analysis to Expand Consumer Health Vocabularies from Social Media Data
    Jiang, Ling
    Yang, Christopher C.
    2013 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2013), 2013 : 74 - 81
  • [27] The social life of COVID-19: Early insights from social media monitoring data collected in Poland
    Burzynska, Joanna
    Bartosiewicz, Anna
    Rekas, Magdalena
    HEALTH INFORMATICS JOURNAL, 2020, 26 (04) : 3056 - 3065
  • [28] COS2: Detecting Large-Scale COVID-19 Misinformation in Social Networks
    Xu, Hailu
    Curci, Macro
    Ek, Sophanna
    Liu, Pinchao
    Li, Zhengxiong
    Xu, Shuai
    CLOUD COMPUTING, CLOUD 2021, 2022, 12989 : 97 - 104
  • [29] A Deep Language Model for Symptom Extraction From Clinical Text and its Application to Extract COVID-19 Symptoms From Social Media
    Luo, Xiao
    Gandhi, Priyanka
    Storey, Susan
    Huang, Kun
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2022, 26 (04) : 1737 - 1748
  • [30] Constructing co-occurrence network embeddings to assist association extraction for COVID-19 and other coronavirus infectious diseases
    Oniani, David
    Jiang, Guoqian
    Liu, Hongfang
    Shen, Feichen
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2020, 27 (08) : 1259 - 1267