Constructing co-occurrence network embeddings to assist association extraction for COVID-19 and other coronavirus infectious diseases

Cited by: 16
Authors
Oniani, David [1]
Jiang, Guoqian [2]
Liu, Hongfang [2]
Shen, Feichen [2]
Affiliations
[1] Mayo Clin, Kern Ctr Sci Hlth Care Delivery, Rochester, MN 55901 USA
[2] Mayo Clin, Div Digital Hlth Sci, Rochester, MN 55901 USA
Keywords
COVID-19; coronavirus infectious diseases; co-occurrence network embeddings; association extraction; SARS-CoV
DOI
10.1093/jamia/ocaa117
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: As coronavirus disease 2019 (COVID-19) emerged rapidly and grew into an unprecedented pandemic, a knowledge repository for the disease became crucial. To address this need, a new machine-readable dataset, the COVID-19 Open Research Dataset (CORD-19), was released. Building on it, our objective was to construct computable co-occurrence network embeddings to assist association detection among COVID-19-related biomedical entities.
Materials and Methods: Leveraging a Linked Data version of CORD-19 (ie, CORD-19-on-FHIR), we first used SPARQL to extract co-occurrences among chemicals, diseases, genes, and mutations and to build a co-occurrence network. We then trained a representation of the derived co-occurrence network using node2vec with 4 edge embedding operations (L1, L2, Average, and Hadamard). Six algorithms (decision tree, logistic regression, support vector machine, random forest, naive Bayes, and multilayer perceptron) were applied to evaluate performance on link prediction. An unsupervised learning strategy incorporating the t-SNE (t-distributed stochastic neighbor embedding) and DBSCAN (density-based spatial clustering of applications with noise) algorithms was also developed for case studies.
Results: The random forest classifier showed the best performance on link prediction across the different network embeddings. For edge embeddings generated using the Average operation, random forest achieved the best average precision of 0.97, along with an F1 score of 0.90. For unsupervised learning, 63 clusters were formed with a silhouette score of 0.128. Significant associations were detected for 5 coronavirus infectious diseases in their corresponding subgroups.
Conclusions: In this study, we constructed COVID-19-centered co-occurrence network embeddings. The results indicate that the generated embeddings were able to extract significant associations for COVID-19 and other coronavirus infectious diseases.
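The four edge embedding operations named in the abstract (L1, L2, Average, and Hadamard) are the element-wise binary operators commonly used with node2vec to turn two node vectors into one edge vector for link prediction. The following is a minimal sketch of those operators in plain Python, not the authors' implementation: the function names, the `edge_embedding` helper, and the toy node vectors are assumptions introduced for illustration only.

```python
from typing import Callable, Dict, List

Vector = List[float]


def weighted_l1(u: Vector, v: Vector) -> Vector:
    """L1 operator: element-wise absolute difference |u_i - v_i|."""
    return [abs(a - b) for a, b in zip(u, v)]


def weighted_l2(u: Vector, v: Vector) -> Vector:
    """L2 operator: element-wise squared difference (u_i - v_i)^2."""
    return [(a - b) ** 2 for a, b in zip(u, v)]


def average(u: Vector, v: Vector) -> Vector:
    """Average operator: element-wise mean (u_i + v_i) / 2."""
    return [(a + b) / 2 for a, b in zip(u, v)]


def hadamard(u: Vector, v: Vector) -> Vector:
    """Hadamard operator: element-wise product u_i * v_i."""
    return [a * b for a, b in zip(u, v)]


EDGE_OPERATORS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "L1": weighted_l1,
    "L2": weighted_l2,
    "Average": average,
    "Hadamard": hadamard,
}


def edge_embedding(node_vectors: Dict[str, Vector],
                   source: str, target: str, operator: str) -> Vector:
    """Combine two node vectors into one edge vector (hypothetical helper)."""
    return EDGE_OPERATORS[operator](node_vectors[source], node_vectors[target])


# Toy 2-dimensional node vectors; real node2vec embeddings would be
# learned from the co-occurrence network and have many more dimensions.
toy_vectors = {"entity_a": [1.0, 2.0], "entity_b": [3.0, 4.0]}

for name in EDGE_OPERATORS:
    print(name, edge_embedding(toy_vectors, "entity_a", "entity_b", name))
```

Each resulting edge vector has the same dimensionality as the node vectors, so it can be passed directly as a feature vector to a classifier such as the random forest mentioned in the Results.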
Pages: 1259-1267
Number of pages: 9