Constructing co-occurrence network embeddings to assist association extraction for COVID-19 and other coronavirus infectious diseases

Cited by: 16
Authors
Oniani, David [1]
Jiang, Guoqian [2]
Liu, Hongfang [2]
Shen, Feichen [2]
Affiliations
[1] Mayo Clin, Kern Ctr Sci Hlth Care Delivery, Rochester, MN 55901 USA
[2] Mayo Clin, Div Digital Hlth Sci, Rochester, MN 55901 USA
Keywords
COVID-19; coronavirus infectious diseases; co-occurrence network embeddings; association extraction; SARS-CoV
DOI
10.1093/jamia/ocaa117
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: As coronavirus disease 2019 (COVID-19) emerged rapidly and grew into an unprecedented pandemic, a knowledge repository for the disease became crucial. To address this need, a new machine-readable COVID-19 dataset, the COVID-19 Open Research Dataset (CORD-19), was released. Building on it, our objective was to construct computable co-occurrence network embeddings to assist association detection among COVID-19-related biomedical entities.
Materials and Methods: Leveraging a Linked Data version of CORD-19 (ie, CORD-19-on-FHIR), we first used SPARQL to extract co-occurrences among chemicals, diseases, genes, and mutations and built a co-occurrence network. We then trained a representation of the derived co-occurrence network using node2vec with 4 edge embedding operations (L1, L2, Average, and Hadamard). Six algorithms (decision tree, logistic regression, support vector machine, random forest, naive Bayes, and multilayer perceptron) were applied to evaluate performance on link prediction. An unsupervised learning strategy incorporating the t-SNE (t-distributed stochastic neighbor embedding) and DBSCAN (density-based spatial clustering of applications with noise) algorithms was also developed for case studies.
Results: The random forest classifier showed the best performance on link prediction across the different network embeddings. For edge embeddings generated using the Average operation, random forest achieved the best average precision of 0.97 along with an F1 score of 0.90. For unsupervised learning, 63 clusters were formed with a silhouette score of 0.128. Significant associations were detected for 5 coronavirus infectious diseases in their corresponding subgroups.
Conclusions: In this study, we constructed COVID-19-centered co-occurrence network embeddings. Results indicated that the generated embeddings were able to extract significant associations for COVID-19 and coronavirus infectious diseases.
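The 4 edge embedding operations named in the abstract (L1, L2, Average, and Hadamard) are commonly defined as element-wise binary operators over the two endpoint node vectors. The following is a minimal sketch under that assumption, not the authors' implementation; the function names, the node labels `disease_x` and `gene_y`, and the 3-dimensional toy vectors are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Vector = List[float]


def average(u: Vector, v: Vector) -> Vector:
    """Average operator: element-wise mean of the two node vectors."""
    return [(a + b) / 2.0 for a, b in zip(u, v)]


def hadamard(u: Vector, v: Vector) -> Vector:
    """Hadamard operator: element-wise product of the two node vectors."""
    return [a * b for a, b in zip(u, v)]


def weighted_l1(u: Vector, v: Vector) -> Vector:
    """L1 operator: element-wise absolute difference."""
    return [abs(a - b) for a, b in zip(u, v)]


def weighted_l2(u: Vector, v: Vector) -> Vector:
    """L2 operator: element-wise squared difference."""
    return [(a - b) ** 2 for a, b in zip(u, v)]


# Hypothetical registry mapping operator names to functions.
EDGE_OPERATORS: Dict[str, Callable[[Vector, Vector], Vector]] = {
    "Average": average,
    "Hadamard": hadamard,
    "L1": weighted_l1,
    "L2": weighted_l2,
}


def edge_embedding(
    node_vectors: Dict[str, Vector], source: str, target: str, operator: str
) -> Vector:
    """Combine two node embeddings into one edge feature vector."""
    u, v = node_vectors[source], node_vectors[target]
    if len(u) != len(v):
        raise ValueError("node vectors must have the same dimension")
    return EDGE_OPERATORS[operator](u, v)


# Toy node embeddings (hypothetical values, not taken from the paper).
toy_vectors = {
    "disease_x": [0.5, -1.0, 2.0],
    "gene_y": [1.5, 1.0, -2.0],
}

for name in EDGE_OPERATORS:
    print(name, edge_embedding(toy_vectors, "disease_x", "gene_y", name))
```

In a link prediction setup such as the one the abstract describes, each resulting edge vector would serve as the feature input to a classifier (for example a random forest), with existing and sampled non-existing edges as positive and negative examples.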
Pages: 1259-1267
Number of pages: 9
Related papers
50 in total
  • [31] PREDICTORS OF CORONAVIRUS DISEASE 2019 (COVID-19) AMONG PEDIATRIC PATIENTS AT A NATIONAL INFECTIOUS DISEASES HOSPITAL, THAILAND
    Srijareonvijit, Chaisiri
    Manosuthi, Weerawat
    [J]. SOUTHEAST ASIAN JOURNAL OF TROPICAL MEDICINE AND PUBLIC HEALTH, 2021, 52 (05) : 610 - 629
  • [32] The Infectious Diseases Society of America Guidelines on the Diagnosis of Coronavirus Disease 2019 (COVID-19): Molecular Diagnostic Testing
    Hayden, Mary K.
    Hanson, Kimberly E.
    Englund, Janet A.
    Lee, Mark J.
    Loeb, Mark
    Lee, Francesca
    Morgan, Daniel J.
    Patel, Robin
    El Mikati, Ibrahim K.
    Iqneibi, Shahad
    Alabed, Farouk
    Amarin, Justin Z.
    Mansour, Razan
    Patel, Payal
    Falck-Ytter, Yngve
    Morgan, Rebecca L.
    Murad, M. Hassan
    Sultan, Shahnaz
    Bhimraj, Adarsh
    Mustafa, Reem A.
    [J]. CLINICAL INFECTIOUS DISEASES, 2023, 78 (07) : e385 - e415
  • [33] Exploring diet associations with Covid-19 and other diseases: a Network Analysis–based approach
    Rashmeet Toor
    Inderveer Chana
    [J]. Medical & Biological Engineering & Computing, 2022, 60 : 991 - 1013
  • [34] Public Health Measures During the COVID-19 Pandemic Reduce the Spread of Other Respiratory Infectious Diseases
    Hu, Cheng-yi
    Tang, Yu-wen
    Su, Qi-min
    Lei, Yi
    Cui, Wen-shuai
    Zhang, Yan-yan
    Zhou, Yan
    Li, Xin-yan
    Wang, Zhong-fang
    Zhao, Zhu-xiang
    [J]. FRONTIERS IN PUBLIC HEALTH, 2021, 9
  • [35] Evaluation of joint external evaluation to COVID-19 and other infectious diseases mortality outcomes in 96 countries
    Lee, Yuri
    Kim, Siwoo
    Lee, Sieun
    Kim, Min Kyung
    Gostin, Lawrence O.
    Oh, Juhwan
    [J]. INTERNATIONAL HEALTH, 2024
  • [36] Editorial: Modeling of COVID-19 and other infectious diseases: Mathematical, statistical and biophysical analysis of spread patterns
    El Deeb, Omar
    Hattaf, Khalid
    Kharroubi, Samer A.
    [J]. FRONTIERS IN APPLIED MATHEMATICS AND STATISTICS, 2023, 9
  • [37] Influence of the COVID-19 pandemic measures on incidence and representation of other infectious diseases in Germany: a lesson to be learnt
    Martin Kaatz
    Steffen Springer
    Michael Zieger
    [J]. Journal of Public Health, 2023, 31 : 1673 - 1680
  • [38] Influence of the COVID-19 pandemic measures on incidence and representation of other infectious diseases in Germany: a lesson to be learnt
    Kaatz, Martin
    Springer, Steffen
    Zieger, Michael
    [J]. JOURNAL OF PUBLIC HEALTH-HEIDELBERG, 2023, 31 (10) : 1673 - 1680
  • [39] COVID-19 preventive measures coincided with a marked decline in other infectious diseases in Denmark, spring 2020
    Nielsen, Rikke Thoft
    Dalby, Tine
    Emborg, Hanne-Dorthe
    Larsen, Anders Rhod
    Petersen, Andreas
    Torpdahl, Mia
    Hoffmann, Steen
    Vestergaard, Lasse Skafte
    Valentiner-Branth, Palle
    [J]. EPIDEMIOLOGY AND INFECTION, 2022, 150
  • [40] Assessment of serum ferritin as a biomarker in COVID-19: bystander or participant? Insights by comparison with other infectious and non-infectious diseases
    Kappert, Kai
    Jahic, Amir
    Tauber, Rudolf
    [J]. BIOMARKERS, 2020, 25 (08) : 616 - 625