Improving graph embeddings via entity linking: A case study on Italian clinical notes

Cited by: 4
Authors
D'Auria, Daniela [1]
Moscato, Vincenzo [2]
Postiglione, Marco [2]
Romito, Giuseppe [2]
Sperli, Giancarlo [2]
Affiliations
[1] Free Univ Bozen Bolzano, Fac Comp Sci, I-39100 Bozen Bolzano, Italy
[2] Univ Naples Federico II, Dept Elect Engn & Informat Technol DIETI, Via Claudio 21, I-80125 Naples, Italy
Source
INTELLIGENT SYSTEMS WITH APPLICATIONS | 2023 / Volume 17
Keywords
Entity linking; Graph embedding; Link prediction; Health analytics; Healthcare
DOI
10.1016/j.iswa.2022.200161
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The ever-increasing availability of Electronic Health Records (EHRs) is the key enabling factor of precision medicine, which aims to provide therapies and diagnoses based not only on medical literature, but also on clinical experience and individual information of patients (e.g. genomics, lifestyle, health history). The unstructured nature of EHRs has posed several challenges for their effective analysis, and heterogeneous graphs are the most suitable solution to handle the heterogeneity of information contained in EHRs. However, while EHRs are an extremely valuable data source, information from current medical literature has yet to be considered in clinical decision support systems. In this work, we build a heterogeneous graph from Italian EHRs provided by the Hospital of Naples Federico II, and we define a methodological workflow allowing us to predict the presence of a link between patients and diagnosed diseases. We empirically demonstrate that linking concepts to biomedical ontologies (e.g. UMLS, DBpedia), which allow us to extract entities and relationships from medical literature, is significantly beneficial to our link-prediction workflow in terms of Area Under the ROC Curve (AUC) and Mean Reciprocal Rank (MRR).
Pages: 14
Related papers
66 in total
  • [1] Personalized Medicine and the Power of Electronic Health Records
    Abul-Husn, Noura S.
    Kenny, Eimear E.
    [J]. CELL, 2019, 177 (01) : 58 - 69
  • [2] [Anonymous], 2010, P 23 INT C COMP LING
  • [3] DBpedia: A nucleus for a web of open data
    Auer, Soeren
    Bizer, Christian
    Kobilarov, Georgi
    Lehmann, Jens
    Cyganiak, Richard
    Ives, Zachary
    [J]. SEMANTIC WEB, PROCEEDINGS, 2007, 4825 : 722 - +
  • [4] Predicting scientific research trends based on link prediction in keyword networks
    Behrouzi, Saman
    Sarmoor, Zahra Shafaeipour
    Hajsadeghi, Khosrow
    Kavousi, Kaveh
    [J]. JOURNAL OF INFORMETRICS, 2020, 14 (04)
  • [5] Bhowmik R., 2021, P 12 INT WORKSHOP HL, P28
  • [6] The Unified Medical Language System (UMLS): integrating biomedical terminology
    Bodenreider, O
    [J]. NUCLEIC ACIDS RESEARCH, 2004, 32 : D267 - D270
  • [7] Broscheit Samuel, 2019, P 23 C COMPUTATIONA, P677
  • [8] Mining Health Examination Records-A Graph-Based Approach
    Chen, Ling
    Li, Xue
    Sheng, Quan Z.
    Peng, Wen-Chih
    Bennett, John
    Hu, Hsiao-Yun
    Huang, Nicole
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2016, 28 (09) : 2423 - 2437
  • [9] Chen Z., 2011, Proc. of the 2011 Conf. on Empirical Methods in Natural Language Process, P771
  • [10] Chen Zheng, 2010, Theory and Applications of Categories