Automated domain-specific healthcare knowledge graph curation framework: Subarachnoid hemorrhage as phenotype

Cited by: 36
Authors
Malik, Khalid Mahmood [1]
Krishnamurthy, Madan [1]
Alobaidi, Mazen [1]
Hussain, Maqbool [2]
Alam, Fakhare [1]
Malik, Ghaus [3]
Affiliations
[1] Oakland Univ, Dept Comp Sci & Engn, 115 Lib Dr, Rochester, MI 48309 USA
[2] Sejong Univ, Dept Software, Seoul, South Korea
[3] Henry Ford Hosp, Dept Neurosurg, 2799 West Grand Blvd, Detroit, MI 48202 USA
Keywords
Knowledge Graph; Ontology; Electronic Health Records; Intracranial Aneurysm; Association Rules; Ensemble Learning; Subarachnoid Hemorrhage Stroke; BASE;
DOI
10.1016/j.eswa.2019.113120
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
To derive meaningful insights from voluminous healthcare data, it is essential to convert the data into machine-understandable knowledge. Currently, no machine-understandable, domain-specific healthcare knowledge curation framework exists for complex neurological diseases such as subarachnoid hemorrhage stroke. We envisage that future clinical decision support systems and tools backed by such knowledge will aid in the prognosis, diagnosis, and treatment of complex neurological diseases. Existing knowledge graphs (KGs) contain only concepts and the relationships between them, and offer this knowledge to information extraction and knowledge management applications. In contrast, the proposed domain-specific automated KG curation framework enables extraction of concepts, relationships, individual and cohort graphs, and predictive knowledge. By applying ontology-based information extraction, ensemble learning, and skip-gram-based word embedding to structured and unstructured data from the electronic health records of 1025 patients with an intracranial aneurysm, this paper proposes a novel, fully automated framework to curate a knowledge graph consisting of concepts, various hierarchical and non-hierarchical relationships, and predictive rules for the prediction of subarachnoid hemorrhage. The evaluation shows that the proposed framework achieves 78% precision and 71% recall for concept extraction from clinical text. Evaluation of taxonomic relationships yielded precision and recall of 68% and 95%, respectively. Evaluation of the knowledge for predicting unruptured status on a validation dataset shows accuracy, precision, and recall of 73%, 76%, and 90%, respectively. (C) 2019 Elsevier Ltd. All rights reserved.
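For readers less familiar with the metrics reported in the abstract, the following is a minimal sketch of how precision, recall, and accuracy are conventionally computed from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the authors' own evaluation procedure may differ.

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. Empty denominators yield 0.0 rather than an error.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return precision, recall, accuracy


# Hypothetical counts for illustration only; not data from the paper.
p, r, a = precision_recall_accuracy(tp=8, fp=2, fn=2, tn=8)
print(f"precision={p:.2f} recall={r:.2f} accuracy={a:.2f}")
```

High recall with lower precision, as in the taxonomic relationship results quoted above, corresponds to few false negatives but comparatively many false positives.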
Pages: 15