Enhancing Machine Learning Predictions Through Knowledge Graph Embeddings

Cited by: 0

Authors:

Llugiqi, Majlinda [1]

Ekaputra, Fajar J. [1]

Sabou, Marta [1]

Affiliations:

[1] Vienna Univ Econ & Business, Vienna, Austria

Source:

NEURAL-SYMBOLIC LEARNING AND REASONING, PT I, NESY 2024 | 2024 / Vol. 14979

Keywords:

Neurosymbolic AI; Knowledge Graph Embeddings; Machine Learning; Data Augmentation; DISEASE PREDICTION;

DOI:

10.1007/978-3-031-71167-1_15

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Despite their widespread use, machine learning (ML) methods often exhibit sub-optimal performance. The accuracy of these models is primarily hindered by insufficient training data and poor data quality, with particularly severe consequences in critical areas such as medical diagnosis prediction. Our hypothesis is that enhancing ML pipelines with semantic information, such as that available in knowledge graphs (KGs), can address these challenges and improve ML prediction accuracy. To that end, we extend the state of the art through a novel approach that uses KG embeddings to augment tabular data in various ways within ML pipelines. Concretely, we introduce and examine several techniques for integrating KG embeddings, and we study the influence of KG characteristics on model performance, specifically accuracy and F2 scores. We evaluate our approach with four ML algorithms and two embedding techniques, applied to heart disease and chronic kidney disease prediction. Our results indicate consistent improvements in model performance across various ML models and tasks, thus confirming our hypothesis. For example, for heart disease prediction, the F2 score increased from 70% to 82.22% for KNN and from 74.53% to 81.71% for SVM.
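For readers unfamiliar with the F2 score reported in the abstract: it is the F-beta score with beta = 2, which weights recall more heavily than precision, a common choice when missed positive cases (for example, undiagnosed patients) are costlier than false alarms. The following is a minimal sketch of the standard formula, not the authors' evaluation code; the function name `fbeta_from_counts` and the confusion counts in the usage lines are hypothetical and chosen only for illustration.

```python
def fbeta_from_counts(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Standard F-beta score computed from confusion-matrix counts.

    tp, fp, fn: true positives, false positives, false negatives.
    beta > 1 weights recall more than precision; beta = 2 gives F2.
    """
    b2 = beta * beta
    # Equivalent to (1 + b2) * P * R / (b2 * P + R), written in terms of counts.
    denom = (1 + b2) * tp + b2 * fn + fp
    if denom == 0:
        return 0.0  # no positives predicted or present; score defined as 0 here
    return (1 + b2) * tp / denom


# Hypothetical counts (not from the paper): same number of total errors,
# but F2 penalises false negatives more than false positives.
more_false_alarms = fbeta_from_counts(tp=8, fp=4, fn=1)
more_missed_cases = fbeta_from_counts(tp=8, fp=1, fn=4)
print(more_false_alarms > more_missed_cases)
```

With the same five errors in each case, the classifier that misses more positive cases receives the lower F2, which is why the metric suits diagnosis-style tasks.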


Pages: 279-295

Number of pages: 17

