Enhancing Machine Learning Predictions Through Knowledge Graph Embeddings

Citations: 1
Authors
Llugiqi, Majlinda [1]
Ekaputra, Fajar J. [1]
Sabou, Marta [1]
Affiliations
[1] Vienna University of Economics and Business, Vienna, Austria
Source
Neural-Symbolic Learning and Reasoning, Pt I, NeSy 2024 | 2024 / Vol. 14979
Keywords
Neurosymbolic AI; Knowledge Graph Embeddings; Machine Learning; Data Augmentation; Disease Prediction
DOI
10.1007/978-3-031-71167-1_15
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Despite their widespread use, machine learning (ML) methods often exhibit sub-optimal performance. The accuracy of these models is primarily hindered by insufficient training data and poor data quality, with particularly severe consequences in critical areas such as medical diagnosis prediction. Our hypothesis is that enhancing ML pipelines with semantic information, such as that available in knowledge graphs (KGs), can address these challenges and improve ML prediction accuracy. To that end, we extend the state of the art through a novel approach that uses KG embeddings to augment tabular data in several ways within ML pipelines. Concretely, we introduce and examine several techniques for integrating KG embeddings, and we study the influence of KG characteristics on model performance, specifically accuracy and F2 scores. We evaluate our approach with four ML algorithms and two embedding techniques, applied to heart disease and chronic kidney disease prediction. Our results indicate consistent improvements in model performance across various ML models and tasks, thus confirming our hypothesis. For example, for heart disease prediction, the F2 score increased from 70% to 82.22% for KNN and from 74.53% to 81.71% for SVM.
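The abstract names two ingredients that a small example may help make concrete: augmenting tabular rows with KG embeddings, and scoring with F2. The following is a minimal sketch under stated assumptions, not the authors' pipeline. The helper names (`augment_rows`, `f_beta`), the entity identifiers, and the toy embedding vectors are all hypothetical, and plain concatenation is only one conceivable integration technique; the abstract does not specify which ones the paper uses. The F-beta formula itself is the standard one, with beta = 2 weighting recall more heavily than precision.

```python
from typing import Dict, List, Sequence


def augment_rows(
    rows: Sequence[Sequence[float]],
    entity_ids: Sequence[str],
    embeddings: Dict[str, Sequence[float]],
) -> List[List[float]]:
    """Append each row's entity embedding to its tabular features.

    Assumption: every row maps to exactly one KG entity whose embedding
    was computed beforehand (hypothetical setup for illustration).
    """
    return [list(row) + list(embeddings[eid]) for row, eid in zip(rows, entity_ids)]


def f_beta(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 2.0) -> float:
    """Standard F-beta score for binary labels (1 = positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy data: two patients with two tabular features each (made-up values).
    rows = [[63.0, 1.0], [45.0, 0.0]]
    entity_ids = ["patient_a", "patient_b"]  # hypothetical KG entity keys
    embeddings = {"patient_a": [0.1, -0.2, 0.3], "patient_b": [0.0, 0.5, -0.1]}

    augmented = augment_rows(rows, entity_ids, embeddings)
    print(len(augmented[0]))  # 2 tabular features + 3 embedding dimensions

    # One false negative, no false positives: recall 0.5, precision 1.0.
    print(round(f_beta([1, 1, 0, 0], [1, 0, 0, 0]), 4))
```

Because F2 penalizes false negatives more than false positives, it suits screening settings such as disease prediction, where a missed case is typically costlier than a false alarm.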
Pages: 279-295
Page count: 17