A Data-Driven Comparative Analysis of Machine-Learning Models for Familial Hypercholesterolemia Detection

Cited by: 0

Authors:

Kocejko, Tomasz [1]

Affiliations:

[1] Gdansk Univ Technol, Fac Elect Telecommun & Informat, Dept Biomed Engn, PL-80233 Gdansk, Poland

Source:

APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 23

Keywords:

machine learning; familial hypercholesterolemia; DLCN; model ensembles; DIAGNOSIS; POPULATION;

DOI:

10.3390/app142311187

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703 ;

Abstract:

Featured Application: The presented study can contribute to improving familial hypercholesterolemia classification and may help reduce the number of undiagnosed cases of the disease.

Abstract: This study presents an assessment of familial hypercholesterolemia (FH) probability using different algorithms (CatBoost, XGBoost, Random Forest, SVM) and their ensembles, leveraging electronic health record data. The primary objective is to explore an enhanced method for estimating FH probability that surpasses the currently recommended Dutch Lipid Clinic Network (DLCN) Score. The models were trained using the largest Polish cohort of patients enrolled in an FH clinic, all of whom underwent genetic testing for FH-associated mutations. The initial dataset comprised over 100 parameters per patient, which were reduced to 48 clinically accessible features to ensure applicability in routine outpatient settings. To preserve balance, the data were stratified according to DLCN score ranges (0-2, 3-5, 6-8, and >= 9), representing varying levels of FH likelihood. The dataset was then split into training and test sets with an 80/20 ratio. Machine-learning models were trained, with hyperparameters optimized via grid search. The accuracy of the DLCN score in predicting FH was first evaluated by examining the proportion of patients with positive DNA tests among those with a DLCN score of 6 and above, the threshold for genetic testing. The DLCN score demonstrated an accuracy of approximately 40%. In contrast, the CatBoost model and its ensembles achieved over 80% accuracy. While the DLCN score remains a clinically valuable tool, its diagnostic accuracy is limited. The findings indicate that the ML models offer a substantial improvement in the precision of FH diagnosis, demonstrating their potential to enhance clinical decision making in identifying patients with FH.

Pages: 13

50 items in total

[1] ANALYSIS OF PIEZOELECTRIC SEMICONDUCTORS VIA DATA-DRIVEN MACHINE-LEARNING TECHNIQUES
Guo, Yu-ting
Li, De-zhi
Zhang, Chun-li
PROCEEDINGS OF THE 2020 15TH SYMPOSIUM ON PIEZOELECTRICITY, ACOUSTIC WAVES AND DEVICE APPLICATIONS (SPAWDA), 2021 : 258 - 262
[2] Damage Detection with Data-Driven Machine Learning Models on an Experimental Structure
Alemu, Yohannes L.
Lahmer, Tom
Walther, Christian
ENG, 2024, 5 (02) : 629 - 656
[3] Data-Driven Machine-Learning Methods for Diabetes Risk Prediction
Dritsas, Elias
Trigka, Maria
SENSORS, 2022, 22 (14)
[4] Applications of machine learning in familial hypercholesterolemia
Luo, Ren-Fei
Wang, Jing-Hui
Hu, Li-Juan
Fu, Qing-An
Zhang, Si-Yi
Jiang, Long
FRONTIERS IN CARDIOVASCULAR MEDICINE, 2023, 10
[5] Personalized Tourist Recommender System: A Data-Driven and Machine-Learning Approach
Shrestha, Deepanjal
Tan, Wenan
Shrestha, Deepmala
Rajkarnikar, Neesha
Jeong, Seung-Ryul
COMPUTATION, 2024, 12 (03)
[6] Data-driven machine-learning analysis of potential embolic sources in embolic stroke of undetermined source
Ntaios, G.
Weng, S. F.
Perlepe, K.
Akyea, R.
Condon, L.
Lambrou, D.
Sirimarco, G.
Strambo, D.
Eskandari, A.
Karagkiozi, E.
Vemmou, A.
Korompoki, E.
Manios, E.
Makaritsis, K.
Vemmos, K.
Michel, P.
EUROPEAN JOURNAL OF NEUROLOGY, 2021, 28 (01) : 192 - 201
[7] Machine learning technique for data-driven fault detection of nonlinear processes
Said, Maroua
ben Abdellafou, Khaoula
Taouali, Okba
JOURNAL OF INTELLIGENT MANUFACTURING, 2020, 31 (04) : 865 - 884
[8] Data-Driven Blood Glucose Pattern Classification and Anomalies Detection: Machine-Learning Applications in Type 1 Diabetes
Woldaregay, Ashenafi Zebene
Arsand, Eirik
Botsis, Taxiarchis
Albers, David
Mamykina, Lena
Hartvigsen, Gunnar
JOURNAL OF MEDICAL INTERNET RESEARCH, 2019, 21 (05)
[9] Primary care clinician engagement in implementing a machine-learning algorithm for targeted screening of familial hypercholesterolemia
Kim, Kain
Faruque, Samir C.
Kulp, David
Lam, Shivani
Sperling, Laurence S.
Eapen, Danny J.
AMERICAN JOURNAL OF PREVENTIVE CARDIOLOGY, 2024, 19
[10] Comparative Investigation of Traditional Machine-Learning Models and Transformer Models for Phishing Email Detection
Melendez, Rene
Ptaszynski, Michal
Masui, Fumito
ELECTRONICS, 2024, 13 (24)