PREDICTION OF TYPE 2 DIABETES MELLITUS USING FEATURE SELECTION-BASED MACHINE LEARNING ALGORITHMS

Cited by: 2
Authors
Yilmaz, Atinc [1, 2]
Affiliations
[1] Beykent Univ, Dept Comp Engn, Istanbul, Turkey
[2] Beykent Univ, Dept Comp Engn, Hadim Koruyolu Cd 19, TR-34398 Istanbul, Turkey
Keywords
feature selection; health information system; type 2 diabetes; machine learning; nursing care; RISK; MODEL
DOI
10.5114/hpc.2022.114541
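Chinese Library Classification (CLC) number
R19 [Health care organization and services (health services administration)]
Subject classification code
Abstract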
Background. The aim of this study is to develop and evaluate a machine learning model for the early diagnosis of type 2 diabetes, so that treatment can begin in the early stages of the disease. Material and methods. A proposed hybrid machine learning model was developed and applied to the Early-stage diabetes risk prediction dataset from the UCI database. The prediction success of the proposed model was compared with that of other machine learning models. Pearson's correlation and SelectKBest feature selection methods were employed to examine the relationships between the dataset input parameters and the results. Results. Of the 520 patients included in the dataset, 320 were diagnosed with diabetes and 328 (63.08%) were males. The most commonly observed diabetes diagnosis criterion was obesity (n=482, 83.08%). While the strongest feature detected with Pearson's correlation was polyuria, the strongest feature detected with SelectKBest was polydipsia. With Pearson's correlation-based feature selection, the most successful machine learning method was the proposed hybrid method, with an accuracy of 97.28%. Using SelectKBest feature selection, the same model was able to predict type 2 diabetes with an accuracy of 95.16%. Conclusions. Early detection of type 2 diabetes will allow for prompter and more effective treatment of the patient. Thus, use of the proposed model may help to improve the quality of patient care and lower the number of deaths caused by this disease.
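As a rough illustration of the correlation-based ranking mentioned in the abstract, the sketch below scores each input feature by the absolute value of its Pearson correlation with the outcome and keeps the top k. It is not the author's pipeline: the function names (pearson, rank_features) and the toy data are hypothetical, the values are invented for the example and are not taken from the UCI dataset, and SelectKBest itself (a scikit-learn utility) is not reproduced here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(columns, target, k):
    """Return the k feature names with the largest |r| against the target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: 1 = symptom present, 0 = absent.
# These values are made up for illustration only.
toy_columns = {
    "polyuria":   [1, 1, 1, 0, 0, 0, 1, 0],
    "polydipsia": [1, 1, 0, 0, 0, 0, 1, 1],
    "obesity":    [0, 1, 0, 1, 0, 1, 1, 0],
}
toy_target = [1, 1, 1, 0, 0, 0, 1, 0]  # 1 = diabetic, 0 = not diabetic

top_two = rank_features(toy_columns, toy_target, k=2)
print(top_two)
```

Ranking by absolute correlation treats strongly negative and strongly positive associations alike, which is one common choice; a study could equally rank on signed values or on a different score function.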
Pages: 128-139
Number of pages: 12