Predicting Diabetes Mellitus With Machine Learning Techniques

Cited by: 367
Authors
Zou, Quan [1 ,2 ]
Qu, Kaiyang [1 ]
Luo, Yamei [3 ]
Yin, Dehui [3 ]
Ju, Ying [4 ]
Tang, Hua [5 ]
Affiliations
[1] Tianjin Univ, Sch Comp Sci & Technol, Tianjin, Peoples R China
[2] Univ Elect Sci & Technol China, Inst Fundamental & Frontier Sci, Chengdu, Sichuan, Peoples R China
[3] Southwest Med Univ, Sch Med Informat & Engn, Luzhou, Peoples R China
[4] Xiamen Univ, Sch Informat Sci & Technol, Xiamen, Peoples R China
[5] Southwest Med Univ, Sch Basic Med, Dept Pathophysiol, Luzhou, Peoples R China
Keywords
diabetes mellitus; random forest; decision tree; neural network; machine learning; feature ranking; RANDOM FOREST; FEATURE-SELECTION; DIAGNOSIS; CLASSIFICATION; EXTRACTION; TOOL;
DOI
10.3389/fgene.2018.00515
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Diabetes mellitus is a chronic disease characterized by hyperglycemia, and it may cause many complications. Given the growing morbidity in recent years, the number of diabetic patients worldwide is projected to reach 642 million by 2040, which means that one in ten adults will have diabetes. This alarming figure demands serious attention. With its rapid development, machine learning has been applied to many aspects of medical health. In this study, we used decision tree, random forest, and neural network models to predict diabetes mellitus. The dataset consists of hospital physical examination data from Luzhou, China, and contains 14 attributes. Five-fold cross-validation was used to evaluate the models. To verify the general applicability of the methods, we selected the better-performing methods for independent test experiments. We randomly selected 68,994 records each from healthy people and diabetic patients as the training set. Because the data were imbalanced, we repeated the random extraction five times and report the average of these five experiments. We used principal component analysis (PCA) and minimum redundancy maximum relevance (mRMR) to reduce the dimensionality. The results showed that prediction with random forest reached the highest accuracy (ACC = 0.8084) when all attributes were used.
Pages: 10
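
The evaluation protocol summarized in the abstract (balanced random extraction repeated five times, five-fold cross-validation on each draw, results averaged) might be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: the function names, the `fit_predict` callback, and the toy `threshold_classifier` are all hypothetical stand-ins for the decision tree, random forest, and neural network models used in the study.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, rng):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(X, y, fit_predict, k, rng):
    """Mean accuracy over k folds; each fold serves as the test set exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, rng):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        predictions = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        correct = sum(p == y[i] for p, i in zip(predictions, test_idx))
        scores.append(correct / len(test_idx))
    return mean(scores)


def repeated_balanced_evaluation(healthy, diabetic, n_per_class, fit_predict,
                                 repeats=5, k=5, seed=0):
    """Draw a balanced random sample `repeats` times and average the CV accuracy."""
    rng = random.Random(seed)
    run_scores = []
    for _ in range(repeats):
        # Equal-sized random draws from each class counter the class imbalance.
        X = rng.sample(healthy, n_per_class) + rng.sample(diabetic, n_per_class)
        y = [0] * n_per_class + [1] * n_per_class
        run_scores.append(cross_validated_accuracy(X, y, fit_predict, k, rng))
    return mean(run_scores), run_scores


def threshold_classifier(X_train, y_train, X_test):
    """Toy stand-in model (an assumption, not from the paper): split feature 0
    at the midpoint between the two class means."""
    mean_0 = mean(x[0] for x, label in zip(X_train, y_train) if label == 0)
    mean_1 = mean(x[0] for x, label in zip(X_train, y_train) if label == 1)
    cut = (mean_0 + mean_1) / 2
    return [int((x[0] > cut) == (mean_1 > mean_0)) for x in X_test]
```

Any model can be plugged in through `fit_predict`, which takes training features, training labels, and test features and returns predicted labels. Calling `repeated_balanced_evaluation(healthy, diabetic, n_per_class, threshold_classifier)` on two lists of feature tuples returns the averaged accuracy together with the five per-run scores.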
Related papers
50 in total
  • [1] Machine Learning Tree Classifiers in Predicting Diabetes Mellitus
    Vigneswari, D.
    Kumar, N. Komal
    Raj, V. Ganesh
    Gugan, A.
    Vikash, S. R.
    2019 5TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING & COMMUNICATION SYSTEMS (ICACCS), 2019, : 84 - 87
  • [2] Analysis of Diabetes mellitus using Machine Learning Techniques
    Bhat, Salliah Shafi
    Selvam, Venkatesan
    Ansari, Gufran Ahmad
    Ansari, Mohd Dilshad
    2022 5TH INTERNATIONAL CONFERENCE ON MULTIMEDIA, SIGNAL PROCESSING AND COMMUNICATION TECHNOLOGIES (IMPACT), 2022,
  • [3] Predicting Diabetes Using Machine Learning Techniques
    Kirgil, Elif Nur Haner
    Erkal, Begum
    Ayyildiz, Tulin Ercelebi
    2022 INTERNATIONAL CONFERENCE ON THEORETICAL AND APPLIED COMPUTER SCIENCE AND ENGINEERING (ICTASCE), 2022, : 137 - 141
  • [4] A Comparison of Machine Learning Techniques for the Detection of Type-2 Diabetes Mellitus: Experiences from Bangladesh
    Uddin, Md Jamal
    Ahamad, Md Martuza
    Hoque, Md Nesarul
    Walid, Md Abul Ala
    Aktar, Sakifa
    Alotaibi, Naif
    Alyami, Salem A.
    Kabir, Muhammad Ashad
    Moni, Mohammad Ali
    INFORMATION, 2023, 14 (07)
  • [5] Machine Learning Techniques for Diabetes Classification: A Comparative Study
    Mustafa, Hiri
    Mohamed, Chrayah
    Nabil, Ourdani
    Noura, Aknin
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (09) : 785 - 790
  • [6] Predicting Ion Channels Genes and Their Types With Machine Learning Techniques
    Han, Ke
    Wang, Miao
    Zhang, Lei
    Wang, Ying
    Guo, Mian
    Zh, Ming
    Zhao, Qian
    Zhang, Yu
    Zeng, Nianyin
    Wang, Chunyu
    FRONTIERS IN GENETICS, 2019, 10
  • [7] Applicability of Machine-Learning Techniques in Predicting Customer Defection
    Prasasti, Niken
    Ohwada, Hayato
    2014 1ST INTERNATIONAL SYMPOSIUM ON TECHNOLOGY MANAGEMENT AND EMERGING TECHNOLOGIES (ISTMET 2014), 2014, : 157 - 162
  • [8] Towards a Stacking Ensemble Model for Predicting Diabetes Mellitus using Combination of Machine Learning Techniques
    Alzubaidi, Abdulaziz A.
    Halawani, Sami M.
    Jarrah, Mutasem
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (12) : 348 - 358
  • [9] Machine Learning Refutes Loss of Smell as a Risk Indicator of Diabetes Mellitus
    Loetsch, Joern
    Haehner, Antje
    Schwarz, Peter E. H.
    Tselmin, Sergey
    Hummel, Thomas
    JOURNAL OF CLINICAL MEDICINE, 2021, 10 (21)
  • [10] Diabetes Classification Using Machine Learning Techniques
    Phongying, Methaporn
    Hiriote, Sasiprapa
    COMPUTATION, 2023, 11 (05)