XGBoost Model for Chronic Kidney Disease Diagnosis

Cited by: 442
Authors
Ogunleye, Adeola [1]
Wang, Qing-Guo [1]
Affiliations
[1] Univ Johannesburg, Fac Engn & Built Environm, Inst Intelligent Syst, Auckland Pk, South Africa
Funding
National Research Foundation, Singapore;
Keywords
Diseases; Kidney; Feature extraction; Computational modeling; Sociology; Statistics; Artificial intelligence; Medical diagnosis; chronic kidney disease; artificial intelligence; extreme gradient boosting; clinical decision support system; FUZZY EXPERT-SYSTEM;
DOI
10.1109/TCBB.2019.2911071
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Chronic Kidney Disease (CKD) affects 10 percent of the world population and 15 percent of the South African population. Early, inexpensive, accurate, and reliable diagnosis of this disease would save 20,000 lives per year in South Africa. Scientists are developing smart solutions with Artificial Intelligence (AI). In this paper, several typical and recent AI algorithms are studied in the context of CKD, and extreme gradient boosting (XGBoost) is chosen as the base model for its high performance. The model is then optimized, and the optimal full model trained on all the features achieves a testing accuracy, sensitivity, and specificity of 1.000, 1.000, and 1.000, respectively. To cover the widest range of people, the time and monetary costs of CKD diagnosis have to be minimized by using the fewest patient tests. A reduced model using fewer features is therefore desirable, provided it still maintains high performance. To this end, a set-theory based rule is presented that combines several feature selection methods to exploit their collective strengths. The reduced model, using about half of the original features, performs better than the models based on individual feature selection methods and achieves accuracy, sensitivity, and specificity of 1.000, 1.000, and 1.000, respectively.
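The three reported metrics and the idea of combining feature selections with set operations can be illustrated with a minimal Python sketch. The metric definitions are the standard ones for a binary classifier. The `combine_selections` helper is a hypothetical illustration only: the abstract does not state which set operations the paper's rule uses, so a simple vote threshold (union at one end, intersection at the other) is assumed here, and the feature names are placeholders.

```python
from collections import Counter


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Assumes both classes occur in y_true, so neither denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return accuracy, sensitivity, specificity


def combine_selections(selections, min_votes):
    """Hypothetical rule: keep features picked by at least `min_votes` methods.

    min_votes=1 gives the union of all selections;
    min_votes=len(selections) gives their intersection.
    """
    votes = Counter(f for selected in selections for f in set(selected))
    return {f for f, n in votes.items() if n >= min_votes}


# Placeholder feature names, one set per (unnamed) selection method.
selections = [{"f1", "f2", "f3"}, {"f2", "f3", "f4"}, {"f3", "f5"}]
reduced = combine_selections(selections, min_votes=2)
metrics = diagnostic_metrics([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

A perfect classifier, as reported in the abstract, would yield 1.0 for all three metrics; the toy labels above are deliberately imperfect so that the three values differ.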
Pages: 2131-2140
Page count: 10