Diabetes classification application with efficient missing and outliers data handling algorithms

Cited by: 6
Authors
Torkey, Hanaa [1]
Ibrahim, Elhossiny [1]
Hemdan, E. Z. Z. El-Din [1]
El-Sayed, Ayman [1]
Shouman, Marwa A. [1]
Affiliations
[1] Menoufia Univ, Dept Comp Sci & Engn, Fac Elect Engn, Menoufia, Egypt
Keywords
Diabetes; Imputation; Anomaly detection; MIoT and wearable devices; Flask
DOI
10.1007/s40747-021-00349-2
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Communication among the sensors distributed throughout healthcare systems can cause some of the transferred features to go missing. Repairing the data problems of sensing devices with artificial intelligence technologies has facilitated the Medical Internet of Things (MIoT) and its emerging applications in healthcare. MIoT has great potential to affect patients' lives. The volume of data collected from smart wearable devices grows dramatically as data arrive from millions of patients suffering from diseases such as diabetes. However, sensor or human errors cause some data values to be missing. The major challenge is how to predict these missing values so that the performance of the data analysis model stays within a good range. In this paper, a complete healthcare system for diabetics is used, and two new algorithms are developed to handle the crucial problem of missing data from MIoT wearable sensors. The proposed work integrates Random Forest, mean, class mean, interquartile range (IQR), and Deep Learning (DL) to produce a clean and complete dataset, which can enhance the performance of any machine learning model. Moreover, an outlier repair technique is proposed that is based on dataset class detection, with the outliers then repaired by DL. With the two steps of imputation and outlier repair, the final model achieves 97.41% accuracy and 99.71% Area Under the Curve (AUC). The healthcare system is a web-based diabetes classification application built with Flask, intended for use in hospitals and healthcare centers to diagnose patients effectively.
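The abstract names class-mean imputation and the interquartile range (IQR) among the building blocks. As a rough illustration only, the sketch below shows what those two generic steps can look like in plain Python. It is not the paper's two algorithms (which also involve Random Forest and Deep Learning), and the function names, the `glucose` and `outcome` column names, the fallback to the overall mean, and the multiplier `k=1.5` are all assumptions made for this example.

```python
from statistics import mean, quantiles


def class_mean_impute(rows, feature, label):
    """Return copies of rows with missing (None) feature values filled in.

    Each gap is filled with the mean of the observed values that share the
    same class label. If a class has no observed value, the overall mean of
    the feature is used instead (an assumption of this sketch).
    """
    observed = [r[feature] for r in rows if r[feature] is not None]
    overall = mean(observed)
    by_class = {}
    for r in rows:
        if r[feature] is not None:
            by_class.setdefault(r[label], []).append(r[feature])
    class_means = {c: mean(v) for c, v in by_class.items()}
    filled = []
    for r in rows:
        new_row = dict(r)
        if new_row[feature] is None:
            new_row[feature] = class_means.get(new_row[label], overall)
        filled.append(new_row)
    return filled


def iqr_outlier_flags(values, k=1.5):
    """Flag values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical toy records; the values are invented for illustration.
rows = [
    {"glucose": 90.0, "outcome": 0},
    {"glucose": None, "outcome": 0},
    {"glucose": 110.0, "outcome": 0},
    {"glucose": 150.0, "outcome": 1},
    {"glucose": None, "outcome": 1},
    {"glucose": 170.0, "outcome": 1},
]
filled = class_mean_impute(rows, "glucose", "outcome")
flags = iqr_outlier_flags([r["glucose"] for r in filled])
```

In this sketch, flagged values would be candidates for repair; the abstract states that the paper performs that repair with Deep Learning, which is not reproduced here.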
Pages: 237-253
Page count: 17