Diabetes classification application with efficient missing and outliers data handling algorithms

Cited by: 0
Authors
Hanaa Torkey
Elhossiny Ibrahim
EZZ El-Din Hemdan
Ayman El-Sayed
Marwa A. Shouman
Affiliations
[1] Faculty of Electronic Engineering, Department of Computer Science and Engineering
[2] Menoufia University
Source
Complex & Intelligent Systems | 2022 / Vol. 8
Keywords
Diabetes; Imputation; Anomaly detection; MIoT and wearable devices; Flask
DOI
Not available
Abstract
Communication between sensors distributed throughout healthcare systems can cause some of the transferred features to go missing. Repairing such data problems in sensing devices with artificial intelligence technologies has facilitated the Medical Internet of Things (MIoT) and its emerging applications in healthcare. MIoT has great potential to affect patients' lives. The volume of data collected from smart wearable devices grows dramatically as it is gathered from millions of patients suffering from diseases such as diabetes. However, sensor or human errors cause some values in the data to be missing. The major challenge is how to predict these values so that the performance of the data analysis model stays within a good range. In this paper, a complete healthcare system for diabetics is used, and two new algorithms are developed to handle the crucial problem of missing data from MIoT wearable sensors. The proposed work integrates Random Forest, mean, class mean, interquartile range (IQR), and Deep Learning to produce a clean and complete dataset, which can enhance the performance of any machine learning model. Moreover, an outlier repair technique is proposed that detects outliers based on the dataset class and then repairs them with Deep Learning (DL). With the two steps of imputation and outlier repair, the final model achieves an accuracy of 97.41% and an Area Under Curve (AUC) of 99.71%. The healthcare system used is a web-based diabetes classification application, built with Flask, intended for use in hospitals and healthcare centers to diagnose patients effectively.
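The abstract names class mean imputation and the interquartile range (IQR) among the building blocks but does not give the algorithms themselves. The following is a minimal sketch of those two generic techniques only, not the authors' method: it fills each missing value with the mean of that feature within the row's class, and flags values outside the per-class IQR fences. The function names, the `glucose` and `outcome` column names, and the fence multiplier `k = 1.5` are all assumptions chosen for illustration, and the Random Forest and Deep Learning stages are not shown.

```python
import numpy as np
import pandas as pd


def impute_class_mean(df: pd.DataFrame, label_col: str) -> pd.DataFrame:
    """Fill missing feature values with the mean of that feature within the row's class.

    Hypothetical helper: assumes numeric features and a label column with no gaps.
    """
    out = df.copy()
    feature_cols = [c for c in out.columns if c != label_col]
    # Per-class means, broadcast back to one value per row (NaNs are skipped).
    class_means = out.groupby(label_col)[feature_cols].transform("mean")
    out[feature_cols] = out[feature_cols].fillna(class_means)
    return out


def flag_iqr_outliers(df: pd.DataFrame, label_col: str, k: float = 1.5) -> pd.DataFrame:
    """Return a boolean mask marking values outside the per-class IQR fences.

    Hypothetical helper: k = 1.5 is the conventional multiplier, assumed here.
    """
    feature_cols = [c for c in df.columns if c != label_col]
    grouped = df.groupby(label_col)[feature_cols]
    # Look up each row's class quartiles, aligned row by row with df.
    q1 = grouped.quantile(0.25).loc[df[label_col]].to_numpy()
    q3 = grouped.quantile(0.75).loc[df[label_col]].to_numpy()
    iqr = q3 - q1
    values = df[feature_cols].to_numpy()
    mask = (values < q1 - k * iqr) | (values > q3 + k * iqr)
    return pd.DataFrame(mask, index=df.index, columns=feature_cols)


if __name__ == "__main__":
    # Toy data with hypothetical column names, for illustration only.
    toy = pd.DataFrame(
        {
            "glucose": [100.0, np.nan, 110.0, 180.0, 190.0, np.nan],
            "outcome": [0, 0, 0, 1, 1, 1],
        }
    )
    print(impute_class_mean(toy, "outcome"))
    print(flag_iqr_outliers(toy, "outcome"))
```

In a pipeline like the one the abstract describes, the flagged cells would then be handed to a repair step; here they are only marked, since the repair model is not specified in this record.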
Pages: 237-253
Page count: 16