Diabetes classification application with efficient missing and outliers data handling algorithms

Cited by: 6
Authors
Torkey, Hanaa [1]
Ibrahim, Elhossiny [1]
Hemdan, E. Z. Z. El-Din [1]
El-Sayed, Ayman [1]
Shouman, Marwa A. [1]
Affiliations
[1] Menoufia Univ, Dept Comp Sci & Engn, Fac Elect Engn, Menoufia, Egypt
Keywords
Diabetes; Imputation; Anomaly detection; MIoT and wearable devices; Flask
DOI
10.1007/s40747-021-00349-2
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Communication between the sensors distributed throughout healthcare systems can cause some of the transferred features to go missing. Repairing the data problems of sensing devices with artificial intelligence technologies has facilitated the Medical Internet of Things (MIoT) and its emerging applications in healthcare. MIoT has great potential to affect patients' lives. The volume of data collected from smart wearable devices increases dramatically as data arrive from millions of patients suffering from diseases such as diabetes. However, sensor or human errors cause some values to be missing from the data. The major challenge is how to predict these values so that the performance of the data analysis model stays within a good range. In this paper, a complete healthcare system for diabetics is used, and two new algorithms are developed to handle the crucial problem of missing data from MIoT wearable sensors. The proposed work integrates Random Forest, mean, class mean, interquartile range (IQR), and Deep Learning (DL) to produce a clean and complete dataset, which can enhance the performance of any machine learning model. Moreover, an outlier repair technique is proposed that is based on dataset class detection, with the detected outliers then repaired by DL. With the two steps of imputation and outlier repair, the final model achieves an accuracy of 97.41% and an Area Under the Curve (AUC) of 99.71%. The healthcare system used is a web-based diabetes classification application built with Flask, intended for use in hospitals and healthcare centers to diagnose patients effectively.
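The abstract names class mean imputation and the interquartile range (IQR) among the building blocks of the proposed pipeline. The following is a minimal Python sketch of those two ingredients only, using the standard library. It is not the authors' algorithm: the Random Forest and Deep Learning stages are omitted, the function names `class_mean_impute` and `iqr_outliers` are hypothetical, and the 1.5 x IQR fence is the conventional rule of thumb rather than a value taken from the paper.

```python
from statistics import mean, quantiles


def class_mean_impute(rows, labels):
    """Replace None entries with the mean of that feature within the row's class.

    Assumes every feature has at least one observed value. If a class has no
    observed value for a feature, the overall feature mean is used instead.
    """
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        # Collect the observed values of feature j, grouped by class label.
        by_class = {}
        for r, y in zip(rows, labels):
            if r[j] is not None:
                by_class.setdefault(y, []).append(r[j])
        overall = mean(v for vals in by_class.values() for v in vals)
        for i, y in enumerate(labels):
            if filled[i][j] is None:
                filled[i][j] = mean(by_class[y]) if y in by_class else overall
    return filled


def iqr_outliers(values, k=1.5):
    """Return the indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Toy data (invented for illustration): two features, None marks a missing reading.
rows = [[100, None], [120, 30.0], [None, 40.0], [180, 50.0]]
labels = [0, 0, 1, 1]
filled = class_mean_impute(rows, labels)
suspect = iqr_outliers([10, 12, 11, 13, 12, 100])
```

In a pipeline like the one the abstract describes, indices flagged by the IQR rule would be handed to a repair step instead of being dropped; here they are simply returned.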
Pages: 237-253
Page count: 17