Mahalanobis Distance Based Multivariate Outlier Detection to Improve Performance of Hypertension Prediction

Cited by: 19
Authors
Dashdondov, Khongorzul [1]
Kim, Mi-Hye [1]
Affiliations
[1] Chungbuk Natl Univ, Chungbuk 28644, South Korea
Keywords
KNHANES; Hypertension; Mahalanobis distance; Detection; RF
DOI
10.1007/s11063-021-10663-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the incidence of hypertension has increased dramatically, not only among the elderly but also among young people, and machine learning methods are increasingly used to diagnose its causes. In this article, we improve hypertension prediction by applying Mahalanobis distance-based multivariate outlier removal to Korean national health data from the KNHANES database. The study identified a variety of risk factors associated with chronic hypertension. Chronic disease is often caused by many factors, not just one, so detection of the disease must take complex factors into account. The work is divided into two modules. The first is a data preprocessing step that applies tree classifier-based feature selection and removes multivariate outliers from the KNHANES data using the Mahalanobis distance. The second applies predictive analysis to detect and predict hypertension. We compare the accuracy, mean standard error (MSE), F1-score, and area under the ROC curve (AUC) of each classification model. The test results show that the proposed RF-MAH algorithm achieves an accuracy, F1-score, MSE, and AUC of 99.48%, 99.62%, 0.0025, and 99.61%, respectively. Following these, the second-best outcomes, an accuracy of 99.51%, MSE of 0.0028, F1-score of 99.58%, and AUC of 99.65%, were achieved by XGBoost with the MAH model. The proposed method can be used not only for hypertension but also for the detection of various other diseases, such as stroke and cardiovascular disease. It is intended to support the identification of high-risk patients with various diseases and the related decision-making.
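The outlier-removal step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic data, and the cutoff rule (keeping rows whose squared distance falls at or below an empirical quantile) are assumptions made for illustration, since the abstract does not state which threshold was used. The squared Mahalanobis distance of a row x from the sample mean m with covariance S is (x - m)^T S^-1 (x - m).

```python
import numpy as np


def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row of X from the sample mean."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    # The pseudo-inverse tolerates a singular covariance (e.g. collinear features).
    cov_inv = np.linalg.pinv(cov)
    diff = X - mean
    # Row-wise quadratic form diff_i^T * cov_inv * diff_i.
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)


def remove_outliers(X, y, keep_quantile=0.975):
    """Drop rows whose squared distance exceeds an empirical quantile.

    keep_quantile is a hypothetical parameter; the abstract does not give
    the cutoff actually used in the paper.
    """
    d2 = mahalanobis_sq(X)
    cutoff = np.quantile(d2, keep_quantile)
    mask = d2 <= cutoff
    return np.asarray(X)[mask], np.asarray(y)[mask], mask


# Synthetic stand-in for tabular health features (not KNHANES data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[0] = [50.0, 50.0, 50.0]          # one obvious multivariate outlier
y = rng.integers(0, 2, size=200)   # placeholder binary labels

X_clean, y_clean, mask = remove_outliers(X, y)
print(X.shape[0] - X_clean.shape[0], "rows removed")
```

The filtered `X_clean` and `y_clean` would then be passed to a classifier such as a random forest. A chi-square cutoff with degrees of freedom equal to the number of features is a common alternative to the empirical quantile used here.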
Pages: 265-277
Page count: 13