Mahalanobis Distance Based Multivariate Outlier Detection to Improve Performance of Hypertension Prediction

Cited by: 0
Authors
Khongorzul Dashdondov
Mi-Hye Kim
Affiliation
[1] Chungbuk National University
Source
Neural Processing Letters | 2023 / Vol. 55
Keywords
KNHANES; Hypertension; Mahalanobis distance; Detection; RF;
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
In recent years, the incidence of hypertension has increased dramatically, not only among the elderly but also among young people, and machine learning methods are increasingly used to identify its causes. In this article, we improve hypertension prediction by using the Mahalanobis distance to remove multivariate outliers from Korean national health data in the KNHANES database. The study identified a variety of risk factors associated with chronic hypertension. Chronic disease is often caused by many factors rather than one, so detection must take complex factors into account. The proposed approach consists of two modules. The first is a data preprocessing step that applies tree classifier-based feature selection and removes multivariate outliers from the KNHANES data using the Mahalanobis distance. The second applies predictive analysis to detect and predict hypertension. We compare the accuracy, mean squared error (MSE), F1-score, and area under the ROC curve (AUC) of each classification model. The test results show that the proposed RF-MAH algorithm achieves an accuracy, F1-score, MSE, and AUC of 99.48%, 99.62%, 0.0025, and 99.61%, respectively. The second-best results, an accuracy of 99.51%, MSE of 0.0028, F1-score of 99.58%, and AUC of 99.65%, were achieved by XGBoost with the MAH model. The proposed method can be used not only for hypertension but also for detecting other diseases, such as stroke and cardiovascular disease. We plan to use it to support the identification of high-risk patients with various diseases and the related decision-making.
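The outlier-removal step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `mahalanobis_sq` and `remove_outliers`, the synthetic two-feature data, and the chi-square cutoff at the 0.975 quantile are all assumptions made for illustration. The squared Mahalanobis distance of a sample x from the mean μ with covariance Σ is (x − μ)ᵀ Σ⁻¹ (x − μ); rows whose distance exceeds the cutoff are dropped before a classifier such as a random forest is trained.

```python
import numpy as np

def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row of X from the sample mean."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    cov_inv = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = X - mu
    # Row-wise quadratic form: diff_i^T @ cov_inv @ diff_i
    return np.einsum("ij,jk,ik->i", diff, cov_inv, diff)

def remove_outliers(X, y, threshold):
    """Keep only rows whose squared distance is at most `threshold`."""
    keep = mahalanobis_sq(X) <= threshold
    return X[keep], y[keep], keep

# Hypothetical data: 200 well-behaved samples plus one extreme point.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(200, 2)), [[10.0, -10.0]]])
y = np.zeros(len(X), dtype=int)

# Assumed cutoff: 0.975 quantile of a chi-square distribution with
# 2 degrees of freedom, which has the closed form -2 * ln(1 - p).
threshold = -2.0 * np.log(1.0 - 0.975)

X_clean, y_clean, keep = remove_outliers(X, y, threshold)
print(f"kept {keep.sum()} of {len(X)} rows")
```

With more than two features the closed-form cutoff no longer applies, and the chi-square quantile for the matching number of degrees of freedom would be needed instead. The cleaned `X_clean` and `y_clean` would then be passed to the classification stage.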
Pages: 265-277
Number of pages: 12