Performance evaluation of optimal ensemble learning approaches with PCA and LDA-based feature extraction for heart disease prediction

Cited by: 3
Authors
Rabbi, Md. Sakhawat Hossain [1]
Bari, Md. Masbahul [1]
Debnath, Tanoy [2]
Rahman, Anichur [3, 4]
Das, Avik Kumar [1]
Hossain, Md. Parvez [1]
Muhammad, Ghulam [5]
Affiliations
[1] Green Univ Bangladesh, Dept Comp Sci & Engn, Dhaka, Bangladesh
[2] SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794 USA
[3] Univ Dhaka, Natl Inst Text Engn & Res NITER, Dept Comp Sci & Engn, Constituent Inst, Dhaka 1350, Bangladesh
[4] Mawlana Bhashani Sci & Technol Univ, Dept Comp Sci & Engn, Tangail, Bangladesh
[5] King Saud Univ, Coll Comp & Informat Sci, Dept Comp Engn, Riyadh, Saudi Arabia
Keywords
Machine learning; Prediction method; Heart disease detection; Ensemble methods; Feature selection; Data balancing and data analysis; SYSTEM;
DOI
10.1016/j.bspc.2024.107138
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Heart disease is a global health concern with a high mortality rate, necessitating early, accurate, and reliable prediction methods for effective prevention and control. In this research, we combine principal component analysis and linear discriminant analysis to reduce dataset complexity and enhance the performance of heart disease classification models by selecting the most relevant features. We address the class imbalance by employing two balancing techniques: oversampling and the synthetic minority oversampling technique, which ensures a more representative dataset, leading to more accurate predictions. Our study develops a novel ensemble approach, utilizing a combination of random forest, support vector machine, K-nearest neighbors, logistic regression, decision tree, and Gaussian naive Bayes to significantly improve heart disease prediction accuracy. Furthermore, we implement advanced ensemble learning techniques, such as Stacking, Bagging, Voting, and Boosting, to achieve early and precise prediction of heart disease. The performance evaluation is conducted on three datasets: Cleveland Heart Disease, Framingham Heart Disease, and Indicators of Heart Disease Dataset (2020), ensuring a robust validation of our methods. The results demonstrate that the voting ensemble machine learning algorithm (VEMLA) achieved 92% accuracy on the Cleveland Heart Disease dataset, while the bagging ensemble machine learning algorithm (BEMLA) achieved 97% accuracy on both the Framingham Heart Disease and Indicators of Heart Disease (2020) datasets. Notably, the proposed BEMLA consistently outperformed other methods, showcasing its superiority in heart disease prediction. This study contributes a comprehensive and effective approach to heart disease diagnosis, outperforming individual classifiers and providing valuable insights for practical clinical applications.
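Two of the ideas named in the abstract, balancing classes by oversampling and combining classifiers by voting, can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names `random_oversample` and `hard_vote`, the toy rows, and the example predictions are all hypothetical, and the sketch uses plain random duplication rather than SMOTE. It also omits the PCA/LDA step and the six base classifiers listed above.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until every
    class has as many rows as the largest one (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        # Sample with replacement from the class to fill the gap.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def hard_vote(predictions):
    """Majority vote per sample over several classifiers' label lists.
    Ties (impossible with three voters and two labels) would go to the
    label seen first."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Made-up toy data: four negative rows and one positive row.
rows = [[63, 233], [41, 204], [56, 236], [57, 192], [67, 286]]
labels = [0, 0, 0, 0, 1]
balanced_rows, balanced_labels = random_oversample(rows, labels)

# Made-up predictions from three base classifiers on four samples.
predictions = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 1, 0],
]
ensemble = hard_vote(predictions)
```

In practice, oversampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.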
Pages: 21