Ensemble Learning for Disease Prediction: A Review

Cited by: 67
Authors
Mahajan, Palak [1]
Uddin, Shahadat [2]
Hajati, Farshid [1]
Moni, Mohammad Ali [3]
Affiliations
[1] Victoria Univ, Coll Engn & Sci, Sydney, NSW 2000, Australia
[2] Univ Sydney, Fac Engn, Sch Project Management, Forest Lodge, NSW 2037, Australia
[3] Univ Queensland, Fac Hlth & Behav Sci, Sch Hlth & Rehabil Sci, St Lucia, Qld 4072, Australia
Keywords
machine learning; bagging; boosting; stacking; voting; disease prediction; MODEL
DOI
10.3390/healthcare11121808
Chinese Library Classification
R19 [Health care organizations and services (health services administration)]
Abstract
Machine learning models are used to create and enhance various disease prediction frameworks. Ensemble learning is a machine learning technique that combines multiple classifiers to make more accurate predictions than a single classifier. Although numerous studies have employed ensemble approaches for disease prediction, commonly used ensemble approaches have not been thoroughly assessed against heavily researched diseases. Consequently, this study aims to identify significant trends in the accuracy of four ensemble techniques (bagging, boosting, stacking, and voting) across five heavily researched diseases (diabetes, skin disease, kidney disease, liver disease, and heart conditions). Using a well-defined search strategy, we first identified 45 articles from the current literature that applied two or more of the four ensemble approaches to any of these five diseases and were published between 2016 and 2023. Although stacking was used the fewest times (23, compared with 41 for bagging and 37 for boosting), it most often delivered the most accurate performance (19 out of 23 times). The voting approach is the second-best ensemble approach, as revealed in this review. Stacking always showed the most accurate performance in the reviewed articles for skin disease and diabetes. Bagging demonstrated the best performance for kidney disease (five out of six times) and boosting for liver disease and diabetes (four out of six times). The results show that stacking has demonstrated greater accuracy in disease prediction than the other three candidate algorithms.
The findings of this work will assist researchers in better understanding current trends and hotspots in disease prediction models that employ ensemble learning, as well as in determining a more suitable ensemble model for predictive disease analytics. This article also discusses variability in the perceived performance of different ensemble approaches against frequently used disease datasets.
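As a rough illustration of the abstract's description of ensemble learning as combining multiple classifiers, the sketch below shows hard (majority) voting, the simplest of the four approaches named above. It is not taken from the reviewed article: the three threshold rules, their cut-off values, the feature order `(glucose, bmi, age)`, and the function name `majority_vote` are all hypothetical and have no clinical meaning.

```python
from collections import Counter
from typing import Callable, Sequence

# A "classifier" here is just a function mapping a feature vector to a label.
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: Sequence[Classifier], sample: Sequence[float]) -> int:
    """Return the label predicted by the largest number of classifiers."""
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical base classifiers over (glucose, bmi, age).
# The thresholds are illustrative assumptions, not clinical guidance.
def glucose_rule(x: Sequence[float]) -> int:
    return 1 if x[0] >= 126 else 0


def bmi_rule(x: Sequence[float]) -> int:
    return 1 if x[1] >= 30 else 0


def age_rule(x: Sequence[float]) -> int:
    return 1 if x[2] >= 50 else 0


ensemble = [glucose_rule, bmi_rule, age_rule]
patient = (140.0, 24.0, 61.0)  # made-up sample
prediction = majority_vote(ensemble, patient)
print(prediction)
```

With an odd number of binary voters there are no ties, which is why three rules are used here. Bagging, boosting, and stacking differ in how the base classifiers are trained and combined, and they are not shown in this sketch.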
Pages: 21