Comparative analysis of machine learning and ensemble approaches for hepatitis B prediction using data mining with synthetic minority oversampling technique

Cited by: 0
Authors
Alizargar, Azadeh [1 ]
Chang, Yang-Lang [1 ]
Tan, Tan-Hsu [1 ]
Liu, Tsung-Yu [2 ]
Affiliations
[1] Natl Taipei Univ Technol, Coll Elect Engn & Comp Sci, Dept Elect Engn, Taipei 10608, Taiwan
[2] Lunghwa Univ Sci & Technol, Dept Multimedia & Game Sci, Taoyuan 333326, Taiwan
Keywords
Hepatitis B; Liver damage; Early detection; Machine learning; Ensemble model; SMOTE; RISK; DIAGNOSIS; VIRUS
DOI
10.1007/s12553-023-00802-x
CLC number
R-058
Abstract
Purpose: Hepatitis B, caused by the Hepatitis B virus (HBV), can harm the liver without noticeable symptoms. Early detection is crucial to prevent transmission and improve recovery. The main goal is to predict Hepatitis B from cost-effective laboratory test data using machine learning. The primary focus is on evaluating how effectively various algorithms predict the disease and how much they could improve early diagnosis.
Methods: Six distinct algorithms (Support Vector Machine, K-Nearest Neighbors, Logistic Regression, decision tree, extreme gradient boosting, random forest) were employed alongside an ensemble model. The analysis was run in two rounds: one using all features and one using only key attributes. The Synthetic Minority Oversampling Technique (SMOTE) was employed to address class imbalance. Various metrics, including the confusion matrix, precision, recall, F1 score, accuracy, receiver operating characteristic (ROC) curve, area under the curve (AUC), and mean absolute error (MAE), were used to assess the efficacy of each predictive technique. The National Health and Nutrition Examination Survey (NHANES) dataset was employed.
Results: The experimental results demonstrate that the ensemble model attained the highest accuracy (97%) and AUC (0.997) in comparison to existing models. The analysis revealed that specific crucial features carry substantial predictive significance within this model.
Conclusion: The study underscores the potential of the ensemble model as a valuable tool for medical practitioners, using cost-effective and readily obtainable laboratory test data to predict Hepatitis B with high accuracy. By facilitating early diagnosis and intervention, this research presents a promising avenue to improve patient outcomes in the context of Hepatitis B.
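The oversampling step named in the Methods can be illustrated with a minimal sketch of the SMOTE idea: each synthetic minority row is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' code; the abstract does not say which implementation or settings were used. The function name `smote`, its parameters, and the toy data below are all assumptions made for illustration, and the sketch uses only the Python standard library.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority samples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        # Choose one of the k nearest neighbours and interpolate toward it.
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # uniform in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data (not NHANES): 8 negative rows, 3 positive rows, 2 features.
negatives = [[float(x), float(x % 3)] for x in range(8)]
positives = [[10.0, 1.0], [11.0, 2.0], [12.0, 1.5]]

# Generate enough synthetic positives to match the majority class size.
new_rows = smote(positives, n_new=len(negatives) - len(positives), k=2)
balanced_positives = positives + new_rows
print(len(negatives), len(balanced_positives))
```

In a real pipeline, oversampling of this kind is normally applied to the training split only, so that synthetic rows do not leak into the data used to compute accuracy or AUC.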
Pages: 109-118
Page count: 10