Enhancing fairness in breast cancer recurrence prediction through temporal machine learning models

Authors
Sundus, Katrina I. [1]
Hammo, Bassam H. [1, 2]
Al-Zoubi, Mohammad B. [1]
Affiliations
[1] King Abdullah II School of Information Technology, The University of Jordan, Amman
[2] King Hussein School of Computing Sciences, Princess Sumaya University for Technology, Amman
Keywords
Breast cancer recurrence; Ensemble learning; SMOTE; Temporal data; Under-sampling
DOI
10.1007/s00521-024-10407-8
Abstract
Breast cancer recurrence prediction is a significant challenge in oncology. Advanced methodologies are required to improve prediction accuracy and clinical decision-making. This study presents a novel approach to breast cancer recurrence prediction by integrating machine learning techniques and a hybrid data mining methodology incorporating a temporal dimension into dataset derivation. Our research is based on the Jordan Breast Cancer Dataset (JBRCA), which includes over 44,000 cases spanning 15 years collected from the King Hussein Cancer Center’s registry database in Amman, Jordan. The proposed methodology encompasses data understanding, preparation, and model development stages. We use a thorough data preparation process involving multicollinearity feature selection, feature scaling, and strategic sampling to address dataset challenges. Moreover, we introduce a temporal-derived dataset strategy, dividing the data into four distinct time intervals to capture evolving characteristics and optimize model relevance. We employ diverse base classifiers and ensemble methods to enhance predictive performance in model development. We use evaluation metrics such as accuracy, recall, specificity, G-mean, and ROC-AUC to assess model efficacy across temporal intervals. Our experimental findings reveal significant impacts on classifier performance with temporal dataset derivation, with notable strengths observed in specific classifiers and temporal intervals. For instance, the Naive Bayes model demonstrates efficacy in identifying recurrence cases, while logistic regression exhibits robust performance in ROC-AUC and G-mean metrics. Our study contributes to breast cancer recurrence prediction by introducing a novel methodology that addresses dataset challenges and leverages temporal insights for enhanced predictive accuracy. The findings have a direct impact on clinical practice, providing valuable tools for early detection and improved therapy planning. 
© The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.
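The abstract mentions two steps that are easy to make concrete: deriving four temporal subsets from a 15-year registry, and scoring a classifier with recall, specificity, and G-mean. The following is a minimal sketch under stated assumptions, not the authors' implementation. The equal-width binning, the `start_year` value, and the record field `year` are all hypothetical; the abstract does not say how the interval boundaries were chosen.

```python
from math import sqrt


def assign_interval(year, start_year, n_intervals=4, span_years=15):
    """Map a year to a 0-based interval index using near-equal-width bins.

    Assumption: bins are equal-width over the registry span; the paper's
    actual interval boundaries are not given in the abstract.
    """
    offset = year - start_year
    if not 0 <= offset < span_years:
        raise ValueError(f"year {year} is outside the {span_years}-year span")
    return offset * n_intervals // span_years


def split_by_interval(records, start_year, n_intervals=4, span_years=15):
    """Group records (dicts with a hypothetical 'year' key) by interval."""
    buckets = [[] for _ in range(n_intervals)]
    for rec in records:
        idx = assign_interval(rec["year"], start_year, n_intervals, span_years)
        buckets[idx].append(rec)
    return buckets


def recall_specificity_gmean(y_true, y_pred):
    """Return (recall, specificity, G-mean) for binary labels, 1 = recurrence.

    G-mean is the geometric mean of recall and specificity, so it stays low
    unless both the minority and the majority class are predicted well.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return recall, specificity, sqrt(recall * specificity)
```

In this sketch each temporal subset would be scored separately by calling `recall_specificity_gmean` on that subset's labels and predictions, which mirrors the per-interval evaluation the abstract describes. G-mean is useful here because recurrence cases are the minority class, and plain accuracy can look high even when few of them are detected.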
Pages: 22697-22718
Page count: 21