Machine learning-based models for the prediction of breast cancer recurrence risk

Cited by: 23
Authors
Zuo, Duo [1, 2, 3, 4, 5]
Yang, Lexin [1, 2, 3, 4, 5]
Jin, Yu [1, 6]
Qi, Huan [7]
Liu, Yahui [1, 2, 3, 4, 5]
Ren, Li [1, 2, 3, 4, 5]
Affiliations
[1] Tianjin Med Univ, Dept Clin Lab, Canc Inst & Hosp, Tianjin 300060, Peoples R China
[2] Natl Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[3] Tianjins Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[4] Key Lab Canc Prevent & Therapy, Tianjin 300060, Peoples R China
[5] Tianjin Med Univ, Key Lab Breast Canc Prevent & Therapy, Minist Educ, Tianjin 300060, Peoples R China
[6] Tongji Univ, Canc Ctr, Shanghai Peoples Hosp 10, Sch Med, Shanghai 200072, Peoples R China
[7] China Mobile Grp Tianjin Co Ltd, Tianjin 300130, Peoples R China
Keywords
Breast cancer; Machine learning; Artificial intelligence; Disease recurrence; Prediction model; PLASMA-FIBRINOGEN LEVEL; ARTIFICIAL-INTELLIGENCE; HEALTH-CARE; FOLLOW-UP; SURVIVAL; OVARIAN; CA125; CLASSIFICATION; PROGNOSIS; INDICATOR
DOI
10.1186/s12911-023-02377-z
Chinese Library Classification (CLC) number
R-058
Abstract
Breast cancer is the most common malignancy diagnosed in women worldwide. Its prevalence and incidence are increasing every year; early diagnosis together with appropriate detection of relapse is therefore an important strategy for improving prognosis. This study aimed to compare different machine learning (ML) algorithms and select the best model for predicting breast cancer recurrence. Prediction models were developed using eleven ML algorithms: logistic regression (LR), random forest (RF), support vector classification (SVC), extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT), decision tree, multilayer perceptron (MLP), linear discriminant analysis (LDA), adaptive boosting (AdaBoost), Gaussian naive Bayes (GaussianNB), and light gradient boosting machine (LightGBM). The area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score were used to evaluate the performance of each model. The best-performing algorithm was selected on these metrics, and feature importance was ranked by SHapley Additive exPlanations (SHAP) values. Of the eleven algorithms, AdaBoost showed the best performance in predicting breast cancer recurrence and was adopted to build the prediction model. CA125, CEA, Fbg, and tumor diameter were found to be the most important features in our dataset for predicting recurrence. More importantly, our study is the first to use the SHAP method to make an AdaBoost-based breast cancer recurrence prediction model more interpretable for clinicians. The AdaBoost model offers clinical decision support and successfully identifies breast cancer recurrence.
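The evaluation metrics listed in the abstract can be illustrated with a short Python sketch. This is not the authors' code: the function names `binary_metrics` and `roc_auc` and the toy labels and scores are hypothetical, chosen only to show how each metric follows from predicted and observed recurrence labels (1 = recurrence, 0 = no recurrence). The AUC here is computed by direct pairwise comparison, which is only practical for small samples.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for binary labels (1 = recurrence, 0 = none)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num: int, den: int) -> float:
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # recall for the recurrence class
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: observed recurrence labels and model risk scores.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # assumed 0.5 cutoff

print(binary_metrics(toy_labels, toy_preds))
print("AUC:", roc_auc(toy_labels, toy_scores))
```

Computing the same set of metrics for every candidate classifier on a held-out test set is one way to carry out the kind of side-by-side comparison the abstract describes; the 0.5 cutoff above is an assumption, and the threshold-based metrics change with it while the AUC does not.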
Pages: 14