Machine learning-based models for the prediction of breast cancer recurrence risk

Cited by: 23
Authors
Zuo, Duo [1, 2, 3, 4, 5]
Yang, Lexin [1, 2, 3, 4, 5]
Jin, Yu [1, 6]
Qi, Huan [7]
Liu, Yahui [1, 2, 3, 4, 5]
Ren, Li [1, 2, 3, 4, 5]
Affiliations
[1] Tianjin Med Univ, Dept Clin Lab, Canc Inst & Hosp, Tianjin 300060, Peoples R China
[2] Natl Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[3] Tianjins Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[4] Key Lab Canc Prevent & Therapy, Tianjin 300060, Peoples R China
[5] Tianjin Med Univ, Key Lab Breast Canc Prevent & Therapy, Minist Educ, Tianjin 300060, Peoples R China
[6] Tongji Univ, Canc Ctr, Shanghai Peoples Hosp 10, Sch Med, Shanghai 200072, Peoples R China
[7] China Mobile Grp Tianjin Co Ltd, Tianjin 300130, Peoples R China
Keywords
Breast cancer; Machine learning; Artificial intelligence; Disease recurrence; Prediction model; PLASMA-FIBRINOGEN LEVEL; ARTIFICIAL-INTELLIGENCE; HEALTH-CARE; FOLLOW-UP; SURVIVAL; OVARIAN; CA125; CLASSIFICATION; PROGNOSIS; INDICATOR
DOI
10.1186/s12911-023-02377-z
Chinese Library Classification number
R-058
Abstract
Breast cancer is the most common malignancy diagnosed in women worldwide. Its prevalence and incidence are increasing every year; early diagnosis combined with timely detection of relapse is therefore an important strategy for improving prognosis. This study aimed to compare different machine learning (ML) algorithms and select the best model for predicting breast cancer recurrence. Prediction models were developed with eleven ML algorithms: logistic regression (LR), random forest (RF), support vector classification (SVC), extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT), decision tree, multilayer perceptron (MLP), linear discriminant analysis (LDA), adaptive boosting (AdaBoost), Gaussian naive Bayes (GaussianNB), and light gradient boosting machine (LightGBM). The area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score were used to evaluate the performance of each prognostic model. The optimal ML algorithm was selected on the basis of performance, and feature importance was ranked by Shapley Additive Explanation (SHAP) values. The AdaBoost algorithm showed the best performance of the eleven algorithms in predicting breast cancer recurrence and was adopted to establish the prediction model. CA125, CEA, Fbg, and tumor diameter were found to be the most important features in our dataset for predicting breast cancer recurrence. More importantly, our study is the first to use the SHAP method to make an AdaBoost-based breast cancer recurrence prediction model more interpretable for clinicians. The AdaBoost-based model offers clinical decision support and successfully identifies the recurrence of breast cancer.
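The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `binary_metrics` and `auc`, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the numbers are not study data. Label 1 is assumed to mean recurrence.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics; label 1 = recurrence, 0 = no recurrence (assumed)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num: float, den: float) -> float:
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value (precision)
        "npv": ratio(tn, tn + fn),          # negative predictive value
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up labels and predicted scores (NOT data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Computing the same dictionary for each candidate model on a held-out set is one way to compare algorithms side by side; note that AUC depends only on the ranking of the scores, whereas the other six metrics depend on the chosen threshold.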
Pages: 14