Machine learning-based models for the prediction of breast cancer recurrence risk

Cited by: 25
Authors
Zuo, Duo [1 ,2 ,3 ,4 ,5 ]
Yang, Lexin [1 ,2 ,3 ,4 ,5 ]
Jin, Yu [1 ,6 ]
Qi, Huan [7 ]
Liu, Yahui [1 ,2 ,3 ,4 ,5 ]
Ren, Li [1 ,2 ,3 ,4 ,5 ]
Affiliations
[1] Tianjin Med Univ, Dept Clin Lab, Canc Inst & Hosp, Tianjin 300060, Peoples R China
[2] Natl Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[3] Tianjins Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[4] Key Lab Canc Prevent & Therapy, Tianjin 300060, Peoples R China
[5] Tianjin Med Univ, Key Lab Breast Canc Prevent & Therapy, Minist Educ, Tianjin 300060, Peoples R China
[6] Tongji Univ, Canc Ctr, Shanghai Peoples Hosp 10, Sch Med, Shanghai 200072, Peoples R China
[7] China Mobile Grp Tianjin Co Ltd, Tianjin 300130, Peoples R China
Keywords
Breast cancer; Machine learning; Artificial intelligence; Disease recurrence; Prediction model; PLASMA-FIBRINOGEN LEVEL; ARTIFICIAL-INTELLIGENCE; HEALTH-CARE; FOLLOW-UP; SURVIVAL; OVARIAN; CA125; CLASSIFICATION; PROGNOSIS; INDICATOR;
DOI
10.1186/s12911-023-02377-z
Chinese Library Classification (CLC) number
R-058 [];
Subject classification code
Abstract
Breast cancer is the most common malignancy diagnosed in women worldwide. Its prevalence and incidence are increasing every year; therefore, early diagnosis together with timely detection of relapse is an important strategy for improving prognosis. This study aimed to compare different machine learning (ML) algorithms and select the best model for predicting breast cancer recurrence. The prediction model was developed using eleven ML algorithms: logistic regression (LR), random forest (RF), support vector classification (SVC), extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT), decision tree, multilayer perceptron (MLP), linear discriminant analysis (LDA), adaptive boosting (AdaBoost), Gaussian naive Bayes (GaussianNB), and light gradient boosting machine (LightGBM). The area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and F1 score were used to evaluate the performance of each model. Based on performance, the optimal ML model was selected, and feature importance was ranked by Shapley Additive Explanation (SHAP) values. Among the eleven algorithms, AdaBoost showed the best performance in predicting breast cancer recurrence and was adopted to establish the prediction model. Moreover, CA125, CEA, Fbg, and tumor diameter were found to be the most important features in our dataset for predicting breast cancer recurrence. More importantly, our study is the first to use the SHAP method to make an AdaBoost-based breast cancer recurrence prediction model more interpretable for clinicians. The AdaBoost-based model offers clinical decision support and successfully identifies breast cancer recurrence.
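The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `binary_metrics` and `roc_auc` and the class `BinaryMetrics` are hypothetical, labels are assumed to be coded 1 for recurrence and 0 for no recurrence, and AUC is assumed to mean the area under the ROC curve, computed here as the fraction of (recurrent, non-recurrent) patient pairs that the model ranks correctly.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    """Hypothetical container for the threshold-based metrics in the abstract."""
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    f1: float


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined (empty denominator).
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred):
    """Compute metrics from true labels and predicted labels (assumed 1 = recurrence)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        sensitivity=_ratio(tp, tp + fn),   # recall on recurrent cases
        specificity=_ratio(tn, tn + fp),   # recall on non-recurrent cases
        ppv=_ratio(tp, tp + fp),           # precision
        npv=_ratio(tn, tn + fn),
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
    )


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Calling `binary_metrics(y_true, y_pred)` on held-out predictions from each candidate classifier, together with `roc_auc(y_true, scores)` on its predicted probabilities, gives the per-model figures that a comparison like the one described would rank. The pairwise AUC is quadratic in the number of cases, which is acceptable for a small illustration but not for large cohorts.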
Pages: 14
Related papers
50 items in total
  • [31] Lung Cancer Risk Prediction with Machine Learning Models
    Dritsas, Elias
    Trigka, Maria
    BIG DATA AND COGNITIVE COMPUTING, 2022, 6 (04)
  • [32] Improved Machine Learning-Based Predictive Models for Breast Cancer Diagnosis
    Rasool, Abdur
    Bunterngchit, Chayut
    Tiejian, Luo
    Islam, Md Ruhul
    Qu, Qiang
    Jiang, Qingshan
    INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH, 2022, 19 (06)
  • [33] Machine Learning-Based Short-Term Mortality Prediction Models for Patients With Cancer Using Electronic Health Record Data: Systematic Review and Critical Appraisal
    Lu, Sheng-Chieh
    Xu, Cai
    Nguyen, Chandler H.
    Geng, Yimin
    Pfob, Andre
    Sidey-Gibbons, Chris
    JMIR MEDICAL INFORMATICS, 2022, 10 (03)
  • [34] Prediction of Long-Term Stroke Recurrence Using Machine Learning Models
    Abedi, Vida
    Avula, Venkatesh
    Chaudhary, Durgesh
    Shahjouei, Shima
    Khan, Ayesha
    Griessenauer, Christoph J.
    Li, Jiang
    Zand, Ramin
    JOURNAL OF CLINICAL MEDICINE, 2021, 10 (06) : 1 - 16
  • [35] Predicting the Recurrence of Ovarian Cancer Based on Machine Learning
    Zhou, Lining
    Hong, Hong
    Chu, Fuying
    Chen, Xiang
    Wang, Chenlu
    CANCER MANAGEMENT AND RESEARCH, 2024, 16 : 1375 - 1387
  • [36] Osteoporosis, fracture and survival: Application of machine learning in breast cancer prediction models
    Ji, Lichen
    Zhang, Wei
    Zhong, Xugang
    Zhao, Tingxiao
    Sun, Xixi
    Zhu, Senbo
    Tong, Yu
    Luo, Junchao
    Xu, Youjia
    Yang, Di
    Kang, Yao
    Wang, Jin
    Bi, Qing
    FRONTIERS IN ONCOLOGY, 2022, 12
  • [37] Pre-existing and machine learning-based models for cardiovascular risk prediction
    Cho, Sang-Yeong
    Kim, Sun-Hwa
    Kang, Si-Hyuck
    Lee, Kyong Joon
    Choi, Dongjun
    Kang, Seungjin
    Park, Sang Jun
    Kim, Tackeun
    Yoon, Chang-Hwan
    Youn, Tae-Jin
    Chae, In-Ho
    SCIENTIFIC REPORTS, 2021, 11 (01) : 8886
  • [38] Predictive value of machine learning for breast cancer recurrence: a systematic review and meta-analysis
    Lu, Dongmei
    Long, Xiaozhou
    Fu, Wenjie
    Liu, Bo
    Zhou, Xing
    Sun, Shaoqin
    JOURNAL OF CANCER RESEARCH AND CLINICAL ONCOLOGY, 2023, 149 (12) : 10659 - 10674
  • [39] Machine Learning Prediction of Early Recurrence in Gastric Cancer: A Nationwide Real-World Study
    Zhang, Xing-Qi
    Huang, Ze-Ning
    Wu, Ju
    Liu, Xiao-Dong
    Xie, Rong-Zhen
    Wu, Ying-Xin
    Zheng, Chang-Yue
    Zheng, Chao-Hui
    Li, Ping
    Xie, Jian-Wei
    Wang, Jia-Bin
    He, Qi-Chen
    Qiu, Wen-Wu
    Tang, Yi-Hui
    Zhang, Hao-Xiang
    Zhou, Yan-Bing
    Lin, Jian-Xian
    Huang, Chang-Ming
    ANNALS OF SURGICAL ONCOLOGY, 2025, 32 (04) : 2637 - 2650
  • [40] Machine learning-based construction site dynamic risk models
    Gondia, Ahmed
    Moussa, Ahmed
    Ezzeldin, Mohamed
    El-Dakhakhni, Wael
    TECHNOLOGICAL FORECASTING AND SOCIAL CHANGE, 2023, 189