Machine learning-based models for the prediction of breast cancer recurrence risk

Cited by: 25
Authors
Zuo, Duo [1, 2, 3, 4, 5]
Yang, Lexin [1, 2, 3, 4, 5]
Jin, Yu [1, 6]
Qi, Huan [7]
Liu, Yahui [1, 2, 3, 4, 5]
Ren, Li [1, 2, 3, 4, 5]
Affiliations
[1] Tianjin Med Univ, Dept Clin Lab, Canc Inst & Hosp, Tianjin 300060, Peoples R China
[2] Natl Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[3] Tianjins Clin Res Ctr Canc, Tianjin 300060, Peoples R China
[4] Key Lab Canc Prevent & Therapy, Tianjin 300060, Peoples R China
[5] Tianjin Med Univ, Key Lab Breast Canc Prevent & Therapy, Minist Educ, Tianjin 300060, Peoples R China
[6] Tongji Univ, Canc Ctr, Shanghai Peoples Hosp 10, Sch Med, Shanghai 200072, Peoples R China
[7] China Mobile Grp Tianjin Co Ltd, Tianjin 300130, Peoples R China
Keywords
Breast cancer; Machine learning; Artificial intelligence; Disease recurrence; Prediction model; PLASMA-FIBRINOGEN LEVEL; ARTIFICIAL-INTELLIGENCE; HEALTH-CARE; FOLLOW-UP; SURVIVAL; OVARIAN; CA125; CLASSIFICATION; PROGNOSIS; INDICATOR
DOI
10.1186/s12911-023-02377-z
Chinese Library Classification (CLC) number
R-058
Abstract
Breast cancer is the most common malignancy diagnosed in women worldwide. Its prevalence and incidence are increasing every year; early diagnosis combined with timely detection of relapse is therefore an important strategy for improving prognosis. This study aimed to compare different machine learning (ML) algorithms and select the best model for predicting breast cancer recurrence. Prediction models were developed using eleven ML algorithms: logistic regression (LR), random forest (RF), support vector classification (SVC), extreme gradient boosting (XGBoost), gradient boosting decision tree (GBDT), decision tree, multilayer perceptron (MLP), linear discriminant analysis (LDA), adaptive boosting (AdaBoost), Gaussian naive Bayes (GaussianNB), and light gradient boosting machine (LightGBM). The area under the curve (AUC), accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and F1 score were used to evaluate model performance. The best-performing algorithm was selected on these metrics, and feature importance was ranked by Shapley Additive Explanation (SHAP) values. AdaBoost showed the best prediction performance among the eleven algorithms and was adopted to establish the prediction model. CA125, CEA, fibrinogen (Fbg), and tumor diameter were the most important features in our dataset for predicting breast cancer recurrence. More importantly, our study is the first to use the SHAP method to make an AdaBoost-based breast cancer recurrence prediction model more interpretable for clinicians. The AdaBoost-based model offers clinical decision support and successfully identifies breast cancer recurrence.
Pages: 14
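The evaluation metrics named in the abstract (AUC, accuracy, sensitivity, specificity, PPV, NPV, and F1 score) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `confusion_counts`, `classification_metrics`, and `auc_from_scores` are hypothetical, the labels and scores are made up for illustration (1 = recurrence, 0 = no recurrence), and the AUC is computed with the standard pairwise-ranking definition rather than any particular library routine.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = recurrence, 0 = none)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:  # truth == 1 and pred == 0
            fn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(tp, fp, tn, fn):
    """Threshold-dependent metrics listed in the abstract."""
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),  # recall for the recurrence class
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),          # precision
        "npv": _ratio(tn, tn + fn),
        "f1": _ratio(2 * tp, 2 * tp + fp + fn),
    }


def auc_from_scores(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        return math.nan
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Illustrative data only; not taken from the study.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1, 0.5]
    print(classification_metrics(*confusion_counts(y_true, y_pred)))
    print(auc_from_scores(y_true, scores))
```

Sensitivity, specificity, PPV, NPV, and F1 depend on the threshold used to turn scores into predicted labels, whereas AUC is computed from the scores directly, which is one reason studies of this kind tend to report both.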