Machine Learning-Based Prognostic Prediction Models of Non-Metastatic Colon Cancer: Analyses Based on Surveillance, Epidemiology and End Results Database and a Chinese Cohort

Cited by: 4
Authors
Tang, Mo [1]
Gao, Lihao [2]
He, Bin [1]
Yang, Yufei [1]
Affiliations
[1] China Acad Chinese Med Sci, Oncol Dept, Xiyuan Hosp, Beijing, Peoples R China
[2] Baidu Inc, Smart City Business Unit, 51 Dezhen Rd, Beijing 100091, Peoples R China
Source
CANCER MANAGEMENT AND RESEARCH | 2022 / Vol. 14
Keywords
colon cancer; machine learning; extreme gradient boosting; prognostic; ARTIFICIAL-INTELLIGENCE; COLORECTAL-CANCER; SURVIVAL; REGRESSION; CLASSIFICATION; OUTCOMES; TOOL
DOI
10.2147/CMAR.S340739
Chinese Library Classification
R73 [Oncology]
Discipline classification code
100214
Abstract
Purpose: The present study aimed to develop machine learning (ML)-based prognostic prediction models for non-metastatic colon cancer (CRC) that provide a precise quantitative risk assessment and can assist in developing treatment strategies. It also investigated whether nonlinear methods improve prediction accuracy compared with linear methods.

Patients and Methods: Cancer-specific survival (CSS) models built with logistic regression, extreme gradient boosting (XGBoost), and random forest algorithms were trained on Surveillance, Epidemiology, and End Results data for 15,254 patients with non-metastatic CRC (split into training [70%] and internal validation [30%] datasets) and externally validated on an outpatient cohort of 311 cases from Xiyuan Hospital in China. The Chinese cohort was also used to develop recurrence and metastasis (R&M) models for CRC patients. The experiments for each model were repeated 100 times to obtain average scores and 95% confidence intervals. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC).

Results: The XGBoost approach showed the highest AUC values of 0.86 (0.84-0.88), 0.82 (0.81-0.83), and 0.81 (0.79-0.82) for the one-, three-, and five-year CSS cohorts, respectively, along with a relatively high generalization ability. The XGBoost approach also performed best for the R&M model, with AUC values of 0.71 (0.64-0.79), 0.79 (0.74-0.86), and 0.89 (0.82-0.95) for the one-, three-, and five-year R&M cohorts, respectively. The rankings of predictor importance differed between the CSS and R&M models, and higher model accuracy was associated with a larger number of prognostic predictors.

Conclusion: Three ML algorithms for developing prognostic prediction models for non-metastatic CRC were compared. The nonlinear XGBoost approach performed best, suggesting that it can be used to quantify prognostic risk. Model performance also improved when more prognostic predictors were considered.
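
The evaluation protocol described in the abstract (a random 70%/30% split repeated 100 times, with validation AUC summarised as a mean and a 95% interval) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the abstract does not say how the intervals were computed, so a percentile interval over the runs is assumed here. The data are synthetic, and `fit_mean_difference` is a hypothetical stand-in scorer used only to keep the sketch dependency-free. It is not logistic regression, random forest, or XGBoost.

```python
import random
import statistics


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


def repeated_split_auc(rows, labels, fit, n_runs=100, train_frac=0.7, seed=0):
    """Repeat a random train/validation split and summarise validation AUC.

    `fit(train_rows, train_labels)` must return a function row -> risk score.
    Returns (mean AUC, (lower, upper)), where the bounds are approximate
    2.5th and 97.5th percentiles of the per-run AUCs (an assumption).
    """
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    aucs = []
    for _ in range(n_runs):
        rng.shuffle(idx)
        cut = int(train_frac * len(idx))
        train, valid = idx[:cut], idx[cut:]
        score = fit([rows[i] for i in train], [labels[i] for i in train])
        aucs.append(auc([labels[i] for i in valid],
                        [score(rows[i]) for i in valid]))
    aucs.sort()
    lower = aucs[int(0.025 * (n_runs - 1))]
    upper = aucs[int(0.975 * (n_runs - 1))]
    return statistics.mean(aucs), (lower, upper)


def fit_mean_difference(train_rows, train_labels):
    """Hypothetical stand-in model: project onto (mean_pos - mean_neg)."""
    n_features = len(train_rows[0])

    def class_mean(target):
        members = [r for r, y in zip(train_rows, train_labels) if y == target]
        return [sum(r[j] for r in members) / len(members)
                for j in range(n_features)]

    pos_mean, neg_mean = class_mean(1), class_mean(0)
    weights = [p - n for p, n in zip(pos_mean, neg_mean)]
    return lambda row: sum(w * x for w, x in zip(weights, row))


# Synthetic data: feature 0 is shifted by the outcome, feature 1 is noise.
data_rng = random.Random(1)
labels = [i % 2 for i in range(200)]
rows = [[data_rng.gauss(float(y), 1.0), data_rng.gauss(0.0, 1.0)] for y in labels]

mean_auc, interval = repeated_split_auc(rows, labels, fit_mean_difference)
print(f"AUC {mean_auc:.2f} ({interval[0]:.2f}-{interval[1]:.2f})")
```

Any model that exposes a per-case risk score can be passed as `fit`, so the same loop could compare a linear and a nonlinear learner on identical splits.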
Pages: 25-35
Page count: 11