Explainable Machine Learning Models for Colorectal Cancer Prediction Using Clinical Laboratory Data

Cited by: 1
Authors
Li, Rui [1]
Hao, Xiaoyan [1]
Diao, Yanjun [1]
Yang, Liu [1]
Liu, Jiayun [1]
Affiliations
[1] Air Force Med Univ, Xijing Hosp, Dept Clin Lab Med, 127 Changle West Rd, Xian 710032, Peoples R China
Keywords
colorectal cancer; machine learning; clinical laboratory data; risk prediction; miR-92a
DOI
10.1177/10732748251336417
Chinese Library Classification (CLC) number
R73 [Oncology]
Discipline classification code
100214
Abstract
Introduction: Early diagnosis of colorectal cancer (CRC) poses a significant clinical challenge. This study aims to develop machine learning (ML) models for CRC risk prediction using clinical laboratory data.
Methods: This retrospective, single-center study analyzed laboratory examination data collected between 2013 and 2023 from healthy controls (HC), polyp patients (Polyp), and CRC patients. Five ML algorithms, namely adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), decision tree (DT), logistic regression (LR), and random forest (RF), were applied to three classification tasks: HC vs Polyp vs CRC, HC vs CRC, and Polyp vs CRC.
Results: The study included 31,539 subjects: 11,793 HCs, 10,125 polyp patients, and 9,621 CRC patients. The XGBoost model achieved the highest AUCs, 0.966 for differentiating HC from CRC and 0.881 for differentiating Polyp from CRC, outperforming both carcinoembryonic antigen (CEA) and fecal occult blood testing (FOBT). The model could also identify CEA-negative or FOBT-negative CRC patients. Incorporating stool miR-92a detection into the model further improved diagnostic performance. Shapley additive explanations (SHAP) plots indicated that FOBT, CEA, lymphocyte percentage (LYMPH%), and hematocrit (HCT) were the features contributing most to CRC diagnosis. Additionally, a computational tool for predicting CRC risk based on the optimal model was developed; it is intended for researchers with programming experience.
Conclusion: Five ML models for CRC diagnosis, based on ten routine laboratory test items, were developed and achieved higher diagnostic accuracy than traditional CRC biomarkers. The diagnostic capability of these ML models can be further enhanced by including stool miR-92a levels.
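The abstract compares a model against single biomarkers by AUC. As a minimal sketch of what that metric measures, the snippet below computes AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as half. The function name `roc_auc` and all data values are hypothetical illustrations, not the study's code or data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison (Mann-Whitney U formulation).

    Returns the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up toy data, NOT from the study: 1 = CRC, 0 = healthy control.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical multi-feature model output (predicted probability of CRC).
model_score = [0.9, 0.8, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]
# Hypothetical single-marker values (for example, a CEA-like measurement).
single_marker = [6.0, 2.0, 5.5, 1.0, 3.0, 2.0, 1.5, 0.5]

auc_model = roc_auc(labels, model_score)
auc_marker = roc_auc(labels, single_marker)
print(f"model AUC = {auc_model:.3f}, single-marker AUC = {auc_marker:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; on a cohort of tens of thousands of subjects a rank-based implementation from a standard library would be the more practical choice.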
Pages: 14