Machine learning-based classifiers to predict metastasis in colorectal cancer patients

Cited by: 8
Authors
Talebi, Raheleh [1, 2]
Celis-Morales, Carlos A. [3, 4]
Akbari, Abolfazl [5]
Talebi, Atefeh [5, 6]
Borumandnia, Nasrin [7]
Pourhoseingholi, Mohamad Amin [8]
Affiliations
[1] Univ Appl Sci & Technol, Dept Pure Math, Unit 10, Tehran, Iran
[2] Univ Appl Sci & Technol, Math Architecture & Comp Engn Dept, Unit 10, Tehran, Iran
[3] Univ Glasgow, Sch Cardiovasc & Metab Hlth, Glasgow, Scotland
[4] Univ Catolica Maule, Human Performance Lab, Educ Phys Act & Hlth Res Unit, Talca, Chile
[5] Iran Univ Med Sci, Colorectal Res Ctr, Tehran, Iran
[6] Univ Glasgow, British Heart Fdn, Cardiovasc Res Ctr, Glasgow, Scotland
[7] Shahid Beheshti Univ Med Sci, Urol & Nephrol Res Ctr, Tehran, Iran
[8] Shahid Beheshti Univ Med Sci, Res Inst Gastroenterol & Liver Dis, Gastroenterol & Liver Dis Res Ctr, Tehran, Iran
Source
FRONTIERS IN ARTIFICIAL INTELLIGENCE | 2024 / Volume 7
Keywords
colorectal cancer; machine learning; metastasis; model performance and validation; balance data; MODEL;
DOI
10.3389/frai.2024.1285037
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104; 0812; 0835; 1405;
Abstract
Background: The increasing prevalence of colorectal cancer (CRC) in Iran over the past three decades has made it a key public health burden. This study aimed to predict metastasis in CRC patients from demographic and clinical factors using machine learning (ML) approaches.

Methods: This study included 1,127 CRC patients who were treated at Taleghani Hospital, a tertiary care facility. The patients were divided into training and test datasets in an 80:20 ratio. Several ML methods, including naive Bayes (NB), random forest (RF), support vector machine (SVM), neural network (NN), decision tree (DT), and logistic regression (LR), were used to predict metastasis. Model performance was evaluated using 5-fold cross-validation, reporting sensitivity, specificity, the area under the curve (AUC), and other indices.

Results: Among the 1,127 patients, 183 (16%) had experienced metastasis. In predicting metastasis, the NN and RF algorithms had the highest AUC and SVM ranked third, in both the original and balanced datasets. On the balanced dataset, NN and RF achieved the highest AUC (100%), sensitivity (100% each), and accuracy (99.2% and 99.3%, respectively), followed by SVM with an AUC of 98.8%, a sensitivity of 97.5%, and an accuracy of 97%. These two methods also showed lower false negative rates (FNR) and false positive rates (FPR) and higher negative predictive values (NPV). All methods performed well on the test datasets, and balancing the dataset improved the performance of most ML methods. The most important variables for predicting metastasis were tumor stage, the number of involved lymph nodes, and treatment type. In a separate analysis of patients with tumor stages I-III, tumor grade, tumor size, and tumor stage were identified as the most important features.

Conclusion: This study indicated that NN and RF were the best of the ML-based approaches for predicting metastasis in CRC patients. Tumor stage and the number of involved lymph nodes were considered the most important features.
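The evaluation indices named in the abstract (sensitivity, specificity, FNR, FPR, NPV, accuracy) all derive from the four cells of a binary confusion matrix. The sketch below shows one way to compute them in plain Python. The class and function names are hypothetical, and the example counts are invented for illustration only; they are not results from the study.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class = metastasis)."""
    tp: int  # metastatic patients predicted as metastatic
    fp: int  # non-metastatic patients predicted as metastatic
    tn: int  # non-metastatic patients predicted as non-metastatic
    fn: int  # metastatic patients predicted as non-metastatic


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else float("nan")


def binary_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the standard indices from confusion-matrix counts."""
    positives = c.tp + c.fn  # all truly metastatic cases
    negatives = c.tn + c.fp  # all truly non-metastatic cases
    return {
        "sensitivity": _ratio(c.tp, positives),      # true positive rate
        "specificity": _ratio(c.tn, negatives),      # true negative rate
        "fnr": _ratio(c.fn, positives),              # 1 - sensitivity
        "fpr": _ratio(c.fp, negatives),              # 1 - specificity
        "npv": _ratio(c.tn, c.tn + c.fn),            # negative predictive value
        "accuracy": _ratio(c.tp + c.tn, positives + negatives),
    }


if __name__ == "__main__":
    # Hypothetical counts for a held-out test set (NOT taken from the paper).
    example = ConfusionCounts(tp=30, fp=10, tn=180, fn=6)
    for name, value in binary_metrics(example).items():
        print(f"{name}: {value:.3f}")
```

With a minority positive class (the abstract reports 183 of 1,127 patients with metastasis), accuracy alone can look high even when sensitivity is poor, which is one reason to report FNR and NPV alongside it.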
Pages: 9