Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research

Cited by: 15
Authors
Yagin, Burak [1]
Yagin, Fatma Hilal [1]
Colak, Cemil [1]
Inceoglu, Feyza [2]
Kadry, Seifedine [3,4,5]
Kim, Jungeun [6]
Affiliations
[1] Inonu Univ, Fac Med, Dept Biostat & Med Informat, TR-44280 Malatya, Turkiye
[2] Malatya Turgut Ozal Univ, Fac Med, Dept Biostat, TR-44090 Malatya, Turkiye
[3] Noroff Univ Coll, Dept Appl Data Sci, N-4612 Kristiansand, Norway
[4] Ajman Univ, Artificial Intelligence Res Ctr AIRC, Ajman 346, U Arab Emirates
[5] Lebanese Amer Univ, Dept Elect & Comp Engn, Byblos 36, Lebanon
[6] Kongju Natl Univ, Dept Software, Cheonan 31080, South Korea
Keywords
breast cancer metastasis; machine learning algorithms; genomic biomarkers; eXplainable artificial intelligence; SHAP; EXPRESSION; ASSOCIATION; PROGNOSIS
DOI
10.3390/diagnostics13213314
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Subject classification code
1002; 100201
Abstract
Aim: This research presents a model that combines machine learning (ML) techniques with eXplainable artificial intelligence (XAI) to predict breast cancer (BC) metastasis and to identify important genomic biomarkers in patients with metastasis.
Method: A total of 98 primary BC samples were analyzed, including 34 samples from patients who developed distant metastases within a 5-year follow-up period and 44 samples from patients who remained disease-free for at least 5 years after diagnosis. Genomic data were subjected to biostatistical analysis, followed by elastic net feature selection, which identified a restricted number of genomic biomarkers associated with BC metastasis. Five boosting algorithms were used for prediction: light gradient boosting machine (LightGBM), categorical boosting (CatBoost), extreme gradient boosting (XGBoost), gradient boosting trees (GBT), and adaptive boosting (AdaBoost). Predictive performance was assessed with accuracy, F1 score, precision, recall, area under the ROC curve (AUC), and Brier score. To promote interpretability and address the "black box" problem of ML models, the SHapley Additive exPlanations (SHAP) method was employed.
Results: The LightGBM model outperformed the other models, with an accuracy of 96% and an AUC of 99.3%. In addition to the biostatistical evaluation, the XAI-based SHAP results showed that increased expression levels of TSPYL5, ATP5E, CA9, NUP210, SLC37A1, ARIH1, PSMD7, UBQLN1, PRAME, and UBE2T (p <= 0.05) were associated with an increased incidence of BC metastasis. Decreased expression levels of CACTIN, TGFB3, SCUBE2, ARL4D, OR1F1, ALDH4A1, PHF1, and CROCC (p <= 0.05) were likewise associated with an increased risk of metastasis in BC.
Conclusion: The findings of this study may help prevent disease progression and metastasis and could improve clinical outcomes by informing customized treatment approaches for BC patients.
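The workflow described in the Method section (elastic net feature selection, a boosting classifier, and the listed evaluation metrics) can be illustrated with a minimal sketch. It is not the authors' code. It assumes scikit-learn, uses synthetic data from `make_classification` in place of the expression data, uses an elastic net-penalized logistic regression as the selector, and stands in scikit-learn's `GradientBoostingClassifier` for the boosting models. All hyperparameters are illustrative assumptions, and the SHAP step is not covered.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, brier_score_loss, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for an expression matrix (NOT the study's data):
# 98 samples x 500 "genes", binary label (1 = metastasis, 0 = disease-free).
X, y = make_classification(n_samples=98, n_features=500, n_informative=15,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Assumed selector: elastic net-penalized logistic regression.
# Features whose coefficient is shrunk to (near) zero are dropped.
selector = SelectFromModel(
    LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                       C=0.1, max_iter=5000, random_state=0),
    threshold=1e-5)

# Scale, select features, then fit a gradient boosting classifier.
# Selection happens inside the pipeline, so it only sees training data.
model = make_pipeline(StandardScaler(), selector,
                      GradientBoostingClassifier(random_state=0))
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]  # predicted P(metastasis)
pred = model.predict(X_test)               # hard class labels

# The six evaluation metrics named in the abstract.
metrics = {
    "accuracy": accuracy_score(y_test, pred),
    "f1": f1_score(y_test, pred, zero_division=0),
    "precision": precision_score(y_test, pred, zero_division=0),
    "recall": recall_score(y_test, pred, zero_division=0),
    "auc": roc_auc_score(y_test, proba),
    "brier": brier_score_loss(y_test, proba),  # lower is better
}
n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())

print(f"features kept by elastic net: {n_selected} of {X.shape[1]}")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Keeping the selector inside the pipeline is a deliberate choice in this sketch: with far more features than samples, selecting features on the full dataset before splitting would leak test information into the reported metrics.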
Pages: 12