Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research

Cited by: 15
Authors
Yagin, Burak [1]
Yagin, Fatma Hilal [1]
Colak, Cemil [1]
Inceoglu, Feyza [2]
Kadry, Seifedine [3, 4, 5]
Kim, Jungeun [6]
Affiliations
[1] Inonu Univ, Fac Med, Dept Biostat & Med Informat, TR-44280 Malatya, Turkiye
[2] Malatya Turgut Ozal Univ, Fac Med, Dept Biostat, TR-44090 Malatya, Turkiye
[3] Noroff Univ Coll, Dept Appl Data Sci, N-4612 Kristiansand, Norway
[4] Ajman Univ, Artificial Intelligence Res Ctr AIRC, Ajman 346, U Arab Emirates
[5] Lebanese Amer Univ, Dept Elect & Comp Engn, Byblos 36, Lebanon
[6] Kongju Natl Univ, Dept Software, Cheonan 31080, South Korea
Keywords
breast cancer metastasis; machine learning algorithms; genomic biomarkers; eXplainable artificial intelligence; SHAP; EXPRESSION; ASSOCIATION; PROGNOSIS
DOI
10.3390/diagnostics13213314
Chinese Library Classification (CLC) number
R5 [Internal Medicine]
Discipline classification code
1002; 100201
Abstract
Aim: This research presents a model combining machine learning (ML) techniques and eXplainable artificial intelligence (XAI) to predict breast cancer (BC) metastasis and to reveal important genomic biomarkers in patients with metastasis.
Method: A total of 98 primary BC samples were analyzed, comprising 34 samples from patients who developed distant metastases within a 5-year follow-up period and 44 samples from patients who remained disease-free for at least 5 years after diagnosis. Genomic data were subjected to biostatistical analysis, followed by elastic net feature selection, which identified a restricted number of genomic biomarkers associated with BC metastasis. Light gradient boosting machine (LightGBM), categorical boosting (CatBoost), extreme gradient boosting (XGBoost), gradient boosting trees (GBT), and adaptive boosting (AdaBoost) algorithms were used for prediction. To assess the models' predictive abilities, accuracy, F1 score, precision, recall, area under the ROC curve (AUC), and Brier score were calculated as performance metrics. To promote interpretability and overcome the "black box" problem of ML models, the SHapley Additive exPlanations (SHAP) method was employed.
Results: The LightGBM model outperformed the other models, achieving an accuracy of 96% and an AUC of 99.3%. In addition to the biostatistical evaluation, the XAI-based SHAP results showed that increased expression levels of TSPYL5, ATP5E, CA9, NUP210, SLC37A1, ARIH1, PSMD7, UBQLN1, PRAME, and UBE2T (p <= 0.05) were associated with an increased incidence of BC metastasis. Decreased expression levels of the CACTIN, TGFB3, SCUBE2, ARL4D, OR1F1, ALDH4A1, PHF1, and CROCC genes (p <= 0.05) were also found to increase the risk of metastasis in BC.
Conclusion: The findings of this study may help prevent disease progression and metastasis and could improve clinical outcomes by informing customized treatment approaches for BC patients.
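
The abstract lists accuracy, precision, recall, F1 score, AUC, and Brier score as its evaluation metrics. The following is a minimal sketch of how these six quantities can be computed for a binary metastasis classifier, using only the Python standard library. It is not the authors' code: the function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and probabilities are assumptions made purely for illustration.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Dict[str, float]:
    """Compute accuracy, precision, recall, F1, AUC and Brier score.

    y_true holds binary labels (1 = metastasis, 0 = disease-free) and
    y_prob holds the predicted probability of the positive class.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("y_true and y_prob must be non-empty and equal in length")

    # Hard predictions obtained by thresholding the probabilities.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    n = len(y_true)
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # Brier score: mean squared difference between probability and outcome.
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / n

    # AUC as the probability that a random positive is scored above a
    # random negative, with ties counted as one half.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if pos and neg:
        wins = sum(
            1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg
        )
        auc = wins / (len(pos) * len(neg))
    else:
        auc = float("nan")  # undefined when only one class is present

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": auc,
        "brier": brier,
    }


# Toy data, invented for illustration only (not from the study).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
toy_metrics = classification_metrics(toy_labels, toy_probs)
```

Higher values are better for every metric except the Brier score, where lower values indicate better calibrated probabilities. The pairwise AUC computation is quadratic in the number of samples, which is acceptable for cohorts of this size but would normally be replaced by a rank-based formula on larger datasets.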
Pages: 12