Cancer Metastasis Prediction and Genomic Biomarker Identification through Machine Learning and eXplainable Artificial Intelligence in Breast Cancer Research

Cited by: 15
Authors
Yagin, Burak [1]
Yagin, Fatma Hilal [1]
Colak, Cemil [1]
Inceoglu, Feyza [2]
Kadry, Seifedine [3, 4, 5]
Kim, Jungeun [6]
Affiliations
[1] Inonu Univ, Fac Med, Dept Biostat & Med Informat, TR-44280 Malatya, Turkiye
[2] Malatya Turgut Ozal Univ, Fac Med, Dept Biostat, TR-44090 Malatya, Turkiye
[3] Noroff Univ Coll, Dept Appl Data Sci, N-4612 Kristiansand, Norway
[4] Ajman Univ, Artificial Intelligence Res Ctr AIRC, Ajman 346, U Arab Emirates
[5] Lebanese Amer Univ, Dept Elect & Comp Engn, Byblos 36, Lebanon
[6] Kongju Natl Univ, Dept Software, Cheonan 31080, South Korea
Keywords
breast cancer metastasis; machine learning algorithms; genomic biomarkers; eXplainable artificial intelligence; SHAP; EXPRESSION; ASSOCIATION; PROGNOSIS;
DOI
10.3390/diagnostics13213314
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002; 100201;
Abstract
Aim: This research presents a model combining machine learning (ML) techniques and eXplainable artificial intelligence (XAI) to predict breast cancer (BC) metastasis and reveal important genomic biomarkers in patients with metastasis.
Method: A total of 98 primary BC samples were analyzed, comprising 34 samples from patients who developed distant metastases within a 5-year follow-up period and 44 samples from patients who remained disease-free for at least 5 years after diagnosis. Genomic data were subjected to biostatistical analysis, followed by elastic net feature selection, which identified a restricted number of genomic biomarkers associated with BC metastasis. Light gradient boosting machine (LightGBM), categorical boosting (CatBoost), extreme gradient boosting (XGBoost), gradient boosting trees (GBT), and adaptive boosting (AdaBoost) algorithms were used for prediction. To assess the models' predictive abilities, accuracy, F1 score, precision, recall, area under the ROC curve (AUC), and Brier score were calculated as performance evaluation metrics. To promote interpretability and overcome the "black box" problem of ML models, the SHapley Additive exPlanations (SHAP) method was employed.
Results: The LightGBM model outperformed the other models, achieving an accuracy of 96% and an AUC of 99.3%. In addition to the biostatistical evaluation, the XAI-based SHAP results showed that increased expression levels of TSPYL5, ATP5E, CA9, NUP210, SLC37A1, ARIH1, PSMD7, UBQLN1, PRAME, and UBE2T (p <= 0.05) were associated with an increased incidence of BC metastasis. Decreased expression levels of CACTIN, TGFB3, SCUBE2, ARL4D, OR1F1, ALDH4A1, PHF1, and CROCC (p <= 0.05) were likewise associated with an increased risk of metastasis in BC.
Conclusion: The findings of this study may help prevent disease progression and metastasis and may improve clinical outcomes by guiding customized treatment approaches for BC patients.
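The performance metrics named in the abstract (accuracy, precision, recall, F1 score, AUC, and Brier score) can be illustrated with a minimal, dependency-free sketch. This is not the authors' code: the function name `classification_metrics`, the 0.5 decision threshold, and the toy labels and probabilities are assumptions made purely for illustration, and the AUC is computed with the pairwise (Mann-Whitney) formulation rather than any particular library routine.

```python
from typing import Dict, Sequence


def classification_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the abstract does not state one
) -> Dict[str, float]:
    """Compute binary classification metrics from labels (0/1) and
    predicted probabilities of the positive class (here: metastasis)."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("y_true and y_prob must be non-empty and equal length")

    # Hard predictions obtained by thresholding the probabilities.
    y_pred = [1 if p >= threshold else 0 for p in y_prob]

    # Confusion-matrix counts.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )

    # AUC as the probability that a randomly chosen positive sample
    # receives a higher score than a randomly chosen negative sample
    # (ties count as one half).
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg
    )
    auc = wins / (len(pos) * len(neg))

    # Brier score: mean squared error between probability and outcome
    # (lower is better; 0 is a perfect probabilistic forecast).
    brier = sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auc": auc,
        "brier": brier,
    }


# Hypothetical toy data, not taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

if __name__ == "__main__":
    for name, value in classification_metrics(toy_labels, toy_probs).items():
        print(f"{name}: {value:.4f}")
```

Unlike accuracy and F1, the AUC and Brier score depend on the predicted probabilities rather than on thresholded labels, which is presumably why the abstract reports both kinds of metric.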
Pages: 12