Explainable machine learning model identified potential biomarkers in liver cancer survival prediction

Cited by: 1
Authors
Pan, Qi [1]
Hounye, Alphonse Houssou [1]
Miao, Kexin [1]
Su, Liuyan [1]
Wang, Jiaoju [1]
Hou, Muzhou [1]
Xiong, Li [2, 3]
Affiliations
[1] Cent South Univ, Sch Math & Stat, Changsha 410083, Peoples R China
[2] Cent South Univ, Xiangya Hosp 2, Dept Gen Surg, Changsha 410011, Peoples R China
[3] Hunan Clin Res Ctr Intelligent Gen Surg, Changsha 410011, Peoples R China
Keywords
Random Forest; XGBoost; Support Vector Machine (SVM); SHAP; Immunogenic Cell Death (ICD); Prognostic model; CEP55
DOI
10.1016/j.bspc.2024.106504
Chinese Library Classification (CLC) number
R318 [Biomedical Engineering];
Discipline classification code
0831;
Abstract
Objective: Liver cancer is a malignant tumor with a high incidence; common treatments include surgical resection, ablation, arterial catheterization, and liver transplantation. Improving the clinical evaluation and therapy management of liver hepatocellular carcinoma (LIHC) is crucial, and when machine learning methods are incorporated into decision-making, the interpretability of the models must be considered. In this study, the SHapley Additive exPlanation (SHAP) technique was applied to interpret a gradient-boosting decision tree (XGBoost) survival model built on The Cancer Genome Atlas (TCGA) data, in order to identify potential biomarkers for liver cancer survival prediction.
Methods: Expression data and clinical information for liver cancer samples were obtained from the TCGA database, and Immunogenic Cell Death (ICD)-related genes were retrieved from the literature. Genes were screened using bioinformatics and machine learning methods. The screened differentially expressed genes (DEGs) and ICD-related genes were jointly used to construct the SurvMLSHAP model, and the SurvMLSHAP score was calculated. Three methods (Bayesian optimization, random search, and a genetic algorithm) were used for parameter optimization. Eight machine learning models were built to assess the proposed model against alternatives and to select the best model.
Results: The SurvMLSHAP model output was interpreted using the XGBoost-based SHAP method to assess the influence and significance of each feature. Tests on both synthetic and medical data support the ability of SurvMLSHAP to identify factors that have a time-dependent impact. The C-index was 0.6844 on the raw data and 0.8167 on the validation data. Furthermore, aggregating SurvMLSHAP yields a more accurate assessment of variable relevance for prediction than other existing approaches. The features contributing to the XGBoost model were, in order, CEP55, PPIA, TTC36, and HSP90AA1, which could be used as predictors to assess the LIHC cohort, while the putative molecular subgroups could provide new ideas for individualized treatment of LIHC.
Conclusion: In this study, a prognostic risk model called SurvMLSHAP was constructed using bioinformatics and machine learning methods, and ICD-related biomarkers were screened to assess the prognosis of LIHC patients, which can inform personalized treatment for clinical patients.
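The abstract reports model quality as a C-index (concordance index). As a minimal sketch of what that number measures, and not the authors' implementation, the following pure-Python function computes Harrell's C-index for right-censored survival data. The function name, the toy data, and the simplification of skipping pairs with tied survival times are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times       : observed follow-up times
    events      : 1 if the event (death) was observed, 0 if censored
    risk_scores : model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): skip tied times
        # 'a' is the subject with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # higher risk for the earlier event: concordant
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four patients, the third one censored.
toy_times = [5, 8, 12, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
print(toy_c)  # 4 of 5 comparable pairs are concordant, i.e. 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of patients by risk, which gives a scale for reading the reported values of 0.6844 and 0.8167.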
Pages: 17
Related papers
33 in total
  • [11] CEP55 Inhibitor: Extensive Computational Approach Defining a New Target of Cell Cycle Machinery Agent
    Lestari, Beni
    Utomo, Rohmad Yudi
    [J]. ADVANCED PHARMACEUTICAL BULLETIN, 2022, 12 (01) : 191 - 199
  • [12] GSCA: an integrated platform for gene set cancer analysis at genomic, pharmacogenomic and immunogenomic levels
    Liu, Chun-Jie
    Hu, Fei-Fei
    Xie, Gui-Yan
    Miao, Ya-Ru
    Li, Xin-Wen
    Zeng, Yan
    Guo, An-Yuan
    [J]. BRIEFINGS IN BIOINFORMATICS, 2023, 24 (01)
  • [13] Plasma HSP90AA1 Predicts the Risk of Breast Cancer Onset and Distant Metastasis
    Liu, Haizhou
    Zhang, Zihan
    Huang, Yi
    Wei, Wene
    Ning, Shufang
    Li, Jilin
    Liang, Xinqiang
    Liu, Kaisheng
    Zhang, Litu
    [J]. FRONTIERS IN CELL AND DEVELOPMENTAL BIOLOGY, 2021, 9
  • [14] SP1-induced upregulation of lncRNA CTBP1-AS2 accelerates the hepatocellular carcinoma tumorigenesis through targeting CEP55 via sponging miR-195-5p
    Liu, Li-xia
    Liu, Bin
    Yu, Jie
    Zhang, Dong-yun
    Shi, Jian-hong
    Liang, Ping
    [J]. BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS, 2020, 533 (04) : 779 - 785
  • [15] Lundberg SM, 2017, ADV NEUR IN, V30
  • [16] Mohapatra S., 2022, Sustainable Operations and Computers, V3, P296, DOI 10.1016/j.susoc.2022.06.001
  • [17] A comparative knowledge base development for cancerous cell detection based on deep learning and fuzzy computer vision approach
    Mohapatra, Subhasish
    Satpathy, Suneeta
    Mohanty, Sachi Nandan
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (17) : 24799 - 24814
  • [18] Global burden of primary liver cancer in 2020 and predictions to 2040
    Rumgay, Harriet
    Arnold, Melina
    Ferlay, Jacques
    Lesi, Olufunmilayo
    Cabasag, Citadel J.
    Vignat, Jerome
    Laversanne, Mathieu
    McGlynn, Katherine A.
    Soerjomataram, Isabelle
    [J]. JOURNAL OF HEPATOLOGY, 2022, 77 (06) : 1598 - 1606
  • [19] Advanced network pharmacology study reveals multi-pathway and multi-gene regulatory molecular mechanism of Bacopa monnieri in liver cancer based on data mining, molecular modeling, and microarray data analysis
    Sadaqat, Muhammad
    Qasim, Muhammad
    ul Qamar, Muhammad Tahir
    Masoud, Muhammad Shareef
    Ashfaq, Usman Ali
    Noor, Fatima
    Fatima, Kinza
    Allemailem, Khaled S.
    Alrumaihi, Faris
    Almatroudi, Ahmad
    [J]. COMPUTERS IN BIOLOGY AND MEDICINE, 2023, 161
  • [20] Emerging molecular markers of cancer
    Sidransky, D
    [J]. NATURE REVIEWS CANCER, 2002, 2 (03) : 210 - 219