DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery

Cited by: 15
Authors
Lan, Wei [1]
Liao, Haibo [2]
Chen, Qingfeng [3]
Zhu, Lingzhi [4]
Pan, Yi [5]
Chen, Yi-Ping Phoebe [6]
Affiliations
[1] Guangxi Univ, Sch Comp Elect & Informat, Nanning, Peoples R China
[2] Guangxi Univ, Comp Technol, Nanning, Peoples R China
[3] Guangxi Univ, State Key Lab Conservat & Utilizat Subtrop Agrobio, Nanning, Peoples R China
[4] Hunan Inst Technol, Sch Comp & Informat Sci, Hengyang 421002, Peoples R China
[5] Chinese Acad Sci, Shenzhen Inst Adv Technol, Sch Comp Sci & Control Engn, Shenzhen, Peoples R China
[6] La Trobe Univ, Dept Comp Sci & Informat Technol, Bundoora, Vic, Australia
Funding
National Natural Science Foundation of China;
Keywords
cancer recurrence prediction; interpretability of deep learning; self-attention mechanism; multi-omics data integration; HEPATOCELLULAR-CARCINOMA; BLADDER-CANCER; SIGNALING PATHWAY; PROLIFERATION; ACTIVATION; SURVIVAL;
DOI
10.1093/bib/bbae185
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.
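The two architectural ideas named in the abstract can be illustrated with a minimal NumPy sketch. It is not the DeepKEGG implementation (see the repository linked above for that). It assumes a hypothetical binary gene-to-pathway membership matrix `mask`, random weights in place of trained ones, and plain scaled dot-product attention across samples without learned query/key/value projections. The function names `pathway_masked_layer` and `sample_self_attention` are invented for this illustration.

```python
import numpy as np

def pathway_masked_layer(x, weights, mask, bias):
    """Locally connected layer: gene j contributes to pathway k only if mask[j, k] == 1.

    x: (n_samples, n_genes), weights and mask: (n_genes, n_pathways), bias: (n_pathways,)
    """
    # Zero out weights for gene/pathway pairs with no known membership, then apply ReLU.
    return np.maximum(0.0, x @ (weights * mask) + bias)

def sample_self_attention(h):
    """Scaled dot-product self-attention over the sample axis of h (n_samples, n_pathways).

    Simplification (assumption): queries, keys and values are all h itself.
    """
    scores = h @ h.T / np.sqrt(h.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)        # each row sums to 1
    return attn @ h, attn

# Toy data: 4 samples, 6 genes, 3 pathways (all values hypothetical).
rng = np.random.default_rng(0)
n_samples, n_genes, n_pathways = 4, 6, 3
mask = np.array([[1, 0, 0],
                 [1, 0, 0],
                 [0, 1, 0],
                 [0, 1, 0],
                 [0, 0, 1],
                 [0, 0, 1]], dtype=float)
x = rng.normal(size=(n_samples, n_genes))
w = rng.normal(size=(n_genes, n_pathways))
b = np.zeros(n_pathways)

h = pathway_masked_layer(x, w, mask, b)   # pathway-level features per sample
z, attn = sample_self_attention(h)        # features mixed across samples
```

Because each pathway unit only sees its member genes, a pathway activation can be traced back to a specific gene set, which is what makes attribution scores on such a layer biologically readable.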
Pages: 16
Related papers
50 in total
  • [31] Cancer subtyping with heterogeneous multi-omics data via hierarchical multi-kernel learning
    Wei, Yifang
    Li, Lingmei
    Zhao, Xin
    Yang, Haitao
    Sa, Jian
    Cao, Hongyan
    Cui, Yuehua
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (01)
  • [32] A deep learning approach based on multi-omics data integration to construct a risk stratification prediction model for skin cutaneous melanoma
    Weijia Li
    Qiao Huang
    Yi Peng
    Suyue Pan
    Min Hu
    Pu Wang
    Yuqing He
    Journal of Cancer Research and Clinical Oncology, 2023, 149 : 15923 - 15938
  • [33] A deep learning approach based on multi-omics data integration to construct a risk stratification prediction model for skin cutaneous melanoma
    Li, Weijia
    Huang, Qiao
    Peng, Yi
    Pan, Suyue
    Hu, Min
    Wang, Pu
    He, Yuqing
    JOURNAL OF CANCER RESEARCH AND CLINICAL ONCOLOGY, 2023, 149 (17) : 15923 - 15938
  • [34] Integration of multi-omics and clinical treatment data reveals bladder cancer therapeutic vulnerability gene combinations and prognostic risks
    Xu, Yan
    Sun, Xiaoyu
    Liu, Guangxu
    Li, Hongze
    Yu, Meng
    Zhu, Yuyan
    FRONTIERS IN IMMUNOLOGY, 2024, 14
  • [35] Multi-omics data integration in the Cloud: Analysis of Statistically Significant Associations Between Clinical and Molecular Features in Breast Cancer
    Abdilleh, Kawther
    Aguilar, Boris
    Thomson, J. Ross
    ACM-BCB 2020 - 11TH ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS, 2020
  • [36] Integrative analysis of multi-omics data for discovery of ferroptosis-related gene signature predicting immune activity in neuroblastoma
    Hu, Jiajian
    Song, Fengju
    Kang, Wenjuan
    Xia, Fantong
    Song, Zi'an
    Wang, Yangyang
    Li, Jie
    Zhao, Qiang
    FRONTIERS IN PHARMACOLOGY, 2023, 14
  • [37] A Multi-Omics Approach to Liver Diseases: Integration of Single Nuclei Transcriptomics with Proteomics and HiCap Bulk Data in Human Liver
    Cavalli, Marco
    Diamanti, Klev
    Pan, Gang
    Spalinskas, Rapolas
    Kumar, Chanchal
    Deshmukh, Atul Shahaji
    Mann, Matthias
    Sahlen, Pelin
    Komorowski, Jan
    Wadelius, Claes
    OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY, 2020, 24 (04) : 180 - 194
  • [38] Amogel: a multi-omics classification framework using associative graph neural networks with prior knowledge for biomarker identification
    Tan, Chia Yan
    Ong, Huey Fang
    Lim, Chern Hong
    Tan, Mei Sze
    Ooi, Ean Hin
    Wong, Koksheik
    BMC BIOINFORMATICS, 2025, 26 (01)
  • [39] MSPL: Multimodal Self-Paced Learning for Multi-Omics Feature Selection and Data Integration
    Yang, Zi-Yi
    Xia, Liang-Yong
    Zhang, Hui
    Liang, Yong
    IEEE ACCESS, 2019, 7 : 170513 - 170524
  • [40] Prior knowledge-guided multilevel graph neural network for tumor risk prediction and interpretation via multi-omics data integration
    Yan, Hongxi
    Weng, Dawei
    Li, Dongguo
    Gu, Yu
    Ma, Wenji
    Liu, Qingjie
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (03)