DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery

Cited by: 15
Authors
Lan, Wei [1]
Liao, Haibo [2]
Chen, Qingfeng [3]
Zhu, Lingzhi [4]
Pan, Yi [5]
Chen, Yi-Ping Phoebe [6]
Affiliations
[1] Guangxi Univ, Sch Comp Elect & Informat, Nanning, Peoples R China
[2] Guangxi Univ, Comp Technol, Nanning, Peoples R China
[3] Guangxi Univ, State Key Lab Conservat & Utilizat Subtrop Agrobio, Nanning, Peoples R China
[4] Hunan Inst Technol, Sch Comp & Informat Sci, Hengyang 421002, Peoples R China
[5] Chinese Acad Sci, Shenzhen Inst Adv Technol, Sch Comp Sci & Control Engn, Shenzhen, Peoples R China
[6] La Trobe Univ, Dept Comp Sci & Informat Technol, Bundoora, Vic, Australia
Funding
National Natural Science Foundation of China;
Keywords
cancer recurrence prediction; interpretability of deep learning; self-attention mechanism; multi-omics data integration; HEPATOCELLULAR-CARCINOMA; BLADDER-CANCER; SIGNALING PATHWAY; PROLIFERATION; ACTIVATION; SURVIVAL;
DOI
10.1093/bib/bbae185
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification codes
071010; 081704;
Abstract
Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.
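The "local connections" idea in the abstract can be illustrated with a pathway-masked layer, in which a gene contributes to a pathway node only if it is annotated as a member of that pathway. The following is a minimal sketch under stated assumptions, not the authors' implementation: the gene names, pathway names, weights and the helper functions `build_mask` and `masked_layer` are all hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: hypothetical genes, pathways and weights.
# This is not code from the DeepKEGG repository.
from typing import Dict, List


def build_mask(genes: List[str], pathways: Dict[str, List[str]]) -> List[List[int]]:
    """Return a genes x pathways 0/1 matrix; 1 means the gene is in the pathway."""
    return [
        [1 if gene in members else 0 for members in pathways.values()]
        for gene in genes
    ]


def masked_layer(
    x: List[float], weights: List[List[float]], mask: List[List[int]]
) -> List[float]:
    """Compute one score per pathway, using only the connections the mask allows."""
    n_pathways = len(mask[0])
    return [
        sum(x[g] * weights[g][p] * mask[g][p] for g in range(len(x)))
        for p in range(n_pathways)
    ]


# Hypothetical membership table (in practice this would come from a pathway database).
genes = ["gene_a", "gene_b", "gene_c"]
pathways = {"pathway_1": ["gene_a", "gene_c"], "pathway_2": ["gene_b"]}

mask = build_mask(genes, pathways)
weights = [[0.5, 0.9], [0.3, 2.0], [1.0, 0.7]]  # dense weights; the mask zeroes most
expression = [2.0, 1.0, 4.0]                    # one sample's gene-level features

scores = masked_layer(expression, weights, mask)
print(scores)  # one value per pathway
```

Because every surviving weight corresponds to a known gene-pathway relationship, each hidden node can be read as a named pathway, which is the kind of structure that makes attribution-based interpretation possible.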
Pages: 16
Related papers
50 in total
  • [21] PaCMAP-embedded convolutional neural network for multi-omics data integration
    Qattous, Hazem
    Azzeh, Mohammad
    Ibrahim, Rahmeh
    Al-Ghafer, Ibrahim Abed
    Al Sorkhy, Mohammad
    Alkhateeb, Abedalrhman
    HELIYON, 2024, 10 (01)
  • [22] Multi-omics differential gene regulatory network inference for lung adenocarcinoma tumor progression biomarker discovery
    Tong, Yi-Fan
    He, Qi-En
    Zhu, Jun-Xuan
    Ding, En-Ci
    Song, Kai
    AICHE JOURNAL, 2022, 68 (04)
  • [23] Inferring Dysregulated Pathways of Driving Cancer Subtypes Through Multi-omics Integration
    Shi, Kai
    Gao, Lin
    Wang, Bingbo
    BIOINFORMATICS RESEARCH AND APPLICATIONS, ISBRA 2018, 2018, 10847 : 101 - 112
  • [24] Lactylation Modification as a Promoter of Bladder Cancer: Insights from Multi-Omics Analysis
    He, Yipeng
    Xiang, Lingyan
    Yuan, Jingping
    Yan, Honglin
    CURRENT ISSUES IN MOLECULAR BIOLOGY, 2024, 46 (11) : 12866 - 12885
  • [25] CAncer bioMarker Prediction Pipeline (CAMPP)-A standardized framework for the analysis of quantitative biological data
    Terkelsen, Thilde
    Krogh, Anders
    Papaleo, Elena
    PLOS COMPUTATIONAL BIOLOGY, 2020, 16 (03)
  • [26] Molecular Subtyping of Serous Ovarian Cancer Based on Multi-omics Data
    Zhang, Zhe
    Huang, Ke
    Gu, Chenglei
    Zhao, Luyang
    Wang, Nan
    Wang, Xiaolei
    Zhao, Dongsheng
    Zhang, Chenggang
    Lu, Yiming
    Meng, Yuanguang
    SCIENTIFIC REPORTS, 2016, 6
  • [27] A multi-omics machine learning framework in predicting the survival of colorectal cancer patients
    Yang, Min
    Yang, Huandong
    Ji, Lei
    Hu, Xuan
    Tian, Geng
    Wang, Bing
    Yang, Jialiang
    COMPUTERS IN BIOLOGY AND MEDICINE, 2022, 146
  • [28] Survival stratification for colorectal cancer via multi-omics integration using an autoencoder-based model
    Song, Hu
    Ruan, Chengwei
    Xu, Yixin
    Xu, Teng
    Fan, Ruizhi
    Jiang, Tao
    Cao, Meng
    Song, Jun
    EXPERIMENTAL BIOLOGY AND MEDICINE, 2022, 247 (11) : 898 - 909
  • [29] Multi-omics Data Integration Model Based on UMAP Embedding and Convolutional Neural Network
    ElKarami, Bashier
    Alkhateeb, Abedalrhman
    Qattous, Hazem
    Alshomali, Lujain
    Shahrrava, Behnam
    CANCER INFORMATICS, 2022, 21