DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery

Cited by: 14
Authors
Lan, Wei [1]
Liao, Haibo [2]
Chen, Qingfeng [3]
Zhu, Lingzhi [4]
Pan, Yi [5]
Chen, Yi-Ping Phoebe [6]
Affiliations
[1] Guangxi Univ, Sch Comp Elect & Informat, Nanning, Peoples R China
[2] Guangxi Univ, Comp Technol, Nanning, Peoples R China
[3] Guangxi Univ, State Key Lab Conservat & Utilizat Subtrop Agrobio, Nanning, Peoples R China
[4] Hunan Inst Technol, Sch Comp & Informat Sci, Hengyang 421002, Peoples R China
[5] Chinese Acad Sci, Shenzhen Inst Adv Technol, Sch Comp Sci & Control Engn, Shenzhen, Peoples R China
[6] La Trobe Univ, Dept Comp Sci & Informat Technol, Bundoora, Vic, Australia
Funding
National Natural Science Foundation of China;
Keywords
cancer recurrence prediction; interpretability of deep learning; self-attention mechanism; multi-omics data integration; HEPATOCELLULAR-CARCINOMA; BLADDER-CANCER; SIGNALING PATHWAY; PROLIFERATION; ACTIVATION; SURVIVAL;
DOI
10.1093/bib/bbae185
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.
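The two architectural ideas named in the abstract, pathway-constrained local connections and self-attention across samples, can be illustrated with a minimal NumPy sketch. This is not the DeepKEGG implementation (see the authors' repository for that): the function names, the toy gene-to-pathway mask, the tanh activation, and the attention without learned query/key/value projections are all assumptions made for illustration.

```python
import numpy as np

def masked_pathway_layer(x, weights, mask):
    """Gene-to-pathway layer: a weight contributes only where mask == 1."""
    return np.tanh(x @ (weights * mask))

def sample_self_attention(h):
    """Scaled dot-product self-attention in which every sample attends to all samples.

    Simplification: queries, keys and values are the pathway features themselves
    (no learned projections).
    """
    d = h.shape[1]
    scores = h @ h.T / np.sqrt(d)                 # (n_samples, n_samples)
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability for softmax
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # each row sums to 1
    return attn @ h, attn

rng = np.random.default_rng(0)
n_samples, n_genes, n_pathways = 6, 5, 3
x = rng.normal(size=(n_samples, n_genes))         # toy expression matrix

# Hypothetical gene-to-pathway membership (rows: genes, columns: pathways).
mask = np.array([[1, 0, 0],
                 [1, 1, 0],
                 [0, 1, 0],
                 [0, 0, 1],
                 [0, 0, 1]], dtype=float)
weights = rng.normal(size=(n_genes, n_pathways))

pathway_features = masked_pathway_layer(x, weights, mask)
attended, attn = sample_self_attention(pathway_features)
```

Because the mask zeroes every weight outside a known gene-pathway relationship, each pathway node depends only on its member genes, which is what makes the node's activation attributable to a named pathway.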
Pages: 16
Related papers
50 in total
  • [21] Prediction of plant complex traits via integration of multi-omics data
    Wang, Peipei
    Lehti-Shiu, Melissa D.
    Lotreck, Serena
    Aba, Kenia Segura
    Krysan, Patrick J.
    Shiu, Shin-Han
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [22] COMO: a pipeline for multi-omics data integration in metabolic modeling and drug discovery
    Bessell, Brandt
    Loecker, Josh
    Zhao, Zhongyuan
    Aghamiri, Sara Sadat
    Mohanty, Sabyasachi
    Amin, Rada
    Helikar, Tomas
    Puniya, Bhanwar Lal
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (06)
  • [23] Improving prediction performance of colon cancer prognosis based on the integration of clinical and multi-omics data
    Tong, Danyang
    Tian, Yu
    Zhou, Tianshu
    Ye, Qiancheng
    Li, Jun
    Ding, Kefeng
    Li, Jingsong
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2020, 20 (01)
  • [24] Improving prediction performance of colon cancer prognosis based on the integration of clinical and multi-omics data
    Danyang Tong
    Yu Tian
    Tianshu Zhou
    Qiancheng Ye
    Jun Li
    Kefeng Ding
    Jingsong Li
    BMC Medical Informatics and Decision Making, 20
  • [25] Deep-Learning Algorithm and Concomitant Biomarker Identification for NSCLC Prediction Using Multi-Omics Data Integration
    Park, Min-Koo
    Lim, Jin-Muk
    Jeong, Jinwoo
    Jang, Yeongjae
    Lee, Ji-Won
    Lee, Jeong-Chan
    Kim, Hyungyu
    Koh, Euiyul
    Hwang, Sung-Joo
    Kim, Hong-Gee
    Kim, Keun-Cheol
    BIOMOLECULES, 2022, 12 (12)
  • [26] A cloud solution for multi-omics data integration
    Tordini, Fabio
    2016 INT IEEE CONFERENCES ON UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING AND COMMUNICATIONS, CLOUD AND BIG DATA COMPUTING, INTERNET OF PEOPLE, AND SMART WORLD CONGRESS (UIC/ATC/SCALCOM/CBDCOM/IOP/SMARTWORLD), 2016, : 559 - 566
  • [27] Towards multi-omics synthetic data integration
    Selvarajoo, Kumar
    Maurer-Stroh, Sebastian
    BRIEFINGS IN BIOINFORMATICS, 2024, 25 (03)
  • [28] Multi-omics approaches for biomarker discovery and precision diagnosis of prediabetes
    Song, Jielin
    Wang, Chuanfu
    Zhao, Tong
    Zhang, Yu
    Xing, Jixiang
    Zhao, Xuelian
    Zhang, Yunsha
    Zhang, Zhaohui
    FRONTIERS IN ENDOCRINOLOGY, 2025, 16
  • [29] Multi-omics profiling approaches to biomarker discovery in bipolar disorder
    Bahn, S.
    Alsaif, M.
    Rahmoune, H.
    EUROPEAN NEUROPSYCHOPHARMACOLOGY, 2012, 22 : S120 - S120
  • [30] Identification of Osteoporosis Biomarkers and Biological Interactions Using Multi-omics Data Integration
    Liu, Anqi
    Jiang, Lindong
    Su, Kuan-Jui
    Zhang, Xiao
    Gong, Yun
    Qiu, Chuan
    Luo, Zhe
    Tian, Qing
    Ding, Zhengming
    Shen, Hui
    Deng, Hong-Wen
    JOURNAL OF BONE AND MINERAL RESEARCH, 2023, 38 : 152 - 153