DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery

Cited by: 14
Authors
Lan, Wei [1]
Liao, Haibo [2]
Chen, Qingfeng [3]
Zhu, Lingzhi [4]
Pan, Yi [5]
Chen, Yi-Ping Phoebe [6]
Affiliations
[1] Guangxi Univ, Sch Comp Elect & Informat, Nanning, Peoples R China
[2] Guangxi Univ, Comp Technol, Nanning, Peoples R China
[3] Guangxi Univ, State Key Lab Conservat & Utilizat Subtrop Agrobio, Nanning, Peoples R China
[4] Hunan Inst Technol, Sch Comp & Informat Sci, Hengyang 421002, Peoples R China
[5] Chinese Acad Sci, Shenzhen Inst Adv Technol, Sch Comp Sci & Control Engn, Shenzhen, Peoples R China
[6] La Trobe Univ, Dept Comp Sci & Informat Technol, Bundoora, Vic, Australia
Funding
National Natural Science Foundation of China
Keywords
cancer recurrence prediction; interpretability of deep learning; self-attention mechanism; multi-omics data integration; HEPATOCELLULAR-CARCINOMA; BLADDER-CANCER; SIGNALING PATHWAY; PROLIFERATION; ACTIVATION; SURVIVAL;
DOI
10.1093/bib/bbae185
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.
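The two architectural ideas named in the abstract, locally connected gene-to-pathway neurons and self-attention across samples, can be illustrated with a small sketch. This is not the authors' implementation (that is in the linked repository); it is a minimal pure-Python illustration under stated assumptions. The gene and pathway names, the `tanh` activation, the all-ones weights, and the function names `build_mask`, `masked_layer`, and `sample_self_attention` are all hypothetical, and the attention step omits the learned query/key/value projections a real model would likely use.

```python
import math

def build_mask(genes, pathways):
    """mask[i][j] is 1.0 if gene i is a member of pathway j, else 0.0."""
    return [[1.0 if g in members else 0.0 for members in pathways.values()]
            for g in genes]

def masked_layer(x, weights, mask):
    """Map one sample's gene values to pathway activations.

    Weights outside the membership mask are zeroed, so each pathway
    neuron only sees the genes annotated to it (a local connection).
    The tanh activation is an assumption for this sketch.
    """
    n_genes, n_paths = len(mask), len(mask[0])
    return [math.tanh(sum(x[i] * weights[i][j] * mask[i][j]
                          for i in range(n_genes)))
            for j in range(n_paths)]

def sample_self_attention(h):
    """Scaled dot-product attention where every sample attends to all samples.

    h is a list of pathway vectors, one per sample. Learned projections
    are omitted, so queries, keys, and values are all h itself.
    """
    d = len(h[0])
    out = []
    for q in h:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in h]
        top = max(scores)                      # subtract max for stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        attn = [e / total for e in exps]       # softmax over samples
        out.append([sum(attn[n] * h[n][j] for n in range(len(h)))
                    for j in range(d)])
    return out

# Toy, made-up annotation: three genes, two pathways.
genes = ["gene_a", "gene_b", "gene_c"]
pathways = {"pathway_1": {"gene_a", "gene_b"}, "pathway_2": {"gene_c"}}
mask = build_mask(genes, pathways)
weights = [[1.0, 1.0] for _ in genes]          # placeholder, untrained weights

samples = [[0.5, -0.5, 1.0], [1.0, 1.0, 0.0]]  # made-up expression values
pathway_features = [masked_layer(x, weights, mask) for x in samples]
attended = sample_self_attention(pathway_features)
```

Because the mask zeroes every connection that is not backed by an annotation, each pathway activation can be traced back to a known gene set, which is what makes attribution scores over such a layer biologically readable.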
Pages: 16