DeepKEGG: a multi-omics data integration framework with biological insights for cancer recurrence prediction and biomarker discovery

Cited by: 14
Authors
Lan, Wei [1]
Liao, Haibo [2]
Chen, Qingfeng [3]
Zhu, Lingzhi [4]
Pan, Yi [5]
Chen, Yi-Ping Phoebe [6]
Affiliations
[1] Guangxi Univ, Sch Comp Elect & Informat, Nanning, Peoples R China
[2] Guangxi Univ, Comp Technol, Nanning, Peoples R China
[3] Guangxi Univ, State Key Lab Conservat & Utilizat Subtrop Agrobio, Nanning, Peoples R China
[4] Hunan Inst Technol, Sch Comp & Informat Sci, Hengyang 421002, Peoples R China
[5] Chinese Acad Sci, Shenzhen Inst Adv Technol, Sch Comp Sci & Control Engn, Shenzhen, Peoples R China
[6] La Trobe Univ, Dept Comp Sci & Informat Technol, Bundoora, Vic, Australia
Funding
National Natural Science Foundation of China
Keywords
cancer recurrence prediction; interpretability of deep learning; self-attention mechanism; multi-omics data integration; HEPATOCELLULAR-CARCINOMA; BLADDER-CANCER; SIGNALING PATHWAY; PROLIFERATION; ACTIVATION; SURVIVAL;
DOI
10.1093/bib/bbae185
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Deep learning-based multi-omics data integration methods have the capability to reveal the mechanisms of cancer development, discover cancer biomarkers and identify pathogenic targets. However, current methods ignore the potential correlations between samples in integrating multi-omics data. In addition, providing accurate biological explanations still poses significant challenges due to the complexity of deep learning models. Therefore, there is an urgent need for a deep learning-based multi-omics integration method to explore the potential correlations between samples and provide model interpretability. Herein, we propose a novel interpretable multi-omics data integration method (DeepKEGG) for cancer recurrence prediction and biomarker discovery. In DeepKEGG, a biological hierarchical module is designed for local connections of neuron nodes and model interpretability based on the biological relationship between genes/miRNAs and pathways. In addition, a pathway self-attention module is constructed to explore the correlation between different samples and generate the potential pathway feature representation for enhancing the prediction performance of the model. Lastly, an attribution-based feature importance calculation method is utilized to discover biomarkers related to cancer recurrence and provide a biological interpretation of the model. Experimental results demonstrate that DeepKEGG outperforms other state-of-the-art methods in 5-fold cross validation. Furthermore, case studies also indicate that DeepKEGG serves as an effective tool for biomarker discovery. The code is available at https://github.com/lanbiolab/DeepKEGG.
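The two architectural ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes, for illustration only, that the "local connections" are realized as a linear layer masked by a binary gene-to-pathway membership matrix, and that the pathway self-attention is plain scaled dot-product attention in which every sample attends to all samples, with no learned projections. The function names, the toy data, and the weights are all hypothetical.

```python
import math

def masked_linear(x, weights, mask):
    """Map gene-level features to pathway-level features.

    A weight contributes only where mask[g][p] == 1, i.e. where gene g
    is annotated to pathway p (assumed stand-in for 'local connections').
    """
    n_genes, n_pathways = len(mask), len(mask[0])
    return [
        [sum(row[g] * weights[g][p] * mask[g][p] for g in range(n_genes))
         for p in range(n_pathways)]
        for row in x
    ]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sample_self_attention(h):
    """Scaled dot-product self-attention across samples.

    Queries, keys, and values are all the pathway features h themselves
    (a simplification: no learned projection matrices).
    """
    d = len(h[0])
    out = []
    for q in h:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in h]
        w = softmax(scores)
        out.append([sum(w[j] * h[j][c] for j in range(len(h)))
                    for c in range(d)])
    return out

# Toy data (hypothetical): 3 samples, 4 genes, 2 pathways.
x = [[1.0, 0.0, 2.0, 1.0],
     [0.5, 1.5, 0.0, 1.0],
     [2.0, 1.0, 1.0, 0.0]]
mask = [[1, 0], [1, 0], [0, 1], [0, 1]]   # genes 0-1 -> pathway 0, genes 2-3 -> pathway 1
weights = [[0.5, 0.9], [0.2, 0.9], [0.9, 0.3], [0.9, 0.4]]

pathway_features = masked_linear(x, weights, mask)
attended = sample_self_attention(pathway_features)
```

Because the mask zeroes every weight outside a known gene-pathway relationship, each pathway node depends only on its member genes, which is what makes a node's activation attributable to a named pathway. The attention step then mixes pathway features across samples, so each output row is a weighted average of all samples' pathway vectors.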
Pages: 16