Integration of multiomics data with graph convolutional networks to identify new cancer genes and their associated molecular mechanisms

Cited by: 109
Authors
Schulte-Sasse, Roman [1]
Budach, Stefan [1]
Hnisz, Denes [1]
Marsico, Annalisa [1, 2]
Affiliations
[1] Max Planck Inst Mol Genet, Berlin, Germany
[2] German Res Ctr Environm Hlth, Helmholtz Zentrum Munich, Inst Computat Biol, Munich, Germany
Keywords
Convolution - Diseases - Alkylation - Learning systems - Backpropagation - Proteins
DOI
10.1038/s42256-021-00325-y
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Identifying cancer driver genes from high-throughput genomic data is an important task to understand the molecular basis of cancer and to develop new treatments, including precision medicine. To tackle this challenge, EMOGI, a new deep learning method based on graph convolutional networks, is developed, which combines protein-protein interaction networks with multiomics datasets. The increase in available high-throughput molecular data creates computational challenges for the identification of cancer genes. Genetic as well as non-genetic causes contribute to tumorigenesis, and this necessitates the development of predictive models to effectively integrate different data modalities while being interpretable. We introduce EMOGI, an explainable machine learning method based on graph convolutional networks to predict cancer genes by combining multiomics pan-cancer data, such as mutations, copy number changes, DNA methylation and gene expression, together with protein-protein interaction (PPI) networks. EMOGI was on average more accurate than other methods across different PPI networks and datasets. We used layer-wise relevance propagation to stratify genes according to whether their classification was driven by the interactome or any of the omics levels, and to identify important modules in the PPI network. We propose 165 novel cancer genes that do not necessarily harbour recurrent alterations but interact with known cancer genes, and we show that they correspond to essential genes from loss-of-function screens. We believe that our method can open new avenues in precision oncology and be applied to predict biomarkers for other complex diseases.
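As a rough illustration of the general idea described in the abstract (propagating per-gene omics features over a PPI graph), the following is a minimal sketch of one graph-convolution step with the widely used symmetric normalization. It is not the EMOGI implementation: the four-gene graph, the feature values, the random weights and the function names `normalize_adjacency` and `gcn_layer` are all hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2, a common GCN normalization (assumed here)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_layer(a_norm, features, weights):
    """One graph-convolution step: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_norm @ features @ weights, 0.0)


# Hypothetical toy PPI graph with 4 genes and edges 0-1, 1-2, 2-3.
adj = np.array(
    [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]],
    dtype=float,
)

# Made-up per-gene features: mutation rate, copy number change,
# DNA methylation, gene expression (one row per gene).
features = np.array(
    [[0.9, 0.1, 0.2, 0.5],
     [0.0, 0.0, 0.1, 0.4],
     [0.1, 0.8, 0.3, 0.9],
     [0.0, 0.1, 0.0, 0.2]]
)

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 2))           # untrained, random projection

a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, features, weights)
print(hidden.shape)  # (4, 2): one 2-dimensional embedding per gene
```

In this sketch, each gene's output mixes its own features with those of its direct interaction partners, which is how a gene without recurrent alterations can still receive signal from neighbouring known cancer genes.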
Pages: 513 / +
Number of pages: 16
Related papers
50 in total
  • [21] Few-shot learning via graph embeddings with convolutional networks for low-data molecular property prediction
    Torres, Luis
    Arrais, Joel P.
    Ribeiro, Bernardete
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (18) : 13167 - 13185
  • [22] A scalable and integrated computational and experimental workflow to identify new driver genes in cancer genome data
    Horn, Heiko
    Lawrence, Michael S.
    Chouinard, Candace R.
    Shrestha, Yashaswi
    Hu, Jessica Xin
    Worstell, Elizabeth
    Shea, Emily
    Ilic, Nina
    Kim, Eejung
    Kamburov, Atanas
    Kashani, Alireza
    Hahn, William C.
    Campbell, Joshua D.
    Boehm, Jesse S.
    Getz, Gad
    Lage, Kasper
    CANCER RESEARCH, 2017, 77
  • [23] Molecular Subtyping of Cancer Based on Robust Graph Neural Network and Multi-Omics Data Integration
    Yin, Chaoyi
    Cao, Yangkun
    Sun, Peishuo
    Zhang, Hengyuan
    Li, Zhi
    Xu, Ying
    Sun, Huiyan
    FRONTIERS IN GENETICS, 2022, 13
  • [24] MaxMIF: A New Method for Identifying Cancer Driver Genes through Effective Data Integration
    Hou, Yingnan
    Gao, Bo
    Li, Guojun
    Su, Zhengchang
    ADVANCED SCIENCE, 2018, 5 (09)
  • [25] Uncovering the Subtype-Specific Molecular Characteristics of Breast Cancer by Multiomics Analysis of Prognosis-Associated Genes, Driver Genes, Signaling Pathways, and Immune Activity
    Li, Xinhui
    Zhou, Jian
    Xiao, Mingming
    Zhao, Lingyu
    Zhao, Yan
    Wang, Shuoshuo
    Gao, Shuangshu
    Zhuang, Yuan
    Niu, Yi
    Li, Shijun
    Li, Xiaobo
    Zhu, Yuanyuan
    Zhang, Minghui
    Tang, Jing
    FRONTIERS IN CELL AND DEVELOPMENTAL BIOLOGY, 2021, 9
  • [26] Integration of biological data by kernels on graph nodes allows prediction of new genes involved in mitotic chromosome condensation
    Heriche, Jean-Karim
    Lees, Jon G.
    Morilla, Ian
    Walter, Thomas
    Petrova, Boryana
    Roberti, M. Julia
    Hossain, M. Julius
    Adler, Priit
    Fernandez, Jose M.
    Krallinger, Martin
    Haering, Christian H.
    Vilo, Jaak
    Valencia, Alfonso
    Ranea, Juan A.
    Orengo, Christine
    Ellenberg, Jan
    MOLECULAR BIOLOGY OF THE CELL, 2014, 25 (16) : 2522 - 2536
  • [27] THE USE OF INTERFERONS TO RESCUE 2 NOVEL DEATH ASSOCIATED GENES AND TO IDENTIFY MOLECULAR MECHANISMS OF CELL-CYCLE ARREST
    KIMCHI, A
    MELAMED, D
    LEVY, N
    FEINSTEIN, L
    DEISS, L
    BERISSI, H
    RAVEH, T
    TIEFENBRUN, N
    COHEN, O
    JOURNAL OF CELLULAR BIOCHEMISTRY, 1994, : 166 - 166
  • [28] THE USE OF INTERFERONS TO RESCUE 2 NOVEL DEATH ASSOCIATED GENES AND TO IDENTIFY MOLECULAR MECHANISMS OF CELL-CYCLE ARREST
    KIMCHI, A
    MELAMED, D
    LEVY, N
    FEINSTEIN, L
    DEISS, L
    BERISSI, H
    RAVEH, T
    TIEFENBRUN, N
    COHEN, O
    JOURNAL OF CELLULAR BIOCHEMISTRY, 1994, : 212 - 212
  • [29] Integration of heterogeneous 'omics' data using semi-supervised network labelling to identify essential genes in colorectal cancer
    Chisanga, David
    Keerthikumar, Shivakumar
    Mathivanan, Suresh
    Chilamkurti, Naveen
    COMPUTERS & ELECTRICAL ENGINEERING, 2018, 67 : 267 - 277
  • [30] Transfer learning with molecular graph convolutional networks for accurate modeling and representation of bioactivities of ligands targeting GPCRs without sufficient data
    Wu, Jiansheng
    Lan, Chuangchuang
    Mei, Zheming
    Chen, Xiaohuyan
    Zhu, Yanxiang
    Hu, Haifeng
    Diao, Yemin
    COMPUTATIONAL BIOLOGY AND CHEMISTRY, 2022, 98