Integrate multi-omics data with biological interaction networks using Multi-view Factorization AutoEncoder (MAE)

Cited by: 59
Authors
Ma, Tianle [1]
Zhang, Aidong [2]
Affiliations
[1] SUNY Buffalo, Dept Comp Sci & Engn, 338 Davis Hall, Buffalo, NY 14260 USA
[2] Univ Virginia, Dept Comp Sci, 509 Rice Hall, Charlottesville, VA 22904 USA
Funding
U.S. National Science Foundation
Keywords
Multi-omics data; Biological interaction networks; Deep learning; Multi-view learning; Autoencoder; Data integration; Graph regularization
DOI
10.1186/s12864-019-6285-x
Chinese Library Classification (CLC) number
Q81 [Bioengineering (biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
Background: Comprehensive molecular profiling of various cancers and other diseases has generated vast amounts of multi-omics data. Each type of -omics data corresponds to one feature space, such as gene expression, miRNA expression, or DNA methylation. Integrating multi-omics data can link different layers of molecular feature spaces and is crucial for elucidating the molecular pathways underlying various diseases. Machine learning approaches to mining multi-omics data hold great promise for uncovering intricate relationships among molecular features. However, due to the "big p, small n" problem (i.e., small sample sizes with high-dimensional features), training a large-scale, generalizable deep learning model with multi-omics data alone is very challenging.
Results: We developed a method called Multi-view Factorization AutoEncoder (MAE) with network constraints that can seamlessly integrate multi-omics data and domain knowledge such as molecular interaction networks. Our method learns feature and patient embeddings simultaneously with deep representation learning. Both feature representations and patient representations are subject to constraints specified as regularization terms in the training objective. By incorporating domain knowledge into the training objective, we implicitly introduced a good inductive bias into the machine learning model, which helps improve model generalizability. We performed extensive experiments on the TCGA datasets and demonstrated the power of integrating multi-omics data and biological interaction networks using our proposed method for predicting target clinical variables.
Conclusions: To alleviate overfitting when applying deep learning to multi-omics data under the "big p, small n" problem, it is helpful to incorporate biological domain knowledge into the model as inductive biases. Designing machine learning models that seamlessly integrate large-scale multi-omics data and biomedical domain knowledge is a promising direction for uncovering intricate relationships among molecular features and clinical features.
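
The abstract states that network constraints enter the training objective as regularization terms, and the keywords mention graph regularization. The sketch below illustrates one common form of such a term, a graph Laplacian penalty added to a reconstruction loss. It is a minimal illustration under assumptions, not the authors' implementation: the function names, the unnormalized Laplacian, the mean-squared-error reconstruction term, and the `weight` parameter are all hypothetical choices made here.

```python
import numpy as np


def graph_laplacian(adj):
    """Unnormalized Laplacian L = D - A of a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj


def laplacian_penalty(emb, adj):
    """Smoothness penalty trace(Z^T L Z).

    For symmetric A this equals 0.5 * sum_ij A_ij * ||z_i - z_j||^2,
    so connected nodes are pushed toward similar embeddings.
    """
    emb = np.asarray(emb, dtype=float)
    lap = graph_laplacian(adj)
    return float(np.trace(emb.T @ lap @ emb))


def regularized_loss(x, x_hat, feat_emb, adj, weight=0.1):
    """Reconstruction error plus a weighted graph penalty (assumed form)."""
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    recon = float(np.mean((x - x_hat) ** 2))
    return recon + weight * laplacian_penalty(feat_emb, adj)


# Toy interaction network: a path over three features, 0 - 1 - 2.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
feat_emb = np.array([[0.0], [1.0], [3.0]])  # one-dimensional embeddings
penalty = laplacian_penalty(feat_emb, adj)
```

On this toy graph the penalty is the sum of squared embedding differences across the two edges, so embeddings that vary smoothly over the interaction network are penalized less than embeddings that differ sharply between interacting features.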
Pages: 11
Source
BMC Genomics, 20