Integrate multi-omics data with biological interaction networks using Multi-view Factorization AutoEncoder (MAE)

Cited by: 59

Authors:

Ma, Tianle [1]

Zhang, Aidong [2]

Affiliations:

[1] SUNY Buffalo, Dept Comp Sci & Engn, 338 Davis Hall, Buffalo, NY 14260 USA

[2] Univ Virginia, Dept Comp Sci, 509 Rice Hall, Charlottesville, VA 22904 USA

Source:

BMC GENOMICS | 2019 / Vol. 20 / Suppl. 11

Funding:

National Science Foundation (USA);

Keywords:

Multi-omics data; Biological interaction networks; Deep learning; Multi-view learning; Autoencoder; Data integration; Graph regularization;

DOI:

10.1186/s12864-019-6285-x

Chinese Library Classification (CLC) codes:

Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];

Discipline classification codes:

071005 ; 0836 ; 090102 ; 100705 ;

Abstract:

Background: Comprehensive molecular profiling of various cancers and other diseases has generated vast amounts of multi-omics data. Each type of omics data corresponds to one feature space, such as gene expression, miRNA expression, or DNA methylation. Integrating multi-omics data can link different layers of molecular feature spaces and is crucial for elucidating the molecular pathways underlying various diseases. Machine learning approaches to mining multi-omics data hold great promise for uncovering intricate relationships among molecular features. However, due to the "big p, small n" problem (i.e., small sample sizes with high-dimensional features), training a large-scale generalizable deep learning model with multi-omics data alone is very challenging.

Results: We developed a method called Multi-view Factorization AutoEncoder (MAE) with network constraints that can seamlessly integrate multi-omics data and domain knowledge such as molecular interaction networks. Our method learns feature and patient embeddings simultaneously with deep representation learning. Both feature representations and patient representations are subject to certain constraints specified as regularization terms in the training objective. By incorporating domain knowledge into the training objective, we implicitly introduced a good inductive bias into the machine learning model, which helps improve model generalizability. We performed extensive experiments on the TCGA datasets and demonstrated the power of integrating multi-omics data and biological interaction networks using our proposed method for predicting target clinical variables.

Conclusions: To alleviate the overfitting problem in deep learning on multi-omics data with the "big p, small n" problem, it is helpful to incorporate biological domain knowledge into the model as inductive biases. It is very promising to design machine learning models that facilitate the seamless integration of large-scale multi-omics data and biomedical domain knowledge for uncovering intricate relationships among molecular features and clinical features.
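The abstract describes network constraints that enter the training objective as regularization terms, and the keywords mention graph regularization. One common way to express such a constraint is a graph Laplacian penalty, which is small when features connected in an interaction network have similar embeddings. The snippet below is a minimal sketch under that assumption; it is not necessarily the exact objective used in the paper, and the function names, the toy adjacency matrix, and the embeddings are all hypothetical.

```python
import numpy as np


def graph_laplacian(adjacency):
    """Unnormalized Laplacian L = D - A of a symmetric adjacency matrix."""
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    return degree - adjacency


def laplacian_penalty(embeddings, adjacency):
    """Graph regularizer trace(Z^T L Z).

    For a symmetric adjacency matrix A this equals
    0.5 * sum_ij A_ij * ||z_i - z_j||^2, so it grows when
    connected nodes are embedded far apart.
    """
    lap = graph_laplacian(adjacency)
    return float(np.trace(embeddings.T @ lap @ embeddings))


# Hypothetical toy network: three features, features 0 and 1 interact.
A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]])

# Embeddings that agree with the network (features 0 and 1 coincide).
Z_close = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
# Embeddings that disagree with the network (features 0 and 1 are opposite).
Z_far = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])

penalty_close = laplacian_penalty(Z_close, A)
penalty_far = laplacian_penalty(Z_far, A)
print(penalty_close, penalty_far)
```

In a full model, a term like this would typically be added to the reconstruction or prediction loss with a weighting coefficient, so that the interaction network nudges the learned embeddings without fully determining them.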


Pages: 11
