Graph-Guided Bayesian Factor Model for Integrative Analysis of Multi-modal Data with Noisy Network Information

Cited by: 0
Authors
Li, Wenrui [1 ]
Zhang, Qiyiwen [2 ]
Qu, Kewen [3 ]
Long, Qi [3 ]
Affiliations
[1] Univ Connecticut, Dept Stat, 215 Glenbrook Rd, Storrs, CT 06269 USA
[2] Univ Pittsburgh, Dept Med, 3550 Terrace St, Pittsburgh, PA 15261 USA
[3] Univ Penn, Dept Biostat Epidemiol & Informat, 423 Guardian Dr, Philadelphia, PA 19104 USA
Keywords
Bayesian shrinkage; Factor analysis; Latent scale network model; MCMC algorithm; Noisy graph; INVERSE COVARIANCE ESTIMATION; VARIABLE SELECTION; GENES; JOINT
DOI
10.1007/s12561-024-09452-7
Chinese Library Classification number
Q [Biological sciences];
Subject classification codes
07 ; 0710 ; 09 ;
Abstract
There is a growing body of literature on factor analysis that can capture individual and shared structures in multi-modal data. However, few of these approaches incorporate biological knowledge such as functional genomics and functional metabolomics. Graph-guided statistical learning methods that can incorporate knowledge of underlying networks have been shown to improve prediction and classification accuracy and to yield more interpretable results. However, these methods typically use graphs extracted from existing databases or rely on subject-matter expertise, both of which are known to be incomplete and may contain false edges. To address this gap, we propose a graph-guided Bayesian factor model that can account for network noise and identify globally shared, partially shared, and modality-specific latent factors in multi-modal data. Specifically, we use two sources of network information, the noisy graph extracted from existing databases and the graph estimated from observed features in the dataset at hand, to inform the model of the true underlying network via a latent scale modeling framework. This model is coupled with a Bayesian factor analysis model with shrinkage priors that encourage feature-wise and modality-wise sparsity, thereby allowing feature selection and identification of factors of each type. We develop an efficient Markov chain Monte Carlo algorithm for posterior sampling. We demonstrate the advantages of our method over existing methods in simulations and through analyses of gene expression and metabolomics datasets for Alzheimer's disease.
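The distinction between globally shared, partially shared, and modality-specific factors can be illustrated with a small simulation. The sketch below is not the authors' model: it omits the shrinkage priors, the latent scale network component, and the MCMC sampler, and only generates data from a generic linear factor model whose loading matrices are sparse both feature-wise and modality-wise. The function name `simulate_multimodal`, the activity pattern, the dimensions, and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_multimodal(n=200, p=(30, 20, 10), seed=0):
    """Simulate three modalities from a shared set of latent factors (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Rows are factors, columns are modalities; 1 means the factor is active there.
    activity = np.array([
        [1, 1, 1],  # globally shared factor
        [1, 1, 0],  # partially shared factor (modalities 0 and 1)
        [0, 0, 1],  # factor specific to modality 2
    ])
    k = activity.shape[0]
    Z = rng.standard_normal((n, k))  # latent factor scores, shared across modalities
    loadings, data = [], []
    for m, p_m in enumerate(p):
        W = rng.standard_normal((p_m, k))
        # Feature-wise sparsity: zero out roughly half of the individual loadings.
        W *= rng.random((p_m, k)) < 0.5
        # Modality-wise sparsity: zero whole columns for factors inactive in modality m.
        W *= activity[:, m]
        # Observed features = factor scores times loadings plus Gaussian noise.
        X = Z @ W.T + 0.1 * rng.standard_normal((n, p_m))
        loadings.append(W)
        data.append(X)
    return activity, loadings, data

activity, loadings, data = simulate_multimodal()
```

In this toy setup, a factor's type can be read from which modalities have a nonzero loading column for it; a model of the kind the abstract describes would have to infer that pattern from the data rather than being given it.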
Pages: 17
Related papers
3 in total
  • [1] Accounting for network noise in graph-guided Bayesian modeling of structured high-dimensional data
    Li, Wenrui
    Chang, Changgee
    Kundu, Suprateek
    Long, Qi
    BIOMETRICS, 2024, 80 (01)
  • [2] Integrative analysis of multi-omics and imaging data with incorporation of biological information via structural Bayesian factor analysis
    Bao, Jingxuan
    Chang, Changgee
    Zhang, Qiyiwen
    Saykin, Andrew J.
    Shen, Li
    Long, Qi
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (02)
  • [3] Bayesian Vector Autoregressive Model for Multi-Subject Effective Connectivity Inference Using Multi-Modal Neuroimaging Data
    Chiang, Sharon
    Guindani, Michele
    Yeh, Hsiang J.
    Haneef, Zulfi
    Stern, John M.
    Vannucci, Marina
    HUMAN BRAIN MAPPING, 2017, 38 (03) : 1311 - 1332