Generalized Bayesian Factor Analysis for Integrative Clustering with Applications to Multi-Omics Data

Cited by: 14
Authors
Min, Eun Jeong [1 ]
Chang, Changgee [1 ]
Long, Qi [1 ]
Affiliations
[1] Univ Penn, Dept Biostat Epidemiol & Informat, Philadelphia, PA 19104 USA
Keywords
Generalized Bayesian Factor Analysis; Markov Random Field (MRF); Spike and Slab Lasso (SSL); Variational EM Algorithm; Structural Information; Network Information; Integrative Analysis; Integrative Clustering; High Dimensional Data; Omics Data; NCI60; LATENT VARIABLE MODEL; EXPRESSION PROFILES; SELECTION; MELANOGENESIS; INFORMATION; TRANSCRIPT; REGRESSION; DISCOVERY; INFERENCE; PATHWAYS;
DOI
10.1109/DSAA.2018.00021
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Integrative clustering is a clustering approach for multiple datasets that provide different views of a common group of subjects. It enables joint analysis of multi-omics data to, for example, identify subtypes of diseases or cells, capturing the complex underlying biological processes more precisely. Separately, there has been a great deal of interest over the past decade in incorporating prior structural knowledge about the features into statistical analyses. Knowledge of the gene regulatory network (pathways) can potentially be incorporated into many genomic studies. In this paper, we propose a novel integrative clustering method that can incorporate prior graph knowledge. We first develop a generalized Bayesian factor analysis (GBFA) framework, a sparse Bayesian factor analysis that can take the graph information into account. Our GBFA framework employs the spike and slab lasso (SSL) prior to impose sparsity on the factor loadings and the Markov random field (MRF) prior to encourage smoothing over adjacent factor loadings, which establishes a unified shrinkage adaptive to the loading size and the graph structure. We then use the framework to extend iCluster+, a factor analysis based integrative clustering approach. A novel variational EM algorithm is proposed to efficiently compute the MAP estimator for the factor loadings. Extensive simulation studies and an application to the NCI60 cell line dataset demonstrate that the proposed method is superior and delivers more biologically meaningful outcomes.
Pages: 109-119
Page count: 11