Generalized Bayesian Factor Analysis for Integrative Clustering with Applications to Multi-Omics Data

Cited by: 14
Authors
Min, Eun Jeong [1 ]
Chang, Changgee [1 ]
Long, Qi [1 ]
Affiliations
[1] Univ Penn, Dept Biostat Epidemiol & Informat, Philadelphia, PA 19104 USA
Keywords
Generalized Bayesian Factor Analysis; Markov Random Field (MRF); Spike and Slab Lasso (SSL); Variational EM Algorithm; Structural Information; Network Information; Integrative Analysis; Integrative Clustering; High Dimensional Data; Omics Data; NCI60; LATENT VARIABLE MODEL; EXPRESSION PROFILES; SELECTION; MELANOGENESIS; INFORMATION; TRANSCRIPT; REGRESSION; DISCOVERY; INFERENCE; PATHWAYS
DOI
10.1109/DSAA.2018.00021
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Integrative clustering is a clustering approach for multiple datasets, which provide different views of a common group of subjects. It enables analyzing multi-omics data jointly to, for example, identify the subtypes of diseases, cells, and so on, capturing the complex underlying biological processes more precisely. On the other hand, there has been a great deal of interest in incorporating the prior structural knowledge on the features into statistical analyses over the past decade. The knowledge on the gene regulatory network (pathways) can potentially be incorporated into many genomic studies. In this paper, we propose a novel integrative clustering method which can incorporate the prior graph knowledge. We first develop a generalized Bayesian factor analysis (GBFA) framework, a sparse Bayesian factor analysis which can take into account the graph information. Our GBFA framework employs the spike and slab lasso (SSL) prior to impose sparsity on the factor loadings and the Markov random field (MRF) prior to encourage smoothing over the adjacent factor loadings, which establishes a unified shrinkage adaptive to the loading size and the graph structure. Then, we use the framework to extend iCluster+, a factor analysis based integrative clustering approach. A novel variational EM algorithm is proposed to efficiently compute the MAP estimator for the factor loadings. Extensive simulation studies and the application to the NCI60 cell line dataset demonstrate that the proposed method is superior and delivers more biologically meaningful outcomes.
Pages: 109-119
Number of pages: 11
Related papers
50 in total
  • [31] Spectral clustering of weighted variables on multi-omics data
    Lee, Yunjung
    Park, Seyoung
    KOREAN JOURNAL OF APPLIED STATISTICS, 2023, 36 (03) : 175 - 196
  • [32] Integrative multi-omics and big data analysis of global nutrition and radiotherapy trends
    Meng, Sibo
    Jiang, Dizhi
    Yang, Guanghui
    Guo, Kaiyue
    Yu, Enhao
    Wang, Yun
    Qu, Linli
    Li, Jiaxin
    INTERNATIONAL JOURNAL OF BIOCHEMISTRY & CELL BIOLOGY, 2024, 177
  • [33] Integrative analysis of single-cell multi-omics data of the human retina
    Liang, Qingnan
    Cheng, Xuesen
    Owen, Leah
    Shakoor, Akbar
    Vitale, Albert T.
    Husami, Nadine
    Morgan, Denise
    Farkas, Michael H.
    Kim, Ivana K.
    Li, Yumei
    DeAngelis, Margaret M.
    Chen, Rui
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2021, 62 (08)
  • [34] Integrative Multi-omics Analysis of Childhood Aggressive Behavior
    Hagenbeek, Fiona A.
    van Dongen, Jenny
    Pool, Rene
    Roetman, Peter J.
    Harms, Amy C.
    Hottenga, Jouke Jan
    Kluft, Cornelis
    Colins, Olivier F.
    van Beijsterveldt, Catharina E. M.
    Fanos, Vassilios
    Ehli, Erik A.
    Hankemeier, Thomas
    Vermeiren, Robert R. J. M.
    Bartels, Meike
    Dejean, Sebastien
    Boomsma, Dorret I.
    BEHAVIOR GENETICS, 2023, 53 (02) : 101 - 117
  • [35] MinOmics, an Integrative and Immersive Tool for Multi-Omics Analysis
    Maes, Alexandre
    Martinez, Xavier
    Druart, Karen
    Laurent, Benoist
    Guegan, Sean
    Marchand, Christophe H.
    Lemaire, Stephane D.
    Baaden, Marc
    JOURNAL OF INTEGRATIVE BIOINFORMATICS, 2018, 15 (02)
  • [36] Integrative Multi-omics Analysis of Childhood Aggressive Behavior
    Fiona A. Hagenbeek
    Jenny van Dongen
    René Pool
    Peter J. Roetman
    Amy C. Harms
    Jouke Jan Hottenga
    Cornelis Kluft
    Olivier F. Colins
    Catharina E. M. van Beijsterveldt
    Vassilios Fanos
    Erik A. Ehli
    Thomas Hankemeier
    Robert R. J. M. Vermeiren
    Meike Bartels
    Sébastien Déjean
    Dorret I. Boomsma
    Behavior Genetics, 2023, 53 : 101 - 117
  • [37] Integrative Clustering Analysis for Omics Data with Missingness
    Zhao, Yinqi
    Darst, Burcu
    Conti, David V.
    GENETIC EPIDEMIOLOGY, 2021, 45 (07) : 806 - 806
  • [38] LUCID: An Integrative Clustering Model for Multi Omics Data
    Zhao, Yinqi
    Conti, David V.
    GENETIC EPIDEMIOLOGY, 2022, 46 (07) : 550 - 550
  • [39] Integrative multi-omics analysis of intestinal organoid differentiation
    Lindeboom, Rik G. H.
    van Voorthuijsen, Lisa
    Oost, Koen C.
    Rodriguez-Colman, Maria J.
    Luna-Velez, Maria V.
    Furlan, Cristina
    Baraille, Floriane
    Jansen, Pascal W. T. C.
    Ribeiro, Agnes
    Burgering, Boudewijn M. T.
    Snippert, Hugo J.
    Vermeulen, Michiel
    MOLECULAR SYSTEMS BIOLOGY, 2018, 14 (06)
  • [40] Visual analysis of multi-omics data
    Swart, Austin
    Caspi, Ron
    Paley, Suzanne
    Karp, Peter D.
    FRONTIERS IN BIOINFORMATICS, 2024, 4