Bayesian shrinkage models for integration and analysis of multiplatform high-dimensional genomics data

Cited by: 1
Authors
Xue, Hao [1]
Chakraborty, Sounak [2, 4]
Dey, Tanujit [3]
Affiliations
[1] Cornell Univ, Dept Computat Biol, Ithaca, NY USA
[2] Univ Missouri, Dept Stat, Columbia, MO USA
[3] Harvard Med Sch, Brigham & Womens Hosp, Ctr Surg & Publ Hlth, Dept Surg, Boston, MA USA
[4] Univ Missouri, Dept Stat, C209F Middlebush Hall, Columbia, MO 65211 USA
Keywords
data integration; Expectation Maximization; glioblastoma; hierarchical Bayesian model; multiomics; variable selection; DNA methylation; penalized likelihood; expression; interleukin-8
DOI
10.1002/sam.11682
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
With the increasing availability of biomedical data collected on multiple platforms from the same patients in clinical research, such as epigenomics, gene expression, and clinical features, there is a growing need for statistical methods that can jointly analyze data from different platforms to provide complementary information for clinical studies. In this paper, we propose a two-stage hierarchical Bayesian model that integrates high-dimensional biomedical data from diverse platforms to select biomarkers associated with clinical outcomes of interest. In the first stage, we use an Expectation Maximization-based approach to learn the regulatory mechanism between epigenomics (e.g., gene methylation) and gene expression while considering functional gene annotations. In the second stage, we group genes based on the regulatory mechanism learned in the first stage and then apply a group-wise penalty to select genes significantly associated with clinical outcomes while incorporating clinical features. Simulation studies suggest that our model-based data integration method yields fewer false positives in selecting predictive variables than an existing method. Real data analysis based on a glioblastoma (GBM) dataset suggests that our method can detect genes associated with GBM survival with higher accuracy than the existing method. In addition, most of the selected biomarkers are crucial in GBM prognosis, as confirmed by existing literature.
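The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' procedure: the EM step and the hierarchical Bayesian shrinkage priors are replaced here by simple frequentist stand-ins, and functional gene annotations are ignored. Stage 1 groups genes by the R^2 of a per-gene regression of expression on methylation, and stage 2 fits a lasso whose penalty level depends on the group, leaving clinical covariates unpenalized. Reading "group-wise penalty" as a group-specific penalty level is an assumption, as are all function names, the simulated data, the R^2 cutoff, and the penalty values.

```python
import numpy as np

def stage1_group_labels(meth, expr, r2_cut=0.5):
    """Label gene j as 1 if methylation explains its expression (R^2 >= r2_cut), else 0.

    Assumption: a per-gene least-squares fit stands in for the EM-based stage.
    """
    m = meth - meth.mean(axis=0)
    e = expr - expr.mean(axis=0)
    # For a single predictor, R^2 equals the squared Pearson correlation.
    r2 = (m * e).sum(axis=0) ** 2 / ((m ** 2).sum(axis=0) * (e ** 2).sum(axis=0))
    return (r2 >= r2_cut).astype(int)

def soft_threshold(z, t):
    """Elementwise proximal operator of the L1 penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def stage2_weighted_lasso(X, y, penalties, n_iter=2000):
    """Proximal gradient (ISTA) for (1/2n)||y - X b||^2 + sum_j penalties[j] * |b_j|."""
    n = X.shape[0]
    step = n / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ beta - y) / n
        beta = soft_threshold(beta - step * grad, step * penalties)
    return beta

# Simulated data (hypothetical): 200 patients, 20 genes, 2 clinical covariates.
rng = np.random.default_rng(0)
n, p = 200, 20
meth = rng.normal(size=(n, p))
noise = rng.normal(size=(n, p))
expr = noise.copy()
expr[:, :5] = 2.0 * meth[:, :5] + 0.3 * noise[:, :5]  # first 5 genes driven by methylation
clinical = rng.normal(size=(n, 2))
y = expr[:, 0] - expr[:, 1] + 0.5 * clinical[:, 0] + 0.1 * rng.normal(size=n)

labels = stage1_group_labels(meth, expr)
# Assumed penalty levels: lighter for methylation-regulated genes, none for clinical features.
gene_penalties = np.where(labels == 1, 0.05, 0.3)
penalties = np.concatenate([gene_penalties, np.zeros(clinical.shape[1])])

X = np.hstack([expr, clinical])
X = (X - X.mean(axis=0)) / X.std(axis=0)
beta = stage2_weighted_lasso(X, y - y.mean(), penalties)
selected_genes = np.flatnonzero(beta[:p])
print("group labels:", labels)
print("selected gene indices:", selected_genes)
```

In this toy setup the outcome depends only on genes 0 and 1, both in the methylation-regulated group, so the lighter penalty on that group lets them enter the model while the heavier penalty suppresses the unregulated genes.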
Pages: 17