A Bayesian Nonparametric Model for Integrative Clustering of Omics Data

Cited by: 0
Authors
Peneva, Iliana [1 ]
Savage, Richard S. [2 ]
Affiliations
[1] Univ Warwick, Warwick, England
[2] Univ Warwick, Dept Stat, Warwick, England
Source
BAYESIAN STATISTICS AND NEW GENERATIONS, BAYSM 2018 | 2019 / Vol. 296
Keywords
Bayesian nonparametrics; Data integration; Glioblastoma; Mixture models; Non-local priors; LATENT VARIABLE MODEL; BREAST; GLIOBLASTOMA;
DOI
10.1007/978-3-030-30611-3_11
Chinese Library Classification (CLC) number
O29 [Applied Mathematics];
Subject classification code
070104 ;
Abstract
Cancer is a complex disease, driven by a range of genetic and environmental factors. Many integrative clustering methods aim to provide insight into the mechanisms underlying cancer, but few of them are computationally efficient and able to estimate the number of subtypes. We have developed a Bayesian nonparametric model for combined data integration and clustering called BayesCluster, which aims to identify cancer subtypes and addresses many of the issues faced by the existing integrative methods. The proposed method can integrate and use the information from multiple different datasets, and offers better cluster interpretability by using nonlocal priors. We incorporate feature learning because of the large number of predictors, and use a Dirichlet process mixture model approach to produce the patient subgroups. We ensure tractable inference with simulated annealing. We apply the model to datasets from the Cancer Genome Atlas project of glioblastoma multiforme, which contains clinical and biological data about cancer patients with extremely poor prognosis of survival. By combining all available information we are better able to identify clinically meaningful subtypes of glioblastoma.
Pages: 105-114
Number of pages: 10