BAYESIAN SPARSE GRAPHICAL MODELS FOR CLASSIFICATION WITH APPLICATION TO PROTEIN EXPRESSION DATA

Cited by: 18
Authors
Baladandayuthapani, Veerabhadran [1]
Talluri, Rajesh [1]
Ji, Yuan [2]
Coombes, Kevin R. [3]
Lu, Yiling [4]
Hennessy, Bryan T. [5]
Davies, Michael A. [6]
Mallick, Bani K. [7]
Affiliations
[1] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
[2] Northshore Univ HealthSyst, Evanston, IL 60201 USA
[3] Ohio State Univ, Wexner Med Ctr, Dept Biomed Informat, Columbus, OH USA
[4] Univ Texas MD Anderson Canc Ctr, Dept Syst Biol, Houston, TX 77030 USA
[5] Beaumont Hosp, Dublin 9, Ireland
[6] Univ Texas MD Anderson Canc Ctr, Dept Melanoma Med Oncol, Houston, TX 77030 USA
[7] Texas A&M Univ, Dept Stat, College Stn, TX 77843 USA
Funding
Science Foundation Ireland
Keywords
Bayesian methods; protein signaling pathways; graphical models; mixture models; PROTEOMIC ANALYSIS; GENE-EXPRESSION; COVARIANCE ESTIMATION; PI3K PATHWAY; CANCER; MUTATIONS; SELECTION; CELLS; ARRAY; ACTIVATION;
DOI
10.1214/14-AOAS722
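Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes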
020208 ; 070103 ; 0714 ;
Abstract
Reverse-phase protein array (RPPA) analysis is a powerful, relatively new platform that allows for high-throughput, quantitative analysis of protein networks. One of the challenges that currently limit the potential of this technology is the lack of methods that allow for accurate data modeling and identification of related networks and samples. Such models may improve the accuracy of biological sample classification based on patterns of protein network activation and provide insight into the distinct biological relationships underlying different types of cancer. Motivated by RPPA data, we propose a Bayesian sparse graphical modeling approach that uses selection priors on the conditional relationships in the presence of class information. The novelty of our Bayesian model lies in the ability to draw information from the network data as well as from the associated categorical outcome in a unified hierarchical model for classification. In addition, our method allows for intuitive integration of a priori network information directly in the model and allows for posterior inference on the network topologies both within and between classes. Applying our methodology to an RPPA data set generated from panels of human breast cancer and ovarian cancer cell lines, we demonstrate that the model is able to distinguish the different cancer cell types more accurately than several existing models and to identify differential regulation of components of a critical signaling network (the PI3K-AKT pathway) between these two types of cancer. This approach represents a powerful new tool that can be used to improve our understanding of protein networks in cancer.
Pages: 1443-1468
Page count: 26