Gaussian process selections in semiparametric multi-kernel machine regression for multi-pathway analysis

Cited by: 0
Authors
Lin, Jiali [1]
Kim, Inyoung [1]
Affiliations
[1] Virginia Polytech Inst & State Univ, Dept Stat, 410A Hutcheson Hall, Blacksburg, VA 24061 USA
Keywords
Gaussian process; Ising prior; kernel learning; pathway analysis; variable selection; variational Bayesian; BAYESIAN VARIABLE SELECTION; OXIDATIVE-PHOSPHORYLATION; VARIATIONAL INFERENCE; GLOBAL TEST; GENES; EXPRESSION; MODEL; LASSO;
DOI
10.1002/sam.11699
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Analyzing correlated high-dimensional data is a challenging problem in genomics, proteomics, and other related areas. For example, it is important to identify significant genetic pathway effects associated with biomarkers, where a gene pathway is a set of genes that work together functionally to regulate a certain biological process. A pathway-based analysis can detect a subtle change in expression level that cannot be found using a gene-based analysis. Here, we refer to a pathway as a set and a gene as an element of that set. However, when there are multiple pathways, it is challenging to select automatically which pathways are highly associated with the outcome. In this paper, we propose a semiparametric multi-kernel regression model to study the effects of fixed covariates (e.g., clinical variables) and sets of elements (e.g., pathways of genes), addressing the problem of detecting signal sets associated with biomarkers. We model the unknown high-dimensional functions of multiple sets via multiple Gaussian kernel machines to allow for the possibility that elements within the same set interact with each other. Hence, our variable set selection can be considered a Gaussian process set selection. We develop our Gaussian process set selection under the Bayesian variance component-selection framework. We incorporate prior knowledge of structural sets by imposing an Ising prior on the model. Our approach can be easily applied in high-dimensional spaces where the sample size is smaller than the number of variables. An efficient variational Bayes algorithm is developed. We demonstrate the advantages of our approach through simulation studies and through a type II diabetes genetic-pathway analysis.
Pages: 20