Adaptive Bayesian density regression for high-dimensional data

Cited by: 14
Authors
Shen, Weining [1]
Ghosal, Subhashis [2]
Affiliations
[1] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
[2] N Carolina State Univ, Dept Stat, Raleigh, NC 27695 USA
Keywords
adaptive estimation; density regression; high-dimensional models; MCMC-free computation; nonparametric Bayesian inference; posterior contraction rate; variable selection; GENERALIZED LINEAR-MODELS; VARIABLE SELECTION; CONVERGENCE-RATES; POSTERIOR DISTRIBUTIONS; GAUSSIAN PROCESS; CONSISTENCY; MIXTURES;
DOI
10.3150/14-BEJ663
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification code
020208 ; 070103 ; 0714 ;
Abstract
Density regression provides a flexible strategy for modeling the distribution of a response variable Y given predictors X = (X_1, ..., X_p) by treating the conditional density of Y given X as a completely unknown function and allowing its shape to change with the value of X. The number of predictors p may be very large, possibly much larger than the number of observations n, but the conditional density is assumed to depend only on a much smaller number of predictors, which are unknown. In addition to estimation, the goal is also to select the important predictors that actually affect the true conditional density. We consider a nonparametric Bayesian approach to density regression by constructing a random series prior based on tensor products of spline functions. The proposed prior also incorporates variable selection. We show that the posterior distribution of the conditional density contracts adaptively around the truth at nearly the optimal oracle rate, determined by the unknown sparsity and smoothness levels, even in ultra high-dimensional settings where p increases exponentially with n. The result is also extended to the anisotropic case where the degree of smoothness can vary in different directions, and both random and deterministic predictors are considered. We also propose a technique to calculate posterior moments of the conditional density function without requiring Markov chain Monte Carlo methods.
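The abstract does not spell out the form of the random series prior. As a rough illustration only, the sketch below assumes one common construction: a tensor product of spline bases in y and in a single selected predictor x, with coefficients that sum to one over the y-index so that the series is a valid conditional density. The linear (hat) splines, the uniform knots on [0, 1], the Dirichlet-type coefficient draw, and all function names are assumptions made for this example, not details taken from the record.

```python
import random

def hat_basis(t, J):
    """Evaluate J linear B-splines (hat functions) with uniform knots on [0, 1].
    These are nonnegative and sum to 1 at every t in [0, 1]."""
    s = t * (J - 1)
    return [max(0.0, 1.0 - abs(s - k)) for k in range(J)]

def normalized_hat_basis(t, J):
    """Same basis, with each function rescaled to integrate to 1 over [0, 1]."""
    h = 1.0 / (J - 1)
    areas = [h / 2 if k in (0, J - 1) else h for k in range(J)]
    return [b / a for b, a in zip(hat_basis(t, J), areas)]

def random_simplex_columns(J_y, J_x, rng):
    """Draw theta[j0][j] >= 0 with the sum over j0 equal to 1 for every j
    (normalized Gamma(1, 1) draws, i.e. a flat Dirichlet per column)."""
    theta = [[rng.gammavariate(1.0, 1.0) for _ in range(J_x)] for _ in range(J_y)]
    for j in range(J_x):
        total = sum(theta[j0][j] for j0 in range(J_y))
        for j0 in range(J_y):
            theta[j0][j] /= total
    return theta

def cond_density(y, x, theta):
    """Tensor-product series f(y | x) = sum theta[j0][j] * Bstar_j0(y) * B_j(x)."""
    J_y, J_x = len(theta), len(theta[0])
    by = normalized_hat_basis(y, J_y)
    bx = hat_basis(x, J_x)
    return sum(theta[j0][j] * by[j0] * bx[j]
               for j0 in range(J_y) for j in range(J_x))

def integrate_over_y(x, theta, cells_per_knot=4):
    """Trapezoid rule on a grid aligned with the y-knots (exact for linear splines)."""
    n = (len(theta) - 1) * cells_per_knot
    vals = [cond_density(i / n, x, theta) for i in range(n + 1)]
    return (sum(vals) - 0.5 * (vals[0] + vals[-1])) / n

rng = random.Random(0)
theta = random_simplex_columns(5, 4, rng)  # one random draw of the coefficients
```

Under these assumptions, `integrate_over_y(0.3, theta)` should come out at 1 up to rounding for any x in [0, 1], because the x-basis sums to one and each coefficient column lies on the simplex. Variable selection would correspond to choosing which predictors enter the tensor product; here a single predictor is fixed by hand.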
Pages: 396-420
Page count: 25