Information theory and mixing least-squares regressions

Cited by: 155
Authors
Leung, Gilbert [1]
Barron, Andrew R. [2]
Affiliations
[1] Qualcomm Inc, Campbell, CA 95008 USA
[2] Yale Univ, Dept Stat, New Haven, CT 06511 USA
Keywords
Bayes mixtures; combining least-squares regressions; complexity; model adaptation; model selection target; oracle inequalities; resolvability; sparsity; unbiased risk estimate;
DOI
10.1109/TIT.2006.878172
Chinese Library Classification
TP [Automation Technology; Computer Technology];
Discipline Code
0812;
Abstract
For Gaussian regression, we develop and analyze methods for combining estimators from various models. For squared-error loss, an unbiased estimator of the risk of the mixture of general estimators is developed. Special attention is given to the case in which the component estimators are least-squares projections into arbitrary linear subspaces, such as those spanned by subsets of explanatory variables in a given design. We relate the unbiased estimate of the risk of the mixture estimator to estimates of the risks achieved by the components. This yields simple and accurate bounds on the risk and its estimate, in the form of sharp and exact oracle inequalities. That is, without advance knowledge of which model is best, the resulting performance is comparable to, or perhaps even superior to, what is achieved by the best of the individual models. Furthermore, when the unknown parameter has a sparse representation, our mixture estimator adapts to the underlying sparsity. Simulations show that these mixture estimators perform better than a related model-selection estimator that picks the model with the highest weight. The connection between our mixtures and Bayes procedures is also discussed.
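A minimal sketch of the kind of procedure the abstract describes: fit least-squares projections onto each candidate subset of variables, attach to each an unbiased (Mallows-Cp-style) estimate of its risk, and combine the fits with weights exponential in the estimated risks. The Cp-style risk estimate and the temperature 4σ² are illustrative assumptions for this sketch, not a transcription of the paper's exact procedure.

```python
import numpy as np
from itertools import combinations

# Synthetic Gaussian regression with a sparse true parameter.
rng = np.random.default_rng(0)
n, p, sigma = 50, 5, 1.0
X = rng.standard_normal((n, p))
beta = np.array([2.0, -1.0, 0.0, 0.0, 0.0])   # only two active variables
y = X @ beta + sigma * rng.standard_normal(n)

# Candidate models: all nonempty subsets of the explanatory variables.
subsets = [list(c) for d in range(1, p + 1) for c in combinations(range(p), d)]

fits, risk_est = [], []
for S in subsets:
    XS = X[:, S]
    coef, *_ = np.linalg.lstsq(XS, y, rcond=None)   # least-squares projection
    pred = XS @ coef
    # Cp-style unbiased estimate of the risk of this projection (assumption):
    r = np.sum((y - pred) ** 2) + (2 * len(S) - n) * sigma**2
    fits.append(pred)
    risk_est.append(r)

risk_est = np.array(risk_est)
# Exponential weights in the estimated risks; subtracting the minimum
# only rescales the weights and improves numerical stability.
w = np.exp(-(risk_est - risk_est.min()) / (4 * sigma**2))
w /= w.sum()
mixture = np.tensordot(w, np.array(fits), axes=1)   # weighted combination of fits
```

In this sketch the weights concentrate on subsets with low estimated risk, so the mixture tracks the best-fitting sparse models without committing to a single one, in the spirit of the oracle inequalities described above.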
Pages: 3396-3410
Page count: 15
Cited references
43 in total
[1] Akaike H. Statistical predictor identification. Annals of the Institute of Statistical Mathematics, 1970, 22(2): 203+.
[2] Akaike H. 1973, Proc. 2nd Int. Symp. Information Theory, p. 199. DOI: 10.1007/978-1-4612-1694-0.
[3] [Anonymous]. 2001, Journal of the European Mathematical Society. DOI: 10.1007/s100970100031.
[4] Barron A, Birgé L, Massart P. Risk bounds for model selection via penalization. Probability Theory and Related Fields, 1999, 113(3): 301-413.
[5] Barron A. 2006, unpublished, Annals of Statistics (Feb.).
[6] Barron A. 1987, Open Problems in Communication and Computation.
[7] Barron AR. Universal approximation bounds for superpositions of a sigmoidal function. IEEE Transactions on Information Theory, 1993, 39(3): 930-945.
[8] Barron AR. Machine Learning, 1994, 14: 115. DOI: 10.1007/BF00993164.
[9] Barron AR. 1998, Bayesian Statistics.
[10] Beran R. Annals of Statistics, 1998, 26: 1826.