GAUSSIAN APPROXIMATIONS AND MULTIPLIER BOOTSTRAP FOR MAXIMA OF SUMS OF HIGH-DIMENSIONAL RANDOM VECTORS

Cited by: 263
Authors
Chernozhukov, Victor [1 ,2 ]
Chetverikov, Denis [3 ]
Kato, Kengo [4 ]
Affiliations
[1] MIT, Dept Econ, Cambridge, MA 02142 USA
[2] MIT, Ctr Operat Res, Cambridge, MA 02142 USA
[3] Univ Calif Los Angeles, Dept Econ, Los Angeles, CA 90095 USA
[4] Univ Tokyo, Grad Sch Econ, Bunkyo Ku, Tokyo 1130033, Japan
Funding
US National Science Foundation; Japan Society for the Promotion of Science;
Keywords
Dantzig selector; Slepian; Stein method; maximum of vector sums; high dimensionality; anti-concentration; DANTZIG SELECTOR; DENSITY-ESTIMATION; MODELS; LASSO; TESTS;
DOI
10.1214/13-AOS1161
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline codes
020208; 070103; 0714;
Abstract
We derive a Gaussian approximation result for the maximum of a sum of high-dimensional random vectors. Specifically, we establish conditions under which the distribution of the maximum is approximated by that of the maximum of a sum of Gaussian random vectors with the same covariance matrices as the original vectors. This result applies when the dimension p of the random vectors is large compared to the sample size n; in fact, p can be much larger than n, without restricting the correlations among the coordinates of these vectors. We also show that the distribution of the maximum of a sum of random vectors with unknown covariance matrices can be consistently estimated by the distribution of the maximum of a sum of conditional Gaussian random vectors, obtained by multiplying the original vectors by i.i.d. Gaussian multipliers. This is the Gaussian multiplier (or wild) bootstrap procedure. Here too, p can be large or even much larger than n. These distributional approximations, either Gaussian or conditional Gaussian, yield a high-quality approximation to the distribution of the original maximum, often with approximation error decreasing polynomially in the sample size, and hence are of interest in many applications. We demonstrate how our Gaussian approximations and the multiplier bootstrap can be used for modern high-dimensional estimation, multiple hypothesis testing, and adaptive specification testing. All these results contain nonasymptotic bounds on approximation errors.
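The Gaussian multiplier (wild) bootstrap described in the abstract can be sketched numerically: draw i.i.d. standard Gaussian multipliers, rescale the centered observations by them, and recompute the max statistic across bootstrap replications. The sketch below is a minimal illustration, not the paper's implementation; the data-generating process, sample size `n`, dimension `p`, and number of replications `B` are all hypothetical choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n observations of a p-dimensional vector, with p >> n allowed.
n, p = 50, 200
X = rng.standard_normal((n, p))

# Original statistic: maximum coordinate of the normalized sum.
T = np.max(X.sum(axis=0) / np.sqrt(n))

# Gaussian multiplier bootstrap: multiply each centered observation by an
# independent N(0, 1) draw and recompute the max statistic B times.
Xc = X - X.mean(axis=0)  # center at the sample mean
B = 1000
W = np.empty(B)
for b in range(B):
    e = rng.standard_normal(n)  # i.i.d. Gaussian multipliers, one per observation
    W[b] = np.max((e[:, None] * Xc).sum(axis=0) / np.sqrt(n))

# Conditional (1 - alpha)-quantile of the bootstrap distribution serves as a
# critical value for T.
alpha = 0.05
c_alpha = np.quantile(W, 1 - alpha)
```

Conditionally on the data, each bootstrap sum is exactly Gaussian with the empirical covariance of the observations, which is why this procedure remains valid even when p greatly exceeds n.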
Pages: 2786-2819 (34 pages)