PARTITIONED APPROACH FOR HIGH-DIMENSIONAL CONFIDENCE INTERVALS WITH LARGE SPLIT SIZES

Cited by: 1
Authors
Zheng, Zemin [1]
Zhang, Jiarui [1]
Li, Yang [1]
Wu, Yaohua [1]
Affiliations
[1] Univ Sci & Technol, Hefei, Anhui, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Big data; confidence intervals; de-biased estimator; divide and conquer; large split sizes; scalability; NONCONCAVE PENALIZED LIKELIHOOD; VARIABLE SELECTION; DANTZIG SELECTOR; REGRESSION; SHRINKAGE; INFERENCE; LASSO;
DOI
10.5705/ss.202018.0379
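Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract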
With the availability of massive data sets, accurate inferences with low computational costs are the key to improving scalability. When the sample size and dimensionality are both large, naively applying de-biasing to derive confidence intervals can be computationally inefficient or infeasible, because the de-biasing procedure increases the computational cost by an order of magnitude compared with that of the initial penalized estimation. Therefore, we suggest a split and conquer approach to improve the scalability of the de-biasing procedure, and show that the length of the established confidence interval is asymptotically the same as that using all of the data. Moreover, we demonstrate a significant improvement in the largest split size by separating the initial estimation and the relaxed projection steps, indicating that the sample sizes needed for these two steps with statistical guarantees are different. We propose a refined inference procedure to address the inflation issue in the finite sample performance when the split size becomes large. Lastly, numerical studies demonstrate the computational advantage and theoretical guarantee of our new methodology.
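As a rough illustration of the split and conquer de-biasing idea summarized in the abstract, the following sketch writes a generic aggregated de-biased estimator for a linear model $y = X\beta + \varepsilon$. All notation here is assumed for illustration ($K$ splits of size $n_k$, local data $(X^{(k)}, y^{(k)})$, an initial penalized estimate $\hat{\beta}^{(k)}$, and an approximate inverse covariance matrix $\hat{\Theta}^{(k)}$). It follows the standard de-biased lasso form rather than anything stated in this record, and it may differ from the authors' exact procedure, which separates the initial estimation step from the relaxed projection step.

```latex
% Generic sketch under assumed notation; not the paper's exact estimator.
% Step 1: de-bias the initial estimate on each split k.
\hat{b}^{(k)} = \hat{\beta}^{(k)}
  + \frac{1}{n_k}\,\hat{\Theta}^{(k)} X^{(k)\top}
    \bigl(y^{(k)} - X^{(k)}\hat{\beta}^{(k)}\bigr),
  \qquad k = 1,\dots,K,
% Step 2: aggregate the K de-biased estimates by simple averaging.
\qquad
\bar{b} = \frac{1}{K}\sum_{k=1}^{K}\hat{b}^{(k)}.
```

Under equal split sizes $n_k = n/K$ and independent splits, each coordinate of $\bar{b}$ has variance of order $1/n$, which gives some intuition for the abstract's statement that the interval length is asymptotically the same as when all of the data are used.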
Pages: 1935-1959
Page count: 25