Scalable subsampling: computation, aggregation and inference

Cited by: 4
Authors
Politis, Dimitris [1]
Affiliations
[1] Univ Calif San Diego, Dept Math, 9500 Gilman Dr, La Jolla, CA 92093 USA
Funding
National Science Foundation (USA);
Keywords
Bagging; Big data; Bootstrap; Distributed inference; Subagging; SELECTION;
DOI
10.1093/biomet/asad021
Chinese Library Classification (CLC) number
Q [Biological sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
Subsampling has seen a resurgence in the big data era where the standard, full-resample size bootstrap can be infeasible to compute. Nevertheless, even choosing a single random subsample of size b can be computationally challenging with both b and the sample size n being very large. This paper shows how a set of appropriately chosen, nonrandom subsamples can be used to conduct effective, and computationally feasible, subsampling distribution estimation. Furthermore, the same set of subsamples can be used to yield a procedure for subsampling aggregation, also known as subagging, that is scalable with big data. Interestingly, the scalable subagging estimator can be tuned to have the same, or better, rate of convergence than that of $\hat{\theta}_n$. Statistical inference could then be based on the scalable subagging estimator instead of the original $\hat{\theta}_n$.
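The idea in the abstract can be illustrated with a minimal sketch. It assumes that the nonrandom subsamples are consecutive blocks of length b taken every h observations (non-overlapping when h = b) and that the statistic is the sample mean. The function names `block_subsamples` and `subagging_estimate` are hypothetical, and the paper's guidance on tuning b and h is not reproduced here.

```python
from statistics import mean


def block_subsamples(data, b, h=None):
    """Yield deterministic consecutive blocks of length b, starting every h points.

    Assumption for illustration: h defaults to b, giving non-overlapping blocks.
    """
    if h is None:
        h = b
    n = len(data)
    if b < 1 or h < 1 or b > n:
        raise ValueError("need 1 <= b <= n and h >= 1")
    q = (n - b) // h + 1  # number of complete blocks that fit in the data
    for j in range(q):
        yield data[j * h : j * h + b]


def subagging_estimate(data, b, statistic=mean, h=None):
    """Average the statistic over the blocks (subsampling aggregation).

    Returns the aggregated estimate together with the per-block values, which
    can also serve as raw material for a subsampling distribution estimate.
    """
    values = [statistic(block) for block in block_subsamples(data, b, h)]
    return mean(values), values


if __name__ == "__main__":
    sample = list(range(1, 13))  # toy data: 1, 2, ..., 12
    estimate, per_block = subagging_estimate(sample, b=4)
    print(per_block)  # per-block means
    print(estimate)   # subagging estimate
```

Because each block is a contiguous slice, no random index generation over the full sample is needed, and the per-block statistics could in principle be computed on separate machines before averaging.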
Pages: 347-354
Number of pages: 8