Sampling With Replacement vs Poisson Sampling: A Comparative Study in Optimal Subsampling

被引:20
作者
Wang, Jing [1 ]
Zou, Jiahui [2 ]
Wang, HaiYing [1 ]
机构
[1] Univ Connecticut, Dept Stat, Storrs, CT 06269 USA
[2] Capital Univ Econ & Business, Sch Stat, Beijing 100070, Peoples R China
关键词
Estimation; Probability; Computational efficiency; Approximation algorithms; Distributed databases; Convergence; Training; Algorithmic sampling; asymptotic distribution; informative sample; massive data; MONTE-CARLO ALGORITHMS; MATRICES; APPROXIMATION;
D O I
10.1109/TIT.2022.3176955
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Faced with massive data, subsampling is a commonly used technique to improve computational efficiency, and using nonuniform subsampling probabilities is an effective approach to improve estimation efficiency. For computational efficiency, subsampling is often implemented with replacement or through Poisson subsampling. However, no rigorous investigation has been performed to study the difference between the two subsampling procedures such as their estimation efficiency and computational convenience. This paper performs a comparative study on these two different sampling procedures. In the context of maximizing a general target function, we first derive asymptotic distributions for estimators obtained from the two sampling procedures. The results show that the Poisson subsampling may have a higher estimation efficiency. Based on the asymptotic distributions for both subsampling with replacement and Poisson subsampling, we derive optimal subsampling probabilities that minimize the variance functions of the subsampling estimators. These subsampling probabilities further reveal the similarities and differences between subsampling with replacement and Poisson subsampling. The theoretical characterizations and comparisons on the two subsampling procedures provide guidance to select a more appropriate subsampling approach in practice. Furthermore, practically implementable algorithms are proposed based on the optimal structural results, which are evaluated through both theoretical and empirical analyses.
引用
收藏
页码:6605 / 6630
页数:26
相关论文
共 50 条
  • [21] A study of classical frequency sampling filters
    Ji, K
    Jiang, XH
    Zhu, HF
    Ali, E
    [J]. PROCEEDINGS OF THE FOURTH IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY, 2004, : 495 - 498
  • [22] Exploration Using Without-Replacement Sampling of Actions Is Sometimes Inferior
    Carden, Stephen W.
    Walker, S. Dalton
    [J]. MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2019, 1 (02): : 698 - 714
  • [23] Optimal updating magnitude in adaptive flat-distribution sampling
    Zhang, Cheng
    Drake, Justin A.
    Ma, Jianpeng
    Pettitt, B. Montgomery
    [J]. JOURNAL OF CHEMICAL PHYSICS, 2017, 147 (17)
  • [24] Optimal learning for sequential sampling with non-parametric beliefs
    Barut, Emre
    Powell, Warren B.
    [J]. JOURNAL OF GLOBAL OPTIMIZATION, 2014, 58 (03) : 517 - 543
  • [25] Proportional Volume Sampling and Approximation Algorithms for A-Optimal Design
    Nikolov, Aleksandar
    Singh, Mohit
    Tantipongpipat, Uthaipon
    [J]. MATHEMATICS OF OPERATIONS RESEARCH, 2022, 47 (02) : 847 - 877
  • [26] A Hierarchy of Twofold Resource Allocation Automata Supporting Optimal Sampling
    Granmo, Ole-Christoffer
    Oommen, B. John
    [J]. NEXT-GENERATION APPLIED INTELLIGENCE, PROCEEDINGS, 2009, 5579 : 523 - +
  • [27] Near-Optimal Entrywise Sampling of Numerically Sparse Matrices
    Braverman, Vladimir
    Krauthgamer, Robert
    Krishnan, Aditya
    Sapir, Shay
    [J]. CONFERENCE ON LEARNING THEORY, VOL 134, 2021, 134 : 759 - 773
  • [28] An Incremental Sampling-based Algorithm for Stochastic Optimal Control
    Vu Anh Huynh
    Karaman, Sertac
    Frazzoli, Emilio
    [J]. 2012 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2012, : 2865 - 2872
  • [29] An incremental sampling-based algorithm for stochastic optimal control
    Huynh, Vu Anh
    Karaman, Sertac
    Frazzoli, Emilio
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2016, 35 (04) : 305 - 333
  • [30] Optimal replica exchange method combined with Tsallis weight sampling
    Kim, Jaegil
    Straub, John E.
    [J]. JOURNAL OF CHEMICAL PHYSICS, 2009, 130 (14)