Sequential Random Sampling Revisited: Hidden Shuffle Method

被引:0
|
作者
Shekelyan, Michael [1 ]
Cormode, Graham [1 ]
机构
[1] Univ Warwick, Coventry, W Midlands, England
来源
24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS) | 2021年 / 130卷
基金
欧洲研究理事会;
关键词
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Random sampling (without replacement) is ubiquitously employed to obtain a representative subset of the data. Unlike common methods, sequential methods report samples in ascending order of index without keeping track of previous samples. This enables lightweight iterators that can jump directly from one sampled position to the next. Previously, sequential methods focused on drawing from the distribution of gap sizes, which requires intricate algorithms that are difficult to validate and can be slow in the worst-case. This can be avoided by a new method, the Hidden Shuffle. The name mirrors the fact that although the algorithm does not resemble shuffliing, its correctness can be proven by conceptualising the sampling process as a random shuffle. The Hidden Shuffle algorithm stores just a handful of values, can be implemented in few lines of code, offers strong worst-case guarantees and is shown to be faster than state-of-the-art methods while using comparably few random variates.
引用
收藏
页数:9
相关论文
共 50 条
  • [41] Random Weighting Method for Estimating Sampling Distributions
    Gao, B. B.
    Gao, S. S.
    Yang, Y.
    Yan, H. F.
    INTERNATIONAL CONFERENCE ON ADVANCES IN MANAGEMENT ENGINEERING AND INFORMATION TECHNOLOGY (AMEIT 2015), 2015, : 530 - 537
  • [42] On the power of random bases in Fourier sampling:: Hidden subgroup problem in the Heisenberg group
    Radhakrishnan, J
    Rötteler, M
    Sen, P
    AUTOMATA, LANGUAGES AND PROGRAMMING, PROCEEDINGS, 2005, 3580 : 1399 - 1411
  • [43] The optimizing property of the method of sequential-sampling identification
    Antsiperov, VE
    RADIOTEKHNIKA I ELEKTRONIKA, 1996, 41 (02): : 204 - 205
  • [44] METHOD FOR SEQUENTIAL SAMPLING OF CEREBROSPINAL-FLUID IN HUMANS
    BRUCE, JN
    OLDFIELD, EH
    NEUROSURGERY, 1988, 23 (06) : 788 - 790
  • [45] Improved bounds for the mixing time of the random-to-random shuffle
    Qin, Chuan
    Morris, Ben
    ELECTRONIC COMMUNICATIONS IN PROBABILITY, 2017, 22
  • [46] Optimizing property of the method of sequential-sampling identification
    Antsiperov, V.E.
    Journal of Communications Technology and Electronics, 1996, 41 (02): : 185 - 186
  • [47] Respondent-driven sampling:: a new sampling method to study visible and hidden populations
    Mantecon, Alejandro
    Juan, Montse
    Calafat, Amador
    Becona, Elisardo
    Roman, Encarna
    ADICCIONES, 2008, 20 (02) : 161 - 169
  • [48] HIDDEN CURRICULUM REVISITED
    SHIRK, E
    JOURNAL OF THOUGHT, 1976, 11 (01) : 53 - 57
  • [49] HIDDEN ULSTER REVISITED
    OSNODAIGH, P
    CRANE BAG, 1981, 5 (02): : 45 - 47
  • [50] HIDDEN VARIABLES REVISITED
    JAUCH, JM
    PIRON, C
    REVIEWS OF MODERN PHYSICS, 1968, 40 (01) : 228 - &