Weighted random sampling with a reservoir

被引:217
作者
Efraimidis, PS [1 ]
Spirakis, PG
机构
[1] Democritus Univ Thrace, Dept Elect & Comp Engn, GR-67100 Xanthi, Greece
[2] Comp Technol Inst, Patras 26221, Greece
关键词
weighted random sampling; reservoir sampling; randomized algorithms; data streams; parallel algorithms;
D O I
10.1016/j.ipl.2005.11.003
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In this work, a new algorithm for drawing a weighted random sample of size m from a population of n weighted items, where m <= n, is presented. The algorithm can generate a weighted random sample in one-pass over unknown populations. (c) 2005 Elsevier B.V. All rights reserved.
引用
收藏
页码:181 / 185
页数:5
相关论文
共 14 条
  • [1] Babcock B., 2002, Proceedings of the Twenty-First ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), P1, DOI DOI 10.1145/543613.543615
  • [2] Chaudhuri S, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P263, DOI 10.1145/304181.304206
  • [3] Devroye L., 1986, NONUNIFORM RANDOM VA
  • [4] KARGER D, 2002, 34 ANN ACM S THEOR C, P63
  • [5] Karger D. R., 1994, Proceedings of the Twenty-Sixth Annual ACM Symposium on the Theory of Computing, P648, DOI 10.1145/195058.195422
  • [6] Knuth DE, 1981, ART COMPUTER PROGRAM, V2
  • [7] RESERVOIR-SAMPLING ALGORITHMS OF TIME-COMPLEXITY O(N(1+LOG(N/N)))
    LI, KH
    [J]. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1994, 20 (04): : 481 - 493
  • [8] Lin J.-H., 1992, 24TH P ANN ACM S THE, P771
  • [9] MUTHUKRISHNAN S, 2005, COMPUT SCI, V1
  • [10] Olken Frank, 1993, Random Sampling from Databases