Weighted random sampling with a reservoir

被引:233
作者
Efraimidis, PS [1 ]
Spirakis, PG
机构
[1] Democritus Univ Thrace, Dept Elect & Comp Engn, GR-67100 Xanthi, Greece
[2] Comp Technol Inst, Patras 26221, Greece
关键词
weighted random sampling; reservoir sampling; randomized algorithms; data streams; parallel algorithms;
D O I
10.1016/j.ipl.2005.11.003
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
In this work, a new algorithm for drawing a weighted random sample of size m from a population of n weighted items, where m <= n, is presented. The algorithm can generate a weighted random sample in one-pass over unknown populations. (c) 2005 Elsevier B.V. All rights reserved.
引用
收藏
页码:181 / 185
页数:5
相关论文
共 14 条
[1]  
Babcock B., 2002, Proceedings of the Twenty-First ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), P1, DOI DOI 10.1145/543613.543615
[2]  
Chaudhuri S, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P263, DOI 10.1145/304181.304206
[3]  
Devroye L., 1986, NONUNIFORM RANDOM VA
[4]  
KARGER D, 2002, 34 ANN ACM S THEOR C, P63
[5]  
Karger D. R., 1994, Proceedings of the Twenty-Sixth Annual ACM Symposium on the Theory of Computing, P648, DOI 10.1145/195058.195422
[6]  
Knuth DE, 1981, ART COMPUTER PROGRAM, V2
[7]   RESERVOIR-SAMPLING ALGORITHMS OF TIME-COMPLEXITY O(N(1+LOG(N/N))) [J].
LI, KH .
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 1994, 20 (04) :481-493
[8]  
Lin J.-H., 1992, 24TH P ANN ACM S THE, P771
[9]  
MUTHUKRISHNAN S, 2005, COMPUT SCI, V1
[10]  
Olken Frank, 1993, Random Sampling from Databases