RESERVOIR-SAMPLING ALGORITHMS OF TIME-COMPLEXITY O(N(1+LOG(N/N)))

被引:50
作者
LI, KH
机构
[1] Chinese University of Hong Kong, Shatin, N.T.
来源
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE | 1994年 / 20卷 / 04期
关键词
ANALYSIS OF ALGORITHMS; RANDOM SAMPLING; RESERVOIR;
D O I
10.1145/198429.198435
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
One-pass algorithms for sampling n records without replacement from a population of unknown size N are known as reservoir-sampling algorithms. In this article, Vitter's reservoir-sampling algorithm, algorithm Z, is modified to give a more efficient algorithm, algorithm K. Additionally, two new algorithms, algorithm L and algorithm M, are proposed. If the time for scanning the population is ignored, all the four algorithms have expected CPU time O(n(1 + log(N/n))), which is optimum up to a constant factor. Expressions of the expected CPU time for the algorithms are presented. Among the four, algorithm L is the simplest, and algorithm M is the most efficient when n and N/n are large and N is O(n(2)).
引用
收藏
页码:481 / 493
页数:13
相关论文
共 10 条
[1]  
Bratley P, 1987, GUIDE SIMULATION, V2nd
[2]   GENERATING BETA VARIATES WITH NONINTEGRAL SHAPE PARAMETERS [J].
CHENG, RCH .
COMMUNICATIONS OF THE ACM, 1978, 21 (04) :317-322
[3]  
Devroye L., 1986, NONUNIFORM RANDOM VA
[4]   DEVELOPMENT OF SAMPLING PLANS BY USING SEQUENTIAL (ITEM BY ITEM) SELECTION TECHNIQUES AND DIGITAL-COMPUTERS [J].
FAN, CT ;
MULLER, ME ;
REZUCHA, I .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1962, 57 (298) :387-&
[5]  
KNUTH DE, 1981, RT COMPUTER PROGRAMM, V2
[6]  
LI KH, 1992, 923 CHIN U HONG KONG
[7]  
LI KH, 1992, 922 CHIN U HONG KONG
[8]  
MCLEOD AI, 1983, APPL STAT-J ROY ST C, V32, P182
[9]  
Pinkham RS, 1987, APPL STAT-J ROY ST C, V36, P370
[10]  
1987, IMSL STAT LIBRARY US