INFORMATION RANKING AND POWER LAWS ON TREES

被引:38
作者
Jelenkovic, Predrag R. [1 ]
Olvera-Cravioto, Mariana [2 ]
机构
[1] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[2] Columbia Univ, Dept Ind Engn & Operat Res, New York, NY 10027 USA
关键词
Information ranking; stochastic recursion; stochastic fixed point equation; weighted branching process; power law; regular variation; implicit renewal theory; large deviation; BUSY PERIOD; BRANCHING-PROCESSES; RENEWAL THEORY; FIXED-POINTS; RANDOM-WALK; WEB SEARCH; BEHAVIOR; DISTRIBUTIONS; ASYMPTOTICS; ENVIRONMENT;
D O I
10.1239/aap/1293113151
中图分类号
O21 [概率论与数理统计]; C8 [统计学];
学科分类号
020208 ; 070103 ; 0714 ;
摘要
In this paper we consider the stochastic analysis of information ranking algorithms of large interconnected data sets, e.g. Google's Page Rank algorithm for ranking pages on the World Wide Web. The stochastic formulation of the problem results in an equation of the form R =(D) Q + Sigma(N)(i=1), CiRi, where N, Q, {R-i}(i >= 1) and {C, C-i}(i >= 1) are independent nonnegative random variables, the C. are identically distributed, and the (R-i}(i >= 1) are independent copies of R; =('D') stands for equality in distribution. We study the asymptotic properties of the distribution of R that, in the context of PageRank, represents the frequencies of highly ranked pages. The preceding equation is interesting in its own right since it belongs to a more general class of weighted branching processes that have been found to be useful in the analysis of many other algorithms. Our first main result shows that if E N E[C-alpha] = 1, alpha > 0, and Q, N satisfy additional moment conditions, then R has a power law distribution of index alpha. This result is obtained using a new approach based on an extension of Goldie's (1991) implicit renewal theorem. Furthermore, when N is regularly varying of index alpha > 1, E N E[C-alpha] < 1. and Q, C have higher moments than a, then the distributions of R and N are tail equivalent. The latter result is derived via a novel sample path large deviation method for recursive random SLIMS. Similarly, we characterize the situation when the distribution of R is determined by the tail of Q. The preceding approaches may be of independent interest, as they can be used for analyzing other functionals on trees. We also briefly discuss the engineering implications of our results.
引用
收藏
页码:1057 / 1093
页数:37
相关论文
共 37 条