A parallel IRAM algorithm to compute PageRank for modeling epidemic spread

被引:4
作者
Liu, Zifan [1 ,2 ]
Emad, Nahid [1 ,2 ]
Ben Amor, Soufian [1 ]
Lamure, Michel [3 ]
机构
[1] Univ Versailles, 45 Ave Etats Unis, F-78035 Versailles, France
[2] CEA Saclay, Maison La Simulat, bat 565, F-91191 Gif Sur Yvette, France
[3] Univ Lyon 1, F-69676 Bron, France
来源
2013 25TH INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING (SBAC-PAD) | 2013年
关键词
Epidemic; PageRank; Scale free networks; Power law; IRAM; Big data;
D O I
10.1109/SBAC-PAD.2013.2
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
The eigenvalue equation intervenes in models of infectious disease propagation and could be used as an ally of vaccination campaigns in the actions carried out by health care organizations. The stochastic model based on PageRank allows to simulate the epidemic spread, where a PageRank-like infection vector is calculated to help establish efficient vaccination strategy. In the context of epidemic spread, generally the damping factor is high. This is because the probability that an infected individual contaminates any other individual through some unusual contact is low. One consequence of this results is that the second largest eigenvalue of PageRank matrix could be very close to its dominant eigenvalue. Another difficulty arises from the growing size of real networks. Handling very big graph becomes a challenge for computing PageRank. Furthermore, the high damping factor makes many existing algorithms less efficient. In this paper, we explore the computation methods of PageRank to address these issues. Specifically, we study the implicitly restarted Arnoldi method (IRAM) and discuss some possible improvements over it. We also present a parallel implementation for IRAM, targeting big data and sparse matrices representing scale-free networks (also known as power law networks). The algorithm is tested on a nation wide cluster of clusters Grid5000. Experiments on very large networks such as twitter, yahoo (over 1 billion nodes) are conducted.
引用
收藏
页码:120 / 127
页数:8
相关论文
共 25 条
[1]  
[Anonymous], 1999, TECH REPORT STANFORD
[2]  
[Anonymous], 2000, TEMPLATES SOLUTION A
[3]   Emergence of scaling in random networks [J].
Barabási, AL ;
Albert, R .
SCIENCE, 1999, 286 (5439) :509-512
[4]  
BENNANI M, 1994, TRPA9422 CERFACS
[5]   A Survey on PageRank Computing [J].
Berkhin, Pavel .
INTERNET MATHEMATICS, 2005, 2 (01) :73-120
[6]  
Bisset K., COMMUNICATION APR
[7]   EpiFast: A Fast Algorithm for Large Scale Realistic Epidemic Simulations on Distributed Memory Systems [J].
Bisset, Keith R. ;
Chen, Jiangzhuo ;
Feng, Xizhou ;
Kumar, V. S. Anil ;
Marathe, Madhav V. .
ICS'09: PROCEEDINGS OF THE 2009 ACM SIGARCH INTERNATIONAL CONFERENCE ON SUPERCOMPUTING, 2009, :430-439
[8]  
Broder AndreiZ., 2004, WWW ALT 04 P 13 INT, P484
[9]   The $25,000,000,000 eigenvector: The linear algebra behind google [J].
Bryan, Kurt ;
Leise, Tanya .
SIAM REVIEW, 2006, 48 (03) :569-581
[10]   REORTHOGONALIZATION AND STABLE ALGORITHMS FOR UPDATING GRAM-SCHMIDT QR FACTORIZATION [J].
DANIEL, JW ;
GRAGG, WB ;
KAUFMAN, L ;
STEWART, GW .
MATHEMATICS OF COMPUTATION, 1976, 30 (136) :772-795