Iterative Sparse Matrix-Vector Multiplication for Integer Factorization on GPUs

Cited by: 0
Authors
Schmidt, Bertil [1 ]
Aribowo, Hans [1 ]
Dang, Hoang-Vu [1 ]
Affiliations
[1] Nanyang Technol Univ, Sch Comp Engn, Singapore 639798, Singapore
Source
EURO-PAR 2011 PARALLEL PROCESSING, PT 2 | 2011 / Vol. 6853
Keywords
SpMV; CUDA; Block Wiedemann; RSA; Number Field Sieve; Factorization;
DOI
Not available
Chinese Library Classification (CLC)
TP301 [Theory, Methods];
Discipline code
081202
Abstract
The Block Wiedemann (BW) and Block Lanczos (BL) algorithms are frequently used to solve sparse linear systems over GF(2). Iterative sparse matrix-vector multiplication is the most time-consuming operation of these approaches. The need to accelerate this step is motivated by the application of these algorithms to the very large matrices arising in the linear algebra step of the Number Field Sieve (NFS) for integer factorization. In this paper we derive an efficient CUDA implementation of this operation using a newly designed hybrid sparse matrix format. This leads to speedups between 4 and 8 on a single GPU for a number of tested NFS matrices, compared to an optimized multi-core implementation.
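The core operation the abstract refers to can be sketched in plain Python; this is an illustration of SpMV over GF(2) for block algorithms in general, not the paper's CUDA kernel or its hybrid sparse format. Over GF(2) all stored matrix entries are 1 and addition is XOR, so multiplying a CSR matrix by a block of (say) 64 vectors, packed one machine word per row, reduces to XOR-accumulating selected words of the input:

```python
def spmv_gf2(row_ptr, col_idx, x_words):
    """Compute y = A * x over GF(2).

    A is given in CSR form with implicit 1-entries (row_ptr, col_idx);
    x_words[j] is an integer whose bits pack one component from each
    vector in the block (as in Block Wiedemann / Block Lanczos).
    """
    n_rows = len(row_ptr) - 1
    y = [0] * n_rows
    for i in range(n_rows):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc ^= x_words[col_idx[k]]  # GF(2) addition is bitwise XOR
        y[i] = acc
    return y
```

For example, the 2x3 matrix [[1,1,0],[0,1,1]] applied to packed words [1, 2, 3] yields [1^2, 2^3] = [3, 1]. A GPU implementation parallelizes over rows (or row fragments) and must cope with the irregular row lengths of NFS matrices, which is what motivates hybrid storage formats.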
Pages: 413-424
Page count: 12
Cited References
16 in total
[1] Aoki K., 2007, ASIACRYPT
[2] Aoki K., 2007, LECT NOTES COMPUT SC, V4752, P58
[3] Bell N., 2008, Efficient sparse matrix-vector multiplication on CUDA
[4] Bell N., 2010, Cusp: Generic parallel algorithms for sparse matrix and graph computations
[5] Bonenberger D., 2010, FACTORIZATION RSA 17
[6] Boyer B., 2010, CORR
[7] Choi J.W., Singh A., Vuduc R.W., 2010, Model-driven Autotuning of Sparse Matrix-Vector Multiply on GPUs, ACM SIGPLAN NOTICES, V45(05), P115-125
[8] Coppersmith D., 1994, MATH COMPUTATION, V62
[9] Hwang W., 2006, LECT NOTES COMPUT SC, V4297, P375
[10] Kleinjung T., 2010, 11 ACM IEEE INT C GR