Parallel Sparse Matrix-Vector and Matrix-Transpose-Vector Multiplication Using Compressed Sparse Blocks

被引:0
作者
Buluc, Aydin [1 ]
Fineman, Jeremy T.
Frigo, Matteo
Gilbert, John R. [1 ]
Leiserson, Charles E.
机构
[1] Univ Calif Santa Barbara, Dept Comp Sci, Santa Barbara, CA 93106 USA
来源
SPAA'09: PROCEEDINGS OF THE TWENTY-FIRST ANNUAL SYMPOSIUM ON PARALLELISM IN ALGORITHMS AND ARCHITECTURES | 2009年
基金
美国国家科学基金会;
关键词
Compressed sparse blocks; compressed sparse columns; compressed sparse rows; matrix transpose; matrix-vector multiplication; multithreaded algorithm; parallelism; span; sparse matrix; storage format; work;
D O I
暂无
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
This paper introduces a storage format for sparse matrices, called compressed sparse blocks (CSB), which allows both Ax and A(x)(inverted perpendicular) to be computed efficiently in parallel, where A is an n x n sparse matrix with nnz >= n nonzeros and x is a dense n-vector. Our algorithms use Theta(nnz) work (serial running time) and Theta(root nlgn) span (critical-path length), yielding a parallelism of Theta(nnz/root nlgn), which is amply high for virtually any large matrix. The storage requirement for CSB is esssentially the same as that for the more-standard compressed-sparse-rows (CSR) format, for which computing Ax in parallel is easy but A(x)(inverted perpendicular) is difficult. Benchmark results indicate that on one processor, the CSB algorithms for Ax and A(x)(inverted perpendicular) run just as fast as the CSR algorithm for Ax, but the CSB algorithms also scale up linearly with processors until limited by off-chip memory bandwidth.
引用
收藏
页码:233 / 244
页数:12
相关论文
共 40 条
[1]  
Adams M.D., 2006, MSPC 06, P41
[2]  
Bader D., HPCS SCALABLE SYNTHE
[3]  
BLELLOCH GE, 1996, CACM, V39
[4]  
Blumofe R. D., 1999, Journal of the ACM, V46, P720, DOI [10.1109/SFCS.1994.365680, 10.1145/324133.324234]
[5]  
Blumofe R. D., 1995, SIGPLAN Notices, V30, P207, DOI 10.1145/209937.209958
[6]  
Buluç A, 2008, 2008 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-8, P1876
[7]  
CATALYUREK UV, 2001, INT PAR DISTR PROC S, P118
[8]  
Cho J., 2006, ACM Transactions on Internet Technology, V6, P153, DOI 10.1145/1149121.1149124
[9]  
*CILK ARTS INC, 2009, CILK PROGR GUID
[10]  
Cormen T.H., 2009, INTRO ALGORITHMS, P651