Parallel and Cache-Efficient In-Place Matrix Storage Format Conversion

Cited by: 23
Authors
Gustavson, Fred
Karlsson, Lars [1, 2]
Kagstrom, Bo [1, 2]
Affiliations
[1] Umea Univ, Dept Comp Sci, SE-90187 Umea, Sweden
[2] Umea Univ, HPC2N, SE-90187 Umea, Sweden
Source
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE | 2012 / Vol. 38 / Issue 03
Funding
Swedish Research Council;
Keywords
Algorithms; Performance; Theory; Blocked matrix data layout; in-place matrix transposition; parallel and cache-efficient algorithms; ALGORITHM; TRANSPOSITION;
DOI
10.1145/2168773.2168775
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Techniques and algorithms for efficient in-place conversion to and from standard and blocked matrix storage formats are described. Such functionality is required by numerical libraries that use different data layouts internally. Parallel algorithms and a software package for in-place matrix storage format conversion based on in-place matrix transposition are presented and evaluated. A new algorithm for in-place transposition which efficiently determines the structure of the transposition permutation a priori is one of the key ingredients. It enables effective load balancing in a parallel environment.
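As a rough illustration of what in-place transposition involves, the sketch below uses the classical cycle-following approach on an m-by-n matrix stored row-major in a flat list. It is not the algorithm of the paper: the abstract states that the new algorithm determines the structure of the transposition permutation a priori, whereas this sketch finds cycle leaders by walking each cycle, a simpler substitute that costs extra time. The function name `transpose_in_place` and its parameters are assumptions made for this example.

```python
def transpose_in_place(a, m, n):
    """Transpose an m-by-n row-major matrix held in the flat list `a`, in place.

    Illustrative cycle-following sketch (hypothetical helper, not the
    paper's method). Uses O(1) extra storage.
    """
    size = m * n
    if len(a) != size:
        raise ValueError("len(a) must equal m * n")
    last = size - 1
    # The element at index k moves to index (k * m) mod (m*n - 1);
    # indices 0 and m*n - 1 are fixed points, so they are skipped.
    for start in range(1, last):
        # Walk the cycle until it returns to `start` or drops below it.
        k = (start * m) % last
        while k > start:
            k = (k * m) % last
        if k < start:
            continue  # cycle already handled from a smaller leader
        # `start` is the smallest index in its cycle: rotate the cycle.
        value = a[start]
        k = (start * m) % last
        while k != start:
            a[k], value = value, a[k]
            k = (k * m) % last
        a[start] = value


matrix = [1, 2, 3,
          4, 5, 6]          # 2-by-3, row-major
transpose_in_place(matrix, 2, 3)
print(matrix)               # [1, 4, 2, 5, 3, 6], i.e. the 3-by-2 transpose
```

Because the work per cycle varies with the cycle length, knowing the cycle structure in advance, as the abstract describes, is what would allow cycles to be distributed evenly across threads; the leader search above does not provide that.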
Pages: 32