Static LU decomposition on heterogeneous platforms

Cited by: 9
Authors
Beaumont, O [1]
Legrand, A [1]
Rastello, F [1]
Robert, Y [1]
Affiliations
[1] Ecole Normale Super Lyon, ENS, INRIA, CNRS, UMR 5668, LIP, F-69364 Lyon 07, France
DOI
10.1177/109434200101500308
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, the authors deal with algorithmic issues on heterogeneous platforms. They concentrate on dense linear algebra kernels, such as matrix multiplication or LU decomposition. Block-cyclic distribution techniques used in ScaLAPACK are no longer sufficient to balance the load among processors running at different speeds. The main result of this paper is to provide a static data distribution scheme that leads to an asymptotically perfect load balancing for LU decomposition, thereby providing solid foundations toward the design of a cluster-oriented version of ScaLAPACK.
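The load-balancing claim in the abstract can be illustrated with a small sketch. It is not the distribution scheme from the paper: it assumes a one-dimensional layout of equal-cost blocks, hypothetical integer relative speeds, and a simple greedy rule, and it ignores the fact that the active matrix shrinks during LU decomposition. The function names are invented for this illustration.

```python
from fractions import Fraction


def cyclic_counts(num_blocks, speeds):
    """Blocks per processor under a plain round-robin (cyclic) distribution."""
    counts = [0] * len(speeds)
    for block in range(num_blocks):
        counts[block % len(speeds)] += 1
    return counts


def greedy_counts(num_blocks, speeds):
    """Blocks per processor when each block goes to the processor that
    would finish it earliest (an assumed heuristic, not the paper's scheme)."""
    counts = [0] * len(speeds)
    for _ in range(num_blocks):
        # Finish time if processor k took one more block: (count + 1) / speed.
        best = min(
            range(len(speeds)),
            key=lambda k: Fraction(counts[k] + 1, speeds[k]),
        )
        counts[best] += 1
    return counts


def makespan(counts, speeds):
    """Time at which the slowest-finishing processor is done."""
    return max(Fraction(c, s) for c, s in zip(counts, speeds))


# Hypothetical platform: three processors with relative speeds 1, 2 and 3.
speeds = [1, 2, 3]
cyclic = cyclic_counts(12, speeds)
greedy = greedy_counts(12, speeds)
```

With these assumed speeds, the cyclic layout gives every processor the same number of blocks, so the slowest processor sets the completion time, whereas the speed-aware allocation hands out blocks roughly in proportion to speed.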
Pages: 310-323
Page count: 14