QR factorization for shared memory and message passing

Cited by: 1

Authors:

Dunn, IN [1]

Meyer, GGL [1]

Affiliation:

[1] Johns Hopkins Univ, Dept Elect & Comp Engn, Baltimore, MD 21218 USA

Source:

PARALLEL COMPUTING | 2002, Vol. 28, Issue 11

Keywords:

QR factorization; message passing systems; performance evaluation; shared memory systems; Givens rotations

DOI:

10.1016/S0167-8191(02)00162-X

Chinese Library Classification (CLC) number:

TP301 [Theory and methods]

Discipline classification code:

081202

Abstract:

This paper describes the design, implementation, and performance of three new parallel QR factorization algorithms: shared memory, synchronous message passing, and asynchronous message passing. In contrast to existing parallel algorithms, the multiprocessor partitioning strategy is not governed by an underlying static data distribution scheme. Rather, a dynamic distribution strategy is employed to improve scalability on small problems. Experiments conducted on a 128-processor SGI Origin 2000 and a 64-processor HP SPP-2000 show that the new algorithms have a lower execution time than the available tuned parallel routines installed on the machines, including a version of ScaLAPACK's distributed QR factorization algorithm PDGEQRF. © 2002 Elsevier Science B.V. All rights reserved.


Pages: 1507-1530

Page count: 24

