Parallel LU factorization of sparse matrices on FPGA-based configurable computing engines

被引：22

作者：

Wang, XF ^{[1
]}

Ziavras, SG ^{[1
]}

机构：

[1] New Jersey Inst Technol, Dept Elect & Comp Engn, Newark, NJ 07102 USA

来源：

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2004年 / 16卷 / 04期

关键词：

FPGA; LU factorization; matrix inversion; parallel processing; hardware design; SOPC/SOC;

D O I：

10.1002/cpe.748

中图分类号：

TP31 [计算机软件];

学科分类号：

081202 ; 0835 ;

摘要：

Configurable computing, where hardware resources are configured appropriately to match specific hardware designs, has recently demonstrated its ability to significantly improve performance for a wide range of computation-intensive applications. With steady advances in silicon technology, as predicted by Moore's Law, Field-Programmable Gate Array (FPGA) technologies have enabled the implementation of System-on-a-Programmable-Chip (SOPC or SOC) computing platforms, which, in turn, have given a significant boost to the field of configurable computing. It is possible to implement various specialized parallel machines in a single silicon chip. In this paper, we describe our design and implementation of a parallel machine on an SOPC development board, using multiple instances of a soft IP configurable processor; we use this machine for LU factorization. LU factorization is widely used in engineering and science to solve efficiently large systems of linear equations. Our implementation facilitates the efficient solution of linear equations at a cost much lower than that of supercomputers and networks of workstations. The intricacies of our FPGA-based design are presented along with tradeoff choices made for the purpose of illustration. Performance results prove the viability of our approach. Copyright (C) 2004 John Wiley Sons, Ltd.

引用

页码：319 / 343

页数：25

共 38 条

[1]

BELL G, 2001, MSRTR200176

[2]

BUELL D., 1996, SPLASH 2 FPGAS CUSTO

[3] DIAKOPTIC AND GENERALIZED HYBRID ANALYSIS [J].

CHUA, LO ;

CHEN, LK .

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS, 1976, 23 (12) :694-705

[4]

CLOUTIER J, 1996, 5 INT C MICR NEUR NE, P330

[5] Reconfigurable computing: A survey of systems and software [J].

Compton, K ;

Hauck, S .

ACM COMPUTING SURVEYS, 2002, 34 (02) :171-210

[6]

Cormen T. H., 1994, INTRO ALGORITHMS

[7] An asynchronous parallel supernodal algorithm for sparse Gaussian elimination [J].

Demmel, JW ;

Gilbert, JR ;

Li, XYS .

SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 1999, 20 (04) :915-952

[8]

Duff Iain S., 1990, DIRECT METHODS SPARS

[9]

FOSTER I, UNPUB PHYSL GRID OPE

[10]

Frigo J., 2001, P 2001 ACM SIGDA 9 I, P134, DOI DOI 10.1145/360276.360326

← 1 2 3 4 →