FPGA-Based High-Performance and Scalable Block LU Decomposition Architecture

Cited by: 42
Authors
Jaiswal, Manish Kumar [1]
Chandrachoodan, Nitin [2]
Affiliations
[1] ICFAI Univ, Dehra Dun, India
[2] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
Keywords
LU decomposition; block LU; FPGA; hardware acceleration; floating point arithmetics; single/double precision; scaling; ATLAS; Intel-MKL; GPU; LINEAR ALGEBRA; STABILITY
DOI
10.1109/TC.2011.24
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Decomposition of a matrix into lower and upper triangular matrices (LU decomposition) is a vital part of many scientific and engineering applications, and the block LU decomposition algorithm is an approach well suited to parallel hardware implementation. This paper presents an approach to speed up implementation of the block LU decomposition algorithm using FPGA hardware. Unlike most previous approaches reported in the literature, the approach does not assume the matrix can be stored entirely on chip. The memory accesses are studied for various FPGA configurations, and a schedule of operations for scaling well is shown. The design has been synthesized for FPGA targets and can be easily retargeted. The design outperforms previous hardware implementations, as well as tuned software implementations including the ATLAS and MKL libraries on workstations.
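As a point of reference for the block LU algorithm named in the abstract, the following is a minimal software sketch in pure Python. It is not the paper's FPGA architecture or its operation schedule. The names `block_lu` and `matmul`, the block-size parameter `bs`, and the omission of pivoting are all assumptions made for illustration; the sketch only shows the four per-block steps (diagonal-block factorization, row-panel solve, column-panel solve, trailing-matrix update) that make the algorithm attractive for parallel hardware.

```python
def block_lu(a, bs):
    """Right-looking block LU without pivoting (illustrative sketch).

    a  : square matrix as a list of lists of floats
    bs : block size (hypothetical parameter name)
    Returns (L, U) with L unit lower triangular and U upper triangular.
    """
    n = len(a)
    m = [row[:] for row in a]  # work on a copy; L and U are packed into m
    for k in range(0, n, bs):
        e = min(k + bs, n)  # end (exclusive) of the current diagonal block
        # Step 1: unblocked LU of the diagonal block m[k:e][k:e]
        for j in range(k, e):
            for i in range(j + 1, e):
                m[i][j] /= m[j][j]
                for c in range(j + 1, e):
                    m[i][c] -= m[i][j] * m[j][c]
        # Step 2: row panel, U12 = L11^{-1} A12 (forward substitution)
        for c in range(e, n):
            for i in range(k, e):
                for p in range(k, i):
                    m[i][c] -= m[i][p] * m[p][c]
        # Step 3: column panel, L21 = A21 U11^{-1}
        for r in range(e, n):
            for j in range(k, e):
                for p in range(k, j):
                    m[r][j] -= m[r][p] * m[p][j]
                m[r][j] /= m[j][j]
        # Step 4: trailing update, A22 -= L21 * U12 (the matrix-multiply bulk)
        for r in range(e, n):
            for c in range(e, n):
                for p in range(k, e):
                    m[r][c] -= m[r][p] * m[p][c]
    # Unpack the combined storage into separate L and U factors
    L = [[m[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[m[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U


def matmul(x, y):
    """Plain dense matrix product, used only to check L * U against A."""
    return [[sum(x[i][p] * y[p][j] for p in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]
```

Because the sketch does not pivot, it is only safe for matrices whose leading pivots are nonzero, for example strictly diagonally dominant ones. Step 4 is a dense matrix multiply over the trailing submatrix, which is where most of the arithmetic lies and where a parallel implementation would concentrate its compute resources.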
Pages: 60-72
Number of pages: 13