FPGA-Based High-Performance and Scalable Block LU Decomposition Architecture

Cited by: 42
Authors
Jaiswal, Manish Kumar [1]
Chandrachoodan, Nitin [2]
Affiliations
[1] ICFAI Univ, Dehra Dun, India
[2] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
Keywords
LU decomposition; block LU; FPGA; hardware acceleration; floating-point arithmetic; single/double precision; scaling; ATLAS; Intel-MKL; GPU; linear algebra; stability
DOI
10.1109/TC.2011.24
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Decomposition of a matrix into lower and upper triangular matrices (LU decomposition) is a vital part of many scientific and engineering applications, and the block LU decomposition algorithm is an approach well suited to parallel hardware implementation. This paper presents an approach to speeding up the block LU decomposition algorithm using FPGA hardware. Unlike most previous approaches reported in the literature, the approach does not assume that the matrix can be stored entirely on chip. The memory accesses are studied for various FPGA configurations, and a schedule of operations that scales well is presented. The design has been synthesized for FPGA targets and can be easily retargeted. The design outperforms previous hardware implementations, as well as tuned software implementations, including the ATLAS and MKL libraries on workstations.
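For readers unfamiliar with the algorithm named in the abstract, the following is a minimal software sketch of a right-looking block LU decomposition. It is not the FPGA architecture or the operation schedule described in the paper. It assumes a square matrix that needs no pivoting (for example, a diagonally dominant one), and the names `block_lu`, `split_lu`, and `matmul` are hypothetical helpers introduced only for this illustration.

```python
def block_lu(a, b):
    """In-place right-looking block LU without pivoting (illustrative only).

    a: square matrix as a list of lists of floats; b: block size.
    On return, a holds U on and above the diagonal and the strictly
    lower part of the unit-lower-triangular L below the diagonal.
    """
    n = len(a)
    for k in range(0, n, b):
        e = min(k + b, n)  # end of the current diagonal block
        # Step 1: unblocked LU of the diagonal block A11 = L11 * U11.
        for j in range(k, e):
            for i in range(j + 1, e):
                a[i][j] /= a[j][j]
                for c in range(j + 1, e):
                    a[i][c] -= a[i][j] * a[j][c]
        # Step 2: U12 = L11^{-1} * A12 (forward substitution, unit diagonal).
        for j in range(k, e):
            for i in range(j + 1, e):
                for c in range(e, n):
                    a[i][c] -= a[i][j] * a[j][c]
        # Step 3: L21 = A21 * U11^{-1} (row-wise triangular solve).
        for j in range(k, e):
            for i in range(e, n):
                a[i][j] /= a[j][j]
                for c in range(j + 1, e):
                    a[i][c] -= a[i][j] * a[j][c]
        # Step 4: trailing update A22 -= L21 * U12 (the matrix-multiply part).
        for i in range(e, n):
            for j in range(k, e):
                lij = a[i][j]
                for c in range(e, n):
                    a[i][c] -= lij * a[j][c]
    return a


def split_lu(a):
    """Unpack the combined in-place result into separate L and U matrices."""
    n = len(a)
    lower = [[a[i][j] if j < i else (1.0 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    upper = [[a[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return lower, upper


def matmul(x, y):
    """Plain dense matrix product, used here only to check L * U against A."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


# Usage: factor a small diagonally dominant matrix with 2x2 blocks.
original = [[10.0, 1.0, 2.0, 0.0, 1.0],
            [1.0, 12.0, 0.0, 3.0, 1.0],
            [2.0, 0.0, 9.0, 1.0, 1.0],
            [0.0, 3.0, 1.0, 11.0, 2.0],
            [1.0, 1.0, 1.0, 2.0, 8.0]]
lower, upper = split_lu(block_lu([row[:] for row in original], 2))
product = matmul(lower, upper)
max_error = max(abs(product[i][j] - original[i][j])
                for i in range(5) for j in range(5))
print("max reconstruction error:", max_error)
```

In this formulation, Step 4 is a matrix multiplication over the trailing submatrix and accounts for most of the arithmetic as the matrix grows, which is one reason blocked variants are generally considered a good match for parallel hardware. Production libraries additionally apply pivoting for numerical stability, which this sketch omits.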
Pages: 60-72
Number of pages: 13