FPGA-Based High-Performance and Scalable Block LU Decomposition Architecture

Cited by: 42
Authors
Jaiswal, Manish Kumar [1]
Chandrachoodan, Nitin [2]
Affiliations
[1] ICFAI Univ, Dehra Dun, India
[2] Indian Inst Technol, Dept Elect Engn, Madras 600036, Tamil Nadu, India
Keywords
LU decomposition; block LU; FPGA; hardware acceleration; floating point arithmetics; single/double precision; scaling; ATLAS; Intel-MKL; GPU; LINEAR ALGEBRA; STABILITY
DOI
10.1109/TC.2011.24
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Decomposition of a matrix into lower and upper triangular matrices (LU decomposition) is a vital part of many scientific and engineering applications, and the block LU decomposition algorithm is an approach well suited to parallel hardware implementation. This paper presents an approach to speed up implementation of the block LU decomposition algorithm using FPGA hardware. Unlike most previous approaches reported in the literature, the approach does not assume the matrix can be stored entirely on chip. The memory accesses are studied for various FPGA configurations, and a schedule of operations for scaling well is shown. The design has been synthesized for FPGA targets and can be easily retargeted. The design outperforms previous hardware implementations, as well as tuned software implementations including the ATLAS and MKL libraries on workstations.
Pages: 60-72
Page count: 13