A Practical Measure of FPGA Floating Point Acceleration for High Performance Computing

Authors
Cappello, John D. [1 ]
Strenski, Dave [1 ]
Affiliations
[1] Optimal Design Inc, Sewell, NJ USA
Source
PROCEEDINGS OF THE 2013 IEEE 24TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP 13) | 2013
Keywords
FPGA; matrix multiplication; high performance computing; floating point arithmetic; multiply-accumulate; systolic array; hardware acceleration; GFLOPS; Xilinx; Virtex-7; DSP48; heavily-pipelined accumulators;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
A key enabler for Field Programmable Gate Arrays (FPGAs) in High Performance Computing (HPC) has been the addition of hard arithmetic cores. These "slices of DSP" dedicated to accelerated number crunching allow FPGAs to deliver more computing muscle, especially for floating point algorithms. This paper compares how an FPGA's performance in a practical HPC application measures up to its theoretical capacity. The implementation of a floating point matrix multiplication algorithm based on a 12x12 MAC (Multiply-Accumulate) array targeting the Xilinx Virtex 7 XT family is described. Several design techniques were used to ensure uninterrupted systolic operation of the array throughout execution, including a novel approach to handling heavily pipelined accumulators, as well as a scheme for overcoming the inherent inefficiencies of DDR3 memory. The result is a sustained "practical" performance range of 144-180 GFLOPS, compared to the target device's "theoretical" range of 257-290 GFLOPS.
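The gap between the "practical" and "theoretical" figures quoted in the abstract can be expressed as a sustained-to-peak ratio. The following is a minimal sketch, not taken from the paper: the function name `efficiency` is hypothetical, and pairing the low end of one range with the low end of the other (and high with high) is an assumption made only for illustration.

```python
def efficiency(practical_gflops: float, theoretical_gflops: float) -> float:
    """Return the fraction of theoretical peak that is sustained in practice."""
    if theoretical_gflops <= 0:
        raise ValueError("theoretical peak must be positive")
    return practical_gflops / theoretical_gflops


# GFLOPS ranges as quoted in the abstract (low end, high end).
PRACTICAL = (144.0, 180.0)
THEORETICAL = (257.0, 290.0)

# Assumption: low end is compared with low end, high end with high end.
low = efficiency(PRACTICAL[0], THEORETICAL[0])
high = efficiency(PRACTICAL[1], THEORETICAL[1])

print(f"Sustained fraction of theoretical peak: {low:.1%} to {high:.1%}")
```

Under that pairing, the sustained performance works out to roughly 56% to 62% of the theoretical figure.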
Pages: 160-167
Number of pages: 8