A Practical Measure of FPGA Floating Point Acceleration for High Performance Computing

Cited by: 0
Authors
Cappello, John D. [1]
Strenski, Dave [1]
Affiliations
[1] Optimal Design Inc, Sewell, NJ USA
Source
PROCEEDINGS OF THE 2013 IEEE 24TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS (ASAP 13) | 2013
Keywords
FPGA; matrix multiplication; high performance computing; floating point arithmetic; multiply-accumulate; systolic array; hardware acceleration; GFLOPS; Xilinx; Virtex-7; DSP48; heavily-pipelined accumulators;
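DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract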
A key enabler for Field Programmable Gate Arrays (FPGAs) in High Performance Computing (HPC) has been the addition of hard arithmetic cores. These "slices of DSP" dedicated to accelerated number crunching allow FPGAs to deliver more computing muscle, especially for floating point algorithms. This paper compares how an FPGA's performance in a practical HPC application measures up to its theoretical capacity. The implementation of a floating point matrix multiplication algorithm based on a 12x12 MAC (Multiply-Accumulate) array targeting the Xilinx Virtex 7 XT family is described. Several design techniques were used to ensure uninterrupted systolic operation of the array throughout execution, including a novel approach to handling heavily pipelined accumulators, as well as a scheme for overcoming the inherent inefficiencies of DDR3 memory. The result is a sustained "practical" performance range of 144-180 GFLOPS, compared to the target device's "theoretical" range of 257-290 GFLOPS.
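The gap between the two GFLOPS ranges quoted in the abstract can be expressed as an efficiency ratio. The following is a minimal sketch of that arithmetic, not a calculation taken from the paper. The function names are hypothetical, and the abstract does not say which practical figure corresponds to which theoretical figure, so the end-to-end pairing in `paired_efficiency` is an assumption.

```python
# GFLOPS figures quoted in the abstract, as (low, high) ranges.
PRACTICAL = (144.0, 180.0)
THEORETICAL = (257.0, 290.0)


def efficiency_bounds(practical, theoretical):
    """Return the (lowest, highest) practical/theoretical ratio
    that the two ranges allow, without assuming any pairing."""
    p_lo, p_hi = practical
    t_lo, t_hi = theoretical
    return p_lo / t_hi, p_hi / t_lo


def paired_efficiency(practical, theoretical):
    """Assumption (not stated in the abstract): the low practical figure
    pairs with the low theoretical figure, and high pairs with high."""
    return practical[0] / theoretical[0], practical[1] / theoretical[1]


if __name__ == "__main__":
    lo, hi = efficiency_bounds(PRACTICAL, THEORETICAL)
    print(f"unpaired bounds: {lo:.1%} to {hi:.1%}")
    p_lo, p_hi = paired_efficiency(PRACTICAL, THEORETICAL)
    print(f"paired (assumed): {p_lo:.1%} and {p_hi:.1%}")
```

Under either reading, the sustained figure falls between roughly half and about 70% of the theoretical figure.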
Pages: 160-167
Number of pages: 8