Design Optimization for High-Performance Computing Using FPGA

Authors
Isik, Murat [1]
Inadagbo, Kayode [2]
Aktas, Hakan [3]
Affiliations
[1] Drexel Univ, Elect & Comp Engn Dept, Philadelphia, PA 19104 USA
[2] A&M Univ, Elect & Comp Engn Dept, Prairie View, TX USA
[3] Omer Halisdemir Univ, Comp Engn Dept, Nigde, Turkiye
Source
INFORMATION MANAGEMENT AND BIG DATA, SIMBIG 2023 | 2024 / Vol. 2142
Keywords
High-performance computing; Tensil AI; Design optimization; FPGA; Open-source inference accelerator;
DOI
10.1007/978-3-031-63616-5_11
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Reconfigurable architectures such as Field-Programmable Gate Arrays (FPGAs) have been used to accelerate computation in several domains because of their combination of flexibility, performance, and power efficiency. However, FPGAs have not been widely adopted for high-performance computing, primarily because of their programming complexity and the difficulty of optimizing performance. To gain insight into the use of FPGAs for high-performance computing, we optimize Tensil AI's open-source inference accelerator for maximum performance using ResNet20 trained on CIFAR. We show how improving the hardware design, using Xilinx Ultra RAM, and applying advanced compiler strategies can improve inference performance. We also demonstrate that running the CIFAR test data set shows very little accuracy drop when rounding down from the original 32-bit floating point. The heterogeneous computing model in our platform allows us to achieve a frame rate of 293.58 frames per second (FPS) and 90% accuracy on a ResNet20 trained using CIFAR. The experimental results show that the proposed accelerator achieves a throughput of 21.12 giga-operations per second (GOP/s) with an on-chip power consumption of 5.21 W at 100 MHz. Comparison with off-the-shelf devices and recent state-of-the-art implementations shows that the proposed accelerator has clear advantages in energy efficiency.
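The energy-efficiency claim can be made concrete with simple arithmetic on the three figures reported in the abstract (293.58 FPS, 21.12 GOP/s, 5.21 W). The sketch below is not taken from the paper: the function name `derived_metrics` and its dictionary keys are hypothetical, and it assumes that the reported throughput, frame rate, and on-chip power all describe the same sustained run.

```python
def derived_metrics(fps, throughput_gops, power_w):
    """Derive per-watt and per-frame figures from three reported numbers.

    Assumption (not stated in the abstract): all three inputs were
    measured during the same sustained inference run.
    """
    if fps <= 0 or power_w <= 0:
        raise ValueError("fps and power_w must be positive")
    return {
        # Energy efficiency: operations per second delivered per watt.
        "gops_per_watt": throughput_gops / power_w,
        # Energy spent on one frame, in millijoules (W / FPS = J per frame).
        "energy_per_frame_mj": power_w / fps * 1e3,
        # Work done per frame, in mega-operations (GOP/s / FPS = GOP per frame).
        "mop_per_frame": throughput_gops / fps * 1e3,
    }


# Figures as reported in the abstract.
reported = derived_metrics(fps=293.58, throughput_gops=21.12, power_w=5.21)
for name, value in reported.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the reported figures work out to roughly 4.05 GOP/s per watt, about 17.7 mJ per frame, and about 71.9 mega-operations per frame.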
Pages: 142-156
Page count: 15