Design Optimization for High-Performance Computing Using FPGA

Cited by: 0
Authors
Isik, Murat [1]
Inadagbo, Kayode [2]
Aktas, Hakan [3]
Affiliations
[1] Drexel Univ, Elect & Comp Engn Dept, Philadelphia, PA 19104 USA
[2] Prairie View A&M Univ, Elect & Comp Engn Dept, Prairie View, TX USA
[3] Omer Halisdemir Univ, Comp Engn Dept, Nigde, Turkiye
Source
INFORMATION MANAGEMENT AND BIG DATA, SIMBIG 2023 | 2024 / Vol. 2142
Keywords
High-performance computing; Tensil AI; Design optimization; FPGA; Open-source inference accelerator
DOI
10.1007/978-3-031-63616-5_11
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reconfigurable architectures such as Field Programmable Gate Arrays (FPGAs) have been used to accelerate computations in several domains because of their unique combination of flexibility, performance, and power efficiency. However, FPGAs have not been widely used for high-performance computing, primarily because they are complex to program and difficult to optimize for performance. To gain insight into the use of FPGAs for high-performance computing, we optimize Tensil AI's open-source inference accelerator for maximum performance using ResNet20 trained on CIFAR. We show that improving the hardware design, using Xilinx Ultra RAM, and applying advanced compiler strategies improve inference performance. We also demonstrate that accuracy on the CIFAR test data set drops very little when rounding down from the original 32-bit floating point. The heterogeneous computing model in our platform allows us to achieve a frame rate of 293.58 frames per second (FPS) and 90% accuracy on a ResNet20 trained using CIFAR. The experimental results show that the proposed accelerator achieves a throughput of 21.12 Giga-Operations Per Second (GOP/s) with 5.21 W of on-chip power consumption at 100 MHz. Comparisons with off-the-shelf devices and recent state-of-the-art implementations show that the proposed accelerator has clear advantages in energy efficiency.
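The energy-efficiency claim can be made concrete with a little arithmetic on the figures quoted in the abstract. The following is a minimal sketch, not part of the paper: the function name and dictionary keys are hypothetical, and the derived values assume that the frame rate, throughput, power, and clock figures all describe the same ResNet20 workload.

```python
# Derived metrics from the numbers reported in the abstract.
# Assumption: all four inputs refer to the same run of the accelerator.

def derived_metrics(fps, gops, power_w, clock_mhz):
    """Return simple per-watt and per-frame quantities (hypothetical helper)."""
    return {
        # Throughput per watt, in GOP/s/W
        "gops_per_watt": gops / power_w,
        # Time spent on one frame, in milliseconds
        "latency_ms": 1000.0 / fps,
        # Energy spent on one frame, in millijoules
        "energy_per_frame_mj": 1000.0 * power_w / fps,
        # Clock cycles available per frame at the given clock
        "cycles_per_frame": clock_mhz * 1e6 / fps,
    }

metrics = derived_metrics(fps=293.58, gops=21.12, power_w=5.21, clock_mhz=100.0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the accelerator delivers roughly 4.05 GOP/s per watt and spends about 3.4 ms and 17.7 mJ on each frame.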
Pages: 142-156
Page count: 15