Design Optimization for High-Performance Computing Using FPGA

Cited by: 0
Authors
Isik, Murat [1]
Inadagbo, Kayode [2]
Aktas, Hakan [3]
Affiliations
[1] Drexel Univ, Elect & Comp Engn Dept, Philadelphia, PA 19104 USA
[2] A&M Univ, Elect & Comp Engn Dept, Prairie View, TX USA
[3] Omer Halisdemir Univ, Comp Engn Dept, Nigde, Turkiye
Source
INFORMATION MANAGEMENT AND BIG DATA, SIMBIG 2023 | 2024 / Vol. 2142
Keywords
High-performance computing; Tensil AI; Design optimization; FPGA; Open-source inference accelerator
DOI
10.1007/978-3-031-63616-5_11
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reconfigurable architectures such as Field Programmable Gate Arrays (FPGAs) have been used to accelerate computation in several domains because of their combination of flexibility, performance, and power efficiency. However, FPGAs have not been widely used for high-performance computing, primarily because of their programming complexity and the difficulty of optimizing their performance. To gain insight into the use of FPGAs for high-performance computing, we optimize Tensil AI's open-source inference accelerator for maximum performance using ResNet20 trained on CIFAR. We show how improving the hardware design, using Xilinx Ultra RAM, and applying advanced compiler strategies can improve inference performance. We also demonstrate that running the CIFAR test data set shows very little accuracy drop when rounding down from the original 32-bit floating point. The heterogeneous computing model in our platform allows us to achieve a frame rate of 293.58 frames per second (FPS) and 90% accuracy on a ResNet20 trained using CIFAR. The experimental results show that the proposed accelerator achieves a throughput of 21.12 Giga-Operations Per Second (GOP/s) with an on-chip power consumption of 5.21 W at 100 MHz. Comparisons with off-the-shelf devices and recent state-of-the-art implementations show that the proposed accelerator has clear advantages in energy efficiency.
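The energy-efficiency claim can be made concrete with a little arithmetic on the figures quoted in the abstract. The sketch below is illustrative only: the function name `derived_metrics` and the derived quantities are not taken from the paper, and the per-frame latency assumes frames are processed strictly one at a time (no overlap between frames), which the abstract does not state.

```python
# Figures quoted in the abstract.
THROUGHPUT_GOPS = 21.12   # throughput in giga-operations per second
POWER_W = 5.21            # on-chip power consumption in watts
FRAME_RATE_FPS = 293.58   # inference frame rate in frames per second


def derived_metrics(throughput_gops, power_w, fps):
    """Derive efficiency figures from throughput, power, and frame rate.

    Hypothetical helper: these values are computed here for illustration
    and are not reported in this form by the source.
    """
    return {
        # Energy efficiency: operations per second delivered per watt.
        "gops_per_watt": throughput_gops / power_w,
        # Per-frame latency, assuming no overlap between frames.
        "latency_ms": 1000.0 / fps,
        # Energy per frame in millijoules: power divided by frame rate.
        "energy_per_frame_mj": 1000.0 * power_w / fps,
    }


metrics = derived_metrics(THROUGHPUT_GOPS, POWER_W, FRAME_RATE_FPS)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the accelerator delivers roughly 4 GOP/s per watt and spends a little under 18 mJ per inferred frame.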
Pages: 142-156
Page count: 15