FPGA Accelerator for Homomorphic Encrypted Sparse Convolutional Neural Network Inference

Cited by: 0
Authors
Yang, Yang [1 ]
Kuppannagari, Sanmukh R. [1 ]
Kannan, Rajgopal [2 ]
Prasanna, Viktor K. [1 ]
Affiliations
[1] Univ Southern Calif, Dept Elect & Comp Engn, Los Angeles, CA 90089 USA
[2] US Army Res Lab, Adelphi, MD USA
Source
2022 IEEE 30TH INTERNATIONAL SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM 2022) | 2022
Funding
US National Science Foundation;
Keywords
FPGA acceleration; homomorphic encryption; sparse neural networks; parallel computing;
DOI
10.1109/FCCM53951.2022.9786115
CLC Number
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Homomorphic Encryption (HE) is a promising solution to the increasing privacy concerns in machine learning, but HE-based CNN inference remains impractically slow. Pruning can significantly reduce the compute and memory footprint of CNNs. However, homomorphically encrypted Sparse Convolutional Neural Networks (SCNNs) have vastly different compute and memory characteristics than unencrypted SCNNs, so simply extending the design principles of existing SCNN accelerators may offset the potential acceleration offered by sparsity. To realize fast execution, we propose an FPGA accelerator to speed up the computation of linear layers, the main computational bottleneck in HE SCNN batch inference. First, we analyze the memory requirements of the various linear layers in HE SCNN and discuss the unique challenges. Motivated by this analysis, we present a novel dataflow designed to optimize HE SCNN data reuse, coupled with an efficient scheduling policy that minimizes on-chip SRAM access conflicts. Leveraging the proposed dataflow and scheduling algorithm, we demonstrate the first end-to-end acceleration of HE SCNN batch inference targeting CPU-FPGA heterogeneous platforms. For a batch of 8K images, our design achieves up to 5.6x speedup in inference latency compared with a CPU-only solution for the widely studied 6-layer and 11-layer HE CNNs.
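The abstract states that linear layers dominate HE SCNN batch inference and that pruning shrinks the homomorphic workload. As a rough illustration only, and not the paper's dataflow or accelerator design, the Python sketch below uses a simple operation-count model under a common batch-packing assumption (one ciphertext packs the same activation across the batch, so each nonzero weight costs roughly one ciphertext-plaintext multiply plus one accumulation add); all function and variable names here are hypothetical.

# Toy cost model (illustrative, not the paper's method): in batch-packed HE
# inference, a dense linear layer y = W*x costs roughly one ciphertext-plaintext
# multiply per nonzero weight and one ciphertext add per accumulation, so
# pruning the weight matrix cuts homomorphic work roughly in proportion.
import random

def count_he_ops(weights):
    """Count ciphertext-plaintext multiplies and ciphertext adds needed to
    evaluate one linear layer on batch-packed ciphertexts."""
    mults = sum(1 for row in weights for w in row if w != 0.0)
    # one accumulation add per nonzero beyond the first in each output row
    adds = sum(max(0, sum(1 for w in row if w != 0.0) - 1) for row in weights)
    return mults, adds

def random_weights(out_dim, in_dim, density):
    """Random weight matrix with the given fraction of nonzero entries."""
    return [[random.gauss(0, 1) if random.random() < density else 0.0
             for _ in range(in_dim)]
            for _ in range(out_dim)]

if __name__ == "__main__":
    random.seed(0)
    out_dim, in_dim = 128, 512          # e.g., one fully connected layer
    for density in (1.0, 0.25, 0.1):    # dense vs. pruned variants
        W = random_weights(out_dim, in_dim, density)
        mults, adds = count_he_ops(W)
        print(f"density={density:4.2f}  ct-pt mults={mults:7d}  ct adds={adds:7d}")

Running the sketch shows the homomorphic operation count falling roughly linearly with weight density, which is the headroom that the proposed dataflow and scheduling policy aim to turn into actual latency reduction on the FPGA.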
Pages: 81-89
Number of pages: 9
Related Papers
50 records in total
  • [1] FPGA-Based Reconfigurable Convolutional Neural Network Accelerator Using Sparse and Convolutional Optimization
    Gowda, Kavitha Malali Vishveshwarappa
    Madhavan, Sowmya
    Rinaldi, Stefano
    Divakarachari, Parameshachari Bidare
    Atmakur, Anitha
    ELECTRONICS, 2022, 11 (10)
  • [2] An Efficient Convolutional Neural Network Accelerator on FPGA
    Si, Junye
    Jiang, Jianfei
    Wang, Qin
    Huang, Jia
    2018 14TH IEEE INTERNATIONAL CONFERENCE ON SOLID-STATE AND INTEGRATED CIRCUIT TECHNOLOGY (ICSICT), 2018, : 1392 - 1394
  • [3] Accelerator Design and Performance Modeling for Homomorphic Encrypted CNN Inference
    Ye, Tian
    Kannan, Rajgopal
    Prasanna, Viktor K.
    2020 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2020,
  • [4] An Efficient Accelerator Unit for Sparse Convolutional Neural Network
    Zhao, Yulin
    Wang, Donghui
    Wang, Leiou
    TENTH INTERNATIONAL CONFERENCE ON DIGITAL IMAGE PROCESSING (ICDIP 2018), 2018, 10806
  • [5] An Efficient Hardware Accelerator for Block Sparse Convolutional Neural Networks on FPGA
    Yin, Xiaodi
    Wu, Zhipeng
    Li, Dejian
    Shen, Chongfei
    Liu, Yu
    IEEE EMBEDDED SYSTEMS LETTERS, 2024, 16 (02) : 158 - 161
  • [6] Bandwidth Efficient Homomorphic Encrypted Matrix Vector Multiplication Accelerator on FPGA
    Yang, Yang
    Kuppannagari, Sanmukh R.
    Kannan, Rajgopal
    Prasanna, Viktor K.
    2022 21ST INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (ICFPT 2022), 2022, : 1 - 9
  • [7] SDCNN: An Efficient Sparse Deconvolutional Neural Network Accelerator on FPGA
    Chang, Jung-Woo
    Kang, Keon-Woo
    Kang, Suk-Ju
    2019 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE), 2019, : 968 - 971
  • [8] A convolutional neural network accelerator on FPGA for crystallography spot screening
    Jiang, Yuwei
    Feng, Yingqi
    Ren, Tao
    Zhu, Yongxin
    PROCEEDINGS OF THE 2024 IEEE 10TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING, HPSC 2024, 2024, : 66 - 70
  • [9] Latency-Aware Inference on Convolutional Neural Network Over Homomorphic Encryption
    Ishiyama, Takumi
    Suzuki, Takuya
    Yamana, Hayato
    INFORMATION INTEGRATION AND WEB INTELLIGENCE, IIWAS 2022, 2022, 13635 : 324 - 337
  • [10] An FPGA-based Accelerator Platform Implements for Convolutional Neural Network
    Meng, Xiao
    Yu, Lixin
    Qin, Zhiyong
    2019 THE 3RD INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPILATION, COMPUTING AND COMMUNICATIONS (HP3C 2019), 2019, : 25 - 28