The Design of Efficient Data Flow and Low-Complexity Architecture for a Highly Configurable CNN Accelerator

Authors
Hui-Wen Liu
Chung-An Shen
Affiliations
[1] National Taiwan University of Science and Technology, Department of Electronic and Computer Engineering
Keywords
Convolutional neural network (CNN); Depthwise separable convolution (DSC); MobileNet; Configurable; Architecture; High throughput; Low complexity
DOI
Not available
Abstract
This paper presents a highly configurable, low-complexity CNN accelerator based on the MobileNetV3 model. To the best of the authors' knowledge, it is the first CNN accelerator designed for MobileNetV3. An efficient processing flow and memory-access scheme are proposed that exploit the structural features of MobileNetV3 to greatly enhance throughput. The proposed processing flow also improves the utilization of hardware components, which reduces complexity. Based on this processing flow, a highly configurable architecture is presented that supports the various operation modes of MobileNetV3. The architecture is synthesized and laid out in TSMC 90 nm technology, and performance and area complexity are evaluated from post-layout estimates. For MobileNetV3-Large, the design achieves 197.7 FPS with a hardware complexity of 5392 KGEs. Compared with the state-of-the-art MobileNet-based accelerator, the proposed design improves FPS by 3.4× and reduces complexity by 18%.
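The keywords name depthwise separable convolution (DSC) as the structural feature of MobileNet that the accelerator targets. As a rough illustration of why this structure matters for hardware cost, the sketch below compares multiply-accumulate (MAC) counts for a standard convolution and a DSC layer. It is not taken from the paper: the function names and the layer shape are hypothetical, and stride 1 with an output of the same spatial size as the input is assumed.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution producing an h x w x c_out output."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape chosen for illustration only (not from the paper).
    h, w, c_in, c_out, k = 14, 14, 112, 160, 3
    std = standard_conv_macs(h, w, c_in, c_out, k)
    dsc = depthwise_separable_macs(h, w, c_in, c_out, k)
    print(f"standard: {std} MACs, depthwise separable: {dsc} MACs")
    print(f"ratio dsc/standard: {dsc / std:.4f}")
```

Under these assumptions the ratio of DSC to standard MACs equals 1/c_out + 1/k², so the saving grows with the number of output channels and the kernel size.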
Pages: 4759 - 4783
Page count: 24
Related papers
50 in total
  • [1] The Design of Efficient Data Flow and Low-Complexity Architecture for a Highly Configurable CNN Accelerator
    Liu, Hui-Wen
    Shen, Chung-An
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2023, 42 (08) : 4759 - 4783
  • [2] Low-Complexity Classification Technique and Hardware-Efficient Classify-Unit Architecture for CNN Accelerator
    Islam, Md Najrul
    Shrestha, Rahul
    Chowdhury, Shubhajit Roy
    PROCEEDINGS OF THE 37TH INTERNATIONAL CONFERENCE ON VLSI DESIGN, VLSID 2024 AND 23RD INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS, ES 2024, 2024, : 210 - 215
  • [3] A Power-Efficient Configurable Low-Complexity MIMO Detector
    Huang, Chien-Jen
    Yu, Chung-Wen
    Ma, Hsi-Pin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2009, 56 (02) : 485 - 496
  • [4] VLSI Architecture for Configurable and Low-Complexity Design of Hard-Decision Viterbi Decoding Algorithm
    Putra, Rachmad Vidya Wicaksana
    Adiono, Trio
    JOURNAL OF ICT RESEARCH AND APPLICATIONS, 2016, 10 (01) : 57 - 75
  • [5] Design and implementation of dual-mode configurable memory architecture for CNN accelerator
    SHAN Rui
    LI Xiaoshuo
    GAO Xu
    HUO Ziqing
    High Technology Letters, 2024, 30 (02) : 211 - 220
  • [6] Design and implementation of dual-mode configurable memory architecture for CNN accelerator
    Shan, Rui
    Li, Xiaoshuo
    Gao, Xu
    Huo, Ziqing
    High Technology Letters, 2024, 30 (02) : 211 - 220
  • [7] The VLSI Architecture and Implementation of a Low Complexity and Highly Efficient Configurable SVD Processor for MIMO Communication Systems
    Chen, Wei-Jhe
    Lai, Yu-An
    Shen, Chung-An
    CIRCUITS SYSTEMS AND SIGNAL PROCESSING, 2020, 39 (12) : 6231 - 6246
  • [8] The VLSI Architecture and Implementation of a Low Complexity and Highly Efficient Configurable SVD Processor for MIMO Communication Systems
    Wei-Jhe Chen
    Yu-An Lai
    Chung-An Shen
    Circuits, Systems, and Signal Processing, 2020, 39 : 6231 - 6246
  • [9] Algorithm and architecture design for a low-complexity adaptive equalizer
    Chen, CN
    Chen, KH
    Chiueh, TD
    PROCEEDINGS OF THE 2003 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, VOL II: COMMUNICATIONS-MULTIMEDIA SYSTEMS & APPLICATIONS, 2003, : 304 - 307
  • [10] Highly Efficient Architecture of NewHope-NIST on FPGA Using Low-Complexity NTT/INTT
    Zhang N.
    Yang B.
    Chen C.
    Yin S.
    Wei S.
    Liu L.
    IACR Transactions on Cryptographic Hardware and Embedded Systems, 2020, 2020 (02) : 49 - 72