Low-Complexity Classification Technique and Hardware-Efficient Classify-Unit Architecture for CNN Accelerator

Cited by: 2
Authors
Islam, Md Najrul [1]
Shrestha, Rahul [1]
Chowdhury, Shubhajit Roy [1]
Affiliations
[1] Indian Inst Technol Mandi, Sch Comp & Elect Engn, Mandi, Himachal Prades, India
Keywords
Convolutional neural network (CNN); very large scale integration (VLSI); digital VLSI-architecture design; field programmable gate array (FPGA); fully depleted silicon on insulator (FDSOI); application specific integrated circuit (ASIC)
DOI
10.1109/VLSID60093.2024.00041
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a simplified classification technique that reduces the complexity of softmax-based classification in a convolutional neural network (CNN) inference engine/accelerator. It allows the CNN accelerator to classify the object directly from the activations of the fully connected (FC) layer, which avoids the complex exponential and division operations. Corresponding to this technique, the work also presents a hardware-efficient VLSI architecture of the classify unit for a CNN accelerator. The proposed classify-unit architecture has been ASIC synthesized and post-layout simulated in the 28 nm FDSOI technology node. The design delivers a peak throughput of 2.5 GIPS with a hardware efficiency of 5.05 × 10^3 GIPS/mW/mm^2. Comparison with relevant reported works indicates that the proposed classify unit occupies 24.1× less area and achieves 12.5× better hardware efficiency than state-of-the-art implementations. Finally, the complete CNN accelerator integrated with the proposed classify unit has been functionally validated on a Zynq UltraScale+ ZCU102 FPGA board in a real-world scenario, using the MobileNet-V2 CNN model.
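The abstract does not spell out the technique, so the following is only a minimal sketch of one plausible reading: because softmax is strictly increasing in each input, the class with the largest FC activation is also the class with the largest softmax probability, so a comparator chain can pick the label without any exponential or division. The function names and the sample activations are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Reference softmax; needs one exponential per class and one division each."""
    peak = max(logits)  # subtract the peak for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_softmax(logits):
    """Baseline: compute probabilities, then return the index of the largest one."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def classify_direct(logits):
    """Assumed simplification: compare raw FC activations only.

    Uses nothing but comparisons, mirroring a running-maximum comparator.
    """
    best_index = 0
    best_value = logits[0]
    for i in range(1, len(logits)):
        if logits[i] > best_value:
            best_value = logits[i]
            best_index = i
    return best_index


# Hypothetical FC-layer activations for a five-class problem.
fc_activations = [-1.5, 0.25, 3.0, 2.75, -0.5]
predicted_class = classify_direct(fc_activations)
```

Both functions select the same class for these inputs; the direct version only ever compares values, which is the kind of saving the abstract attributes to skipping the exponential and division steps.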
Pages: 210 - 215
Number of pages: 6
Related papers
50 in total
  • [21] Low-Complexity Wireless Technique Classification With Multifeature Fusion Broad Learning Network
    Peng, Yang
    Zhang, Yibin
    Huang, Hao
    Wang, Yu
    Liu, Pengfei
    Lin, Yun
    Gui, Guan
    IEEE INTERNET OF THINGS JOURNAL, 2024, 11 (21) : 34434 - 34442
  • [22] Efficient and low-complexity hardware architecture of Gaussian normal basis multiplication over GF(2^m) for elliptic curve cryptosystems
    Rashidi, Bahram
    Sayedi, Sayed Masoud
    Farashahi, Reza Rezaeian
    IET CIRCUITS DEVICES & SYSTEMS, 2017, 11 (02) : 103 - 112
  • [23] A Low-Complexity Turbo Decoder Architecture for Energy-Efficient Wireless Sensor Networks
    Li, Liang
    Maunder, Robert G.
    Al-Hashimi, Bashir M.
    Hanzo, Lajos
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2013, 21 (01) : 14 - 22
  • [24] Low Complexity, Hardware-Efficient Neighbor-Guided SGM Optical Flow for Low-Power Mobile Vision Applications
    Li, Ziyun
    Xiang, Jiang
    Gong, Luyao
    Blaauw, David
    Chakrabarti, Chaitali
    Kim, Hun Seok
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (07) : 2191 - 2204
  • [25] FPGA-based low-complexity high-throughput real-time hardware accelerator for robust watermarking
    Ge, Hangqi
    Sha, Jin
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2019, 16 (04) : 813 - 820
  • [27] Fast and Efficient Transcoding Based on Low-Complexity Background Modeling and Adaptive Block Classification
    Zhang, Xianguo
    Huang, Tiejun
    Tian, Yonghong
    Geng, Mingchao
    Ma, Siwei
    Gao, Wen
    IEEE TRANSACTIONS ON MULTIMEDIA, 2013, 15 (08) : 1769 - 1785
  • [28] ESRU: Extremely Low-Bit and Hardware-Efficient Stochastic Rounding Unit Design for Low-Bit DNN Training
    Chang, Sung-En
    Yuan, Geng
    Lu, Alec
    Sun, Mengshu
    Li, Yanyu
    Ma, Xiaolong
    Li, Zhengang
    Xie, Yanyue
    Qin, Minghai
    Lin, Xue
    Fang, Zhenman
    Wang, Yanzhi
    2023 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION, DATE, 2023,
  • [29] Power efficient MPEG-4 decoder architecture featuring low-complexity error resilience
    Byun, HI
    Jeon, MY
    Seo, JY
    Lee, KW
    Lee, SH
    Kang, BH
    2002 IEEE ASIA-PACIFIC CONFERENCE ON ASIC PROCEEDINGS, 2002, : 225 - 228
  • [30] Low-Complexity Hardware Architecture of Traffic Sign Recognition with IHSL color space for Advanced Driver Assistance Systems
    Lee, Sang-Seol
    Lee, Eunchong
    Hwang, Youngbae
    Jang, Sung-Joon
    2016 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS-ASIA (ICCE-ASIA), 2016,