Low-Complexity Classification Technique and Hardware-Efficient Classify-Unit Architecture for CNN Accelerator

Cited by: 2
Authors
Islam, Md Najrul [1 ]
Shrestha, Rahul [1 ]
Chowdhury, Shubhajit Roy [1 ]
Affiliations
[1] Indian Inst Technol Mandi, Sch Comp & Elect Engn, Mandi, Himachal Pradesh, India
Keywords
Convolutional neural network (CNN); very large scale integration (VLSI); digital VLSI-architecture design; field programmable gate array (FPGA); fully depleted silicon on insulator (FDSOI); application specific integrated circuit (ASIC)
DOI
10.1109/VLSID60093.2024.00041
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
This paper proposes a simplified classification technique that reduces the complexity of softmax-based classification in a convolutional neural network (CNN) inference engine/accelerator. It allows the CNN accelerator to classify the object directly from the activations of the fully connected (FC) layer, which avoids the complex exponential and division operations. Corresponding to the suggested technique, this work also presents a hardware-efficient VLSI architecture of a classify unit for the CNN accelerator. Furthermore, the proposed classify-unit architecture has been ASIC synthesized and post-layout simulated in the 28 nm FDSOI technology node. As a result, our design delivers a peak throughput of 2.5 GIPS with a hardware efficiency of 5.05 × 10^3 GIPS/mW/mm^2. Comparison of these results with the relevant reported works indicates that the proposed classify unit achieves 24.1× smaller area and 12.5× better hardware efficiency than the state-of-the-art implementations. Finally, the complete CNN accelerator integrated with the proposed classify unit has been functionally validated on a Zynq UltraScale+ ZCU102 FPGA board in a real-world scenario, using the MobileNet-V2 CNN model.
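The abstract does not spell out the technique itself. One plausible reading, offered here only as an assumption, is that the predicted class is taken as the index of the largest FC-layer activation: softmax is strictly increasing in each input, so it preserves the ordering of the activations and the winning class is unchanged. The minimal sketch below compares the two paths; the function names and the sample activations are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits):
    """Reference softmax: needs one exponential per class and a division."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_softmax(logits):
    """Conventional path: compute probabilities, then pick the largest."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def classify_direct(logits):
    """Assumed simplified path: a comparator chain over raw FC activations.

    No exponential or division is needed, only comparisons.
    """
    best_index = 0
    best_value = logits[0]
    for i in range(1, len(logits)):
        if logits[i] > best_value:
            best_value = logits[i]
            best_index = i
    return best_index


# Hypothetical FC-layer activations for a five-class problem.
fc_activations = [-1.5, 0.25, 3.0, 2.75, -0.5]
predicted_direct = classify_direct(fc_activations)
predicted_softmax = classify_with_softmax(fc_activations)
```

Under this assumption both paths select the same class, which is why the exponential and division hardware could be dropped when only the class label, and not the probability values, is required.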
Pages: 210 - 215
Page count: 6
Related papers
50 in total
  • [31] Highly Efficient Architecture of NewHope-NIST on FPGA using Low-Complexity NTT/INTT
    Zhang N.
    Yang B.
    Chen C.
    Yin S.
    Wei S.
    Liu L.
    IACR Transactions on Cryptographic Hardware and Embedded Systems, 2020, 2020 (02) : 49 - 72
  • [32] Low-complexity, energy-efficient fully parallel split-radix FFT architecture
    Hazarika, Jinti
    Ahamed, Shaik Rafi
    Nemade, Harshal B.
    ELECTRONICS LETTERS, 2022, 58 (18) : 678 - 680
  • [33] Hardware-Efficient and Low Sensing-Time VLSI-Architecture of MED based Spectrum Sensor for Cognitive Radio
    Chaurasiya, Rohit B.
    Shrestha, Rahul
    2019 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS (ISCAS), 2019,
  • [34] Prototype of Low Complexity CNN Hardware Accelerator with FPGA-based PYNQ Platform for Dual-Mode Biometrics Recognition
    Chen, Yu-Hsiang
    Fan, Chih-Peng
    Chang, Robert Chen-Hao
    2020 17TH INTERNATIONAL SOC DESIGN CONFERENCE (ISOCC 2020), 2020, : 189 - 190
  • [35] ARBiS: A Hardware-Efficient SRAM CIM CNN Accelerator With Cyclic-Shift Weight Duplication and Parasitic-Capacitance Charge Sharing for AI Edge Application
    Zhao, Chenyang
    Fang, Jinbei
    Jiang, Jingwen
    Xue, Xiaoyong
    Zeng, Xiaoyang
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2023, 70 (01) : 364 - 377
  • [36] A low-complexity demodulation technique for spectrally efficient FDM systems using decision-feedback
    Yu, Baoxian
    Zhang, Han
    Dai, Xianhua
    IET COMMUNICATIONS, 2017, 11 (15) : 2386 - 2392
  • [37] Efficient Combining Technique with A Low-Complexity Detect-and-Forward Relay for Cooperative Diversity Scheme
    Abd Aziz, Azlan
    Iwanami, Yasunori
    Okamoto, Eiji
    TENCON 2009 - 2009 IEEE REGION 10 CONFERENCE, VOLS 1-4, 2009, : 123 - 128
  • [38] SCENIC: An Area and Energy-Efficient CNN-based Hardware Accelerator for Discernable Classification of Brain Pathologies using MRI
    Naidu, Bodepu Sai Tirumala
    Biswas, Shreya
    Chatterjee, Rounak
    Mandal, Sayak
    Pratihar, Srijan
    Chatterjee, Ayan
    Raha, Arnab
    Mukherjee, Amitava
    Paluh, Janet
    2022 35TH INTERNATIONAL CONFERENCE ON VLSI DESIGN (VLSID 2022) HELD CONCURRENTLY WITH 2022 21ST INTERNATIONAL CONFERENCE ON EMBEDDED SYSTEMS (ES 2022), 2022, : 168 - 173
  • [39] Low-complexity Machine Learning Architecture for Hardware-aware True Random Number Generators Assessment and Continuous Monitoring
    Spinelli, F.
    Moretti, R.
    Addabbo, T.
    Vitolo, P.
    Licciardo, G. D.
    2023 18TH CONFERENCE ON PH.D RESEARCH IN MICROELECTRONICS AND ELECTRONICS, PRIME, 2023, : 221 - 224
  • [40] Selective Gray-Coded Bit-Plane Based Low-Complexity Motion Estimation and its Hardware Architecture
    Yavuz, Seda
    Celebi, Anil
    Aslam, Muhammad
    Urhan, Oguzhan
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2016, 62 (01) : 76 - 84