Hardware-friendly Activation Functions for HybridViT Models

Cited by: 0
Authors
Kang, Beom Jin [1 ]
Kim, Nam Joon [1 ]
Lee, Jong Ho [1 ]
Kim, Hyun [1 ]
Affiliations
[1] Seoul Natl Univ Sci & Technol, Res Ctr Elect & Informat Technol, Dept Elect & Informat Engn, Seoul, South Korea
Source
2023 20TH INTERNATIONAL SOC DESIGN CONFERENCE, ISOCC | 2023
Keywords
Vision Transformer; Convolutional neural network; Activation function; Quantization
DOI
10.1109/ISOCC59558.2023.10396294
Chinese Library Classification
TP3 [Computing technology; computer technology]
Subject Classification Code
0812
Abstract
In recent years, CNN+ViT hybrid models have shown promising performance in computer vision tasks. To deploy CNN+ViT hybrid models on resource-limited devices, various studies have sought to address their parameter size and computational complexity through quantization, aiming to enable hardware-friendly low-bit integer operations. However, commonly used ViT activation functions (e.g., GeLU, Swish) inevitably require floating-point operations. To address this problem, several studies have approximated these functions with alternatives that allow integer-only computation. Inspired by the Shift-GeLU approach, which approximates the GeLU function to enable integer operations, we propose and evaluate a Shift-Swish function on the MobileViT model at both the software and hardware levels. Experimental results show that an RTL design of the proposed method reduces LUT usage by 63.25%, FF usage by 87.69%, and power consumption by 46.57% compared to the baseline, with a minimal accuracy drop of 0.6%.
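The two-page paper does not spell out the Shift-Swish formula in this record, so the following is only a minimal sketch of the general recipe the abstract alludes to: Swish(x) = x * sigmoid(x), with the exponential inside the sigmoid rebuilt from shift-and-add integer arithmetic, in the spirit of the I-ViT Shift-GeLU (reference [7] below). The helper names (shift_exp, shift_swish), the internal precision parameters (prec_bits, frac_bits), and the NumPy modelling are illustrative assumptions, not the authors' RTL design.

# A minimal NumPy sketch (not the authors' RTL) of a shift-based integer Swish,
# assuming "Shift-Swish" follows the same recipe as I-ViT's Shift-GeLU:
#   Swish(x) = x * sigmoid(x),  sigmoid(x) = exp(x) / (exp(x) + 1),
# with exp() rebuilt from arithmetic shifts and adds on quantized integers.
import numpy as np

def shift_exp(q, one):
    """Approximate exp(s * q) for q <= 0, where `one` = floor(1 / s).

    exp(z) = 2^(z * log2 e); log2(e) ~ 1 + 1/2 - 1/16 is built from shifts,
    and 2^t for t in (-1, 0] is linearised as 1 + t/2.
    """
    q = np.minimum(q, 0)
    qp = q + (q >> 1) - (q >> 4)            # ~ q * log2(e): shifts and adds only
    k = (-qp) // one                        # integer part of the exponent
    r = qp + k * one                        # remainder, in (-one, 0]
    out = (r >> 1) + one                    # 2^(s * r) ~ 1 + s * r / 2
    return out >> np.minimum(k, 31)         # multiply by 2^(-k) via right shift

def shift_swish(q, scale, prec_bits=10, frac_bits=15):
    """Integer Swish on a quantized tensor q, where x = scale * q."""
    q = q.astype(np.int64)
    qi = q << prec_bits                               # finer internal grid for exp()
    one = int((1 << prec_bits) / scale)               # integer representing 1.0
    m = max(int(qi.max()), 0)                         # shift so all exponents are <= 0
    num = shift_exp(qi - m, one)                      # exp(x - m)
    den = num + shift_exp(np.full_like(qi, -m), one)  # exp(x - m) + exp(-m)
    sig = (num << frac_bits) // np.maximum(den, 1)    # fixed-point sigmoid(x)
    return q * sig, scale / (1 << frac_bits)          # integer output and its scale

# Quick check against the floating-point reference x * sigmoid(x)
x = np.linspace(-6.0, 6.0, 13)
scale = 0.05
q_in = np.round(x / scale).astype(np.int64)
q_out, s_out = shift_swish(q_in, scale)
print(np.round(q_out * s_out, 3))                     # shift-based approximation
print(np.round(x / (1.0 + np.exp(-x)), 3))            # floating-point Swish

A note on why this style of approximation tends to be hardware-friendly: subtracting the maximum keeps every exponent argument non-positive, so the exponential collapses to a few shifts and adds followed by a right shift, and the only remaining non-trivial operator is a single integer division for the sigmoid. The exact datapath and bit widths behind the reported LUT, FF, and power savings are the paper's, not this sketch's.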
Pages: 147-148
Number of pages: 2
References
10 records in total
[1] Choi, Dahun; Kim, Hyun. Hardware-Friendly Logarithmic Quantization with Mixed-Precision for MobileNetV2. 2022 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS 2022), 2022, pp. 348-351.
[2] Choi, Dahun; Kim, Hyun. Hardware-friendly Log-scale Quantization for CNNs with Activation Functions Containing Negative Values. 18th International SoC Design Conference (ISOCC 2021), 2021, pp. 415-416.
[3] Deng, Jia; et al. ImageNet: A Large-Scale Hierarchical Image Database. Proc. IEEE CVPR, 2009, p. 248. DOI 10.1109/CVPRW.2009.5206848.
[4] Gholami, Amir; et al. A Survey of Quantization Methods for Efficient Neural Network Inference. In: Low-Power Computer Vision, 2022, p. 291.
[5] Hendrycks, Dan; Gimpel, Kevin. Gaussian Error Linear Units (GELUs). arXiv:1606.08415, 2020.
[6] Kim, Sungrae; Kim, Hyun. Zero-Centered Fixed-Point Quantization With Iterative Retraining for Deep Convolutional Neural Network-Based Object Detectors. IEEE Access, 2021, 9: 20828-20839.
[7] Li, Zhikai; Gu, Qingyi. I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference. arXiv:2207.01405, 2022.
[8] Mehta, Sachin; Rastegari, Mohammad. MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer. arXiv:2110.02178, 2022.
[9] Park, Jin Woo; Lee, Hyokeun; Kim, Boyeal; Kang, Dong-Goo; Jin, Seung Oh; Kim, Hyun; Lee, Hyuk-Jae. A Low-Cost and High-Throughput FPGA Implementation of the Retinex Algorithm for Real-Time Video Enhancement. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2020, 28(1): 101-114.
[10] Ramachandran, Prajit; Zoph, Barret; Le, Quoc V. Searching for Activation Functions. arXiv:1710.05941, 2017.