Hybrid and non-uniform quantization methods using retro synthesis data for efficient inference

Cited by: 1
Authors
Pratap, Tej G. V. S. L. [1 ]
Kumar, Raja [1 ]
Pradeep, N. S. [1 ]
Affiliations
[1] Samsung Res Inst, On Device AI, Bangalore, Karnataka, India
Keywords
data free quantization; quantization; DNN inference; synthetic data; model compression;
DOI
10.1109/IJCNN52387.2021.9533724
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing quantization-aware training and post-training quantization methods attempt to compensate for quantization loss by leveraging training data. Hence, these methods are not effective for privacy-constrained applications, as they are tightly coupled to the training data. In contrast, this paper proposes a data-independent post-training quantization scheme that eliminates the need for training data. This is achieved by generating a faux dataset, hereafter referred to as 'Retro-Synthesis Data', from the FP32 model layer statistics and then using it for quantization. This approach outperformed state-of-the-art methods including, but not limited to, ZeroQ and DFQ on models with and without Batch-Normalization layers for 8-, 6-, and 4-bit precisions on the ImageNet and CIFAR-10 datasets. We also introduce two futuristic variants of post-training quantization, namely 'Hybrid Quantization' and 'Non-Uniform Quantization'. The Hybrid Quantization scheme determines the sensitivity of each layer to per-tensor and per-channel quantization, and thereby generates hybrid quantized models that are 10 to 20% more efficient in inference time while achieving the same or better accuracy compared to the per-channel quantization scheme. This method also outperformed FP32 accuracy when applied to ResNet-18 and ResNet-50 models on the ImageNet dataset. In the proposed Non-Uniform Quantization scheme, the weights are grouped into different clusters, and each cluster is assigned a varying number of quantization steps depending on the number of weights and their range in that cluster. This method resulted in a 1% accuracy improvement over state-of-the-art methods on the ImageNet dataset.
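The per-layer choice between per-tensor and per-channel quantization described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not state how layer sensitivity is measured, so the sketch assumes weight reconstruction error (MSE) as a stand-in criterion. The function names `quantize_dequantize` and `choose_scheme` and the `min_gain` threshold are hypothetical.

```python
import numpy as np


def quantize_dequantize(w, num_bits=8, axis=None):
    """Symmetric uniform fake-quantization of a weight array.

    axis=None uses one scale for the whole tensor (per-tensor);
    axis=0 uses one scale per output channel (per-channel).
    """
    qmax = 2 ** (num_bits - 1) - 1
    if axis is None:
        max_abs = np.max(np.abs(w))
    else:
        reduce_axes = tuple(i for i in range(w.ndim) if i != axis)
        max_abs = np.max(np.abs(w), axis=reduce_axes, keepdims=True)
    # Guard against all-zero tensors or channels.
    scale = np.where(max_abs > 0, max_abs / qmax, 1.0)
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale


def choose_scheme(w, num_bits=8, min_gain=0.5):
    """Pick a quantization scheme for one layer.

    ASSUMPTION: sensitivity is approximated by weight MSE, which is not
    necessarily the criterion used in the paper. The cheaper per-tensor
    scheme is kept unless per-channel reduces the error by at least
    `min_gain` (a hypothetical threshold, as a fraction of the error).
    """
    err_tensor = np.mean((w - quantize_dequantize(w, num_bits)) ** 2)
    err_channel = np.mean((w - quantize_dequantize(w, num_bits, axis=0)) ** 2)
    if err_channel < (1.0 - min_gain) * err_tensor:
        return "per_channel"
    return "per_tensor"


# Toy layers: rows are output channels.
base = np.linspace(-1.0, 1.0, 16)
uniform_layer = np.tile(base, (8, 1))                     # all channels share one range
skewed_layer = base * np.array([1.0] + [0.01] * 7)[:, None]  # one wide channel, seven narrow

schemes = {
    "uniform_layer": choose_scheme(uniform_layer),
    "skewed_layer": choose_scheme(skewed_layer),
}
```

In this toy setup, a layer whose channels share the same range gains nothing from per-channel scales and stays per-tensor, while a layer with widely differing channel ranges is assigned per-channel quantization. A model quantized this way mixes both schemes across layers, which is the sense of "hybrid" in the abstract.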
Pages: 8
Related papers
50 in total
  • [31] Fault-Tolerant Synthesis using Non-Uniform Redundancy
    Woo, Keven L.
    Guthaus, Matthew R.
    2009 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, 2009, : 213 - 218
  • [32] Indexing non-uniform spatial data
    Kanth, KVR
    Agrawal, D
    ElAbbadi, A
    Singh, AK
    IDEAS '97 - INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM, PROCEEDINGS, 1997, : 289 - 298
  • [33] Efficient Non-uniform Channelization for SDR Using Frequency Domain Filtering
    Jiang, Tian-li
    Gong, Ke-xian
    Peng, Hua
    Wu, Di
    2014 5TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS), 2014, : 731 - 735
  • [34] Excitation of azimuthally non-uniform hybrid waves
    Lenivenko, VA
    ELECTROMAGNETIC COMPATIBILITY 1996 - THIRTEENTH INTERNATIONAL WROCLAW SYMPOSIUM, 1996, : 89 - 92
  • [35] Research of Defects in Non-uniform Mechanical Systems Using Dynamic Methods
    Bucinskas, V.
    Sutinys, E.
    MECHANIKA 2010: PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE, 2010, : 96 - 99
  • [36] Efficient Non-Uniform Pilot Design for TDCS
    Chang, Cheng
    Feng, Lina
    Zhou, Hui
    Zhao, Zilong
    Gu, Xin
    SENSORS, 2021, 21 (20)
  • [37] Non-uniform DNN Structured Subnets Sampling for Dynamic Inference
    Yang, Li
    He, Zhezhi
    Cao, Yu
    Fan, Deliang
    PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020,
  • [38] Improved SLIC Superpixel Segmentation Based on HSV Non-uniform Quantization
    Li, Dongping
    Liu, Changliang
    PROCEEDINGS OF THE 6TH INTERNATIONAL CONFERENCE ON INFORMATION ENGINEERING FOR MECHANICS AND MATERIALS, 2016, 97 : 301 - 305
  • [39] NLIC: Non-Uniform Quantization-Based Learned Image Compression
    Ge, Ziqing
    Ma, Siwei
    Gao, Wen
    Pan, Jingshan
    Jia, Chuanmin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (10) : 9647 - 9663
  • [40] Non-Uniform Quantization of Successive Cancellation List Decoder for Polar Codes
    Dong, Yanfei
    Niu, Kai
    Dong, Chao
    2020 IEEE 31ST ANNUAL INTERNATIONAL SYMPOSIUM ON PERSONAL, INDOOR AND MOBILE RADIO COMMUNICATIONS (IEEE PIMRC), 2020,