aHCQ: Adaptive Hierarchical Clustering Based Quantization Framework for Deep Neural Networks

Authors
Hu, Jiaxin [1]
Rao, Weixiong [1]
Zhao, Qinpei [1]
Affiliations
[1] Tongji Univ, Sch Software Engn, Shanghai, Peoples R China
Source
ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PAKDD 2021, PT II | 2021 / Vol. 12713
Funding
National Natural Science Foundation of China; Natural Science Foundation of Shanghai
Keywords
Deep neural network; Hierarchical clustering; Network quantization; Compression rate;
DOI
10.1007/978-3-030-75765-6_17
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
For deep neural networks (DNNs), high model accuracy is usually the main focus. However, millions of model parameters commonly lead to high space overhead and, in particular, parameter redundancy. By storing network weights at lower bit-widths, network quantization has been used to compress DNNs and reduce space costs. However, existing quantization methods cannot optimally balance model size and accuracy, so they suffer some degree of accuracy loss. Moreover, although a few existing quantization techniques can adaptively determine per-layer quantization bit-widths, they either give little consideration to the relations between different DNN layers or are designed for special hardware environments that are not generally available. To overcome these issues, we propose an adaptive Hierarchical Clustering based Quantization (aHCQ) framework. aHCQ can find a highly compressed model by quantizing each layer while incurring only a small loss in model accuracy. Experiments show that aHCQ achieves 11.4x and 8.2x model compression rates with only around a 0.5% drop in model accuracy.
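The abstract does not spell out the aHCQ algorithm, so the following is only a minimal sketch of the general idea of clustering-based weight quantization, not the authors' method. It assumes centroid-linkage agglomerative clustering on a single layer's flattened weights, a fixed (not adaptively chosen) bit-width, and 32-bit floats for the original weights and the codebook; `quantize_layer` and `compression_rate` are hypothetical names introduced for illustration.

```python
def quantize_layer(weights, bits):
    """Agglomeratively cluster 1-D weights into at most 2**bits groups.

    Returns (codebook, codes): codebook[c] is the centroid of cluster c,
    and codes[i] is the cluster index assigned to weights[i].
    Illustrative sketch only; not the aHCQ procedure from the paper.
    """
    k = 2 ** bits
    # Start with one cluster per weight, in sorted order.
    # Each cluster is [sum_of_values, count, original_indices].
    order = sorted(range(len(weights)), key=lambda i: weights[i])
    clusters = [[weights[i], 1, [i]] for i in order]

    def centroid(c):
        return c[0] / c[1]

    while len(clusters) > k:
        # In 1-D, the two closest centroids are always neighbours in
        # sorted order, so only adjacent pairs need to be compared.
        j = min(range(len(clusters) - 1),
                key=lambda j: centroid(clusters[j + 1]) - centroid(clusters[j]))
        s, n, idx = clusters.pop(j + 1)
        clusters[j][0] += s
        clusters[j][1] += n
        clusters[j][2].extend(idx)

    codebook = [centroid(c) for c in clusters]
    codes = [0] * len(weights)
    for c, (_, _, idx) in enumerate(clusters):
        for i in idx:
            codes[i] = c
    return codebook, codes


def compression_rate(num_weights, bits, full_bits=32):
    """Ratio of original storage to (index bits + codebook) storage."""
    original = num_weights * full_bits
    compressed = num_weights * bits + (2 ** bits) * full_bits
    return original / compressed


# Toy layer with four obvious groups of weights (assumed data).
weights = [0.1, 0.11, 0.5, 0.52, -0.3, -0.31, 0.9, 0.88]
codebook, codes = quantize_layer(weights, bits=2)
dequantized = [codebook[c] for c in codes]
```

Under these assumptions, each weight is stored as a `bits`-wide index into a small codebook, which is where the space saving comes from; choosing a different bit-width per layer is the part the abstract describes as adaptive, and it is not modelled here.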
Pages: 207-218
Page count: 12