Learning Low-Rank Representations for Model Compression

Cited by: 0
Authors
Zhu, Zezhou [1 ]
Dong, Yuan [1 ]
Zhao, Zhong [2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Huawei, Cent Media Technol Inst, Shenzhen, Peoples R China
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
Deep learning; low-rank representation; vector quantization; model compression
DOI
10.1109/IJCNN54540.2023.10191936
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Vector Quantization (VQ) is an appealing model compression method for obtaining a tiny model with little accuracy loss. While methods for obtaining better codebooks and codes under a fixed clustering dimensionality have been extensively studied, optimization via reducing the subvector dimensionality has not been carefully considered. This paper reports our recent progress on model compression combining dimensionality reduction with vector quantization, proposing Low-Rank Representation Vector Quantization (LR²VQ). LR²VQ joins low-rank representation with subvector clustering to construct a new kind of building block that is optimized through end-to-end training. In our method, the compression ratio is directly controlled by the dimensionality of the subvectors, and the final accuracy is solely determined by the clustering dimensionality d̃. We recognize d̃ as a trade-off between low-rank approximation error and clustering error, and provide both theoretical analysis and experimental observations that enable estimating a proper d̃ before fine-tuning. With a proper d̃, we evaluate LR²VQ with ResNet-18/ResNet-50 on the ImageNet classification dataset, achieving 2.8%/1.0% top-1 accuracy improvements over current state-of-the-art model compression algorithms at 43×/31× compression factors.
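To make the idea concrete, below is a minimal sketch of low-rank projection combined with vector quantization applied to a single weight matrix: subvectors of dimension d are projected to a lower clustering dimensionality d̃ and then clustered with k-means. The function names (compress_lr_vq, decompress), parameter defaults, and the SVD-based projection are illustrative assumptions; the paper's actual LR²VQ learns its low-rank representation through end-to-end training rather than a fixed SVD.

```python
# Hypothetical sketch of low-rank + vector quantization compression of one
# weight matrix. Not the paper's LR2VQ implementation: LR2VQ trains the
# low-rank representation end-to-end, whereas this uses a one-shot SVD.
import numpy as np
from sklearn.cluster import KMeans

def compress_lr_vq(W, d=8, d_tilde=4, k=256, seed=0):
    """Split W into length-d subvectors, project them to d_tilde dimensions
    (low-rank step), then k-means cluster the projections (VQ step)."""
    rows, cols = W.shape
    assert cols % d == 0, "columns must split evenly into subvectors"
    subvecs = W.reshape(-1, d)                  # (rows * cols / d, d)
    # Low-rank step: top-d_tilde right singular vectors define the projection.
    _, _, Vt = np.linalg.svd(subvecs, full_matrices=False)
    P = Vt[:d_tilde].T                          # (d, d_tilde) projection
    Z = subvecs @ P                             # projected subvectors
    # VQ step: cluster in the d_tilde-dimensional space; only the codebook
    # and one code index per subvector need to be stored.
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(Z)
    return km.cluster_centers_, km.labels_, P   # codebook, codes, projection

def decompress(codebook, codes, P, shape):
    """Reconstruct an approximation of W from the codebook, codes, and P."""
    subvecs_hat = codebook[codes] @ P.T         # back to d dimensions
    return subvecs_hat.reshape(shape)

W = np.random.randn(64, 64).astype(np.float32)
codebook, codes, P = compress_lr_vq(W)
W_hat = decompress(codebook, codes, P, W.shape)
print("relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

In this sketch, d controls the compression ratio (each length-d subvector is replaced by a single log2(k)-bit index), while d̃ sets the clustering dimensionality and hence the trade-off the abstract describes: a smaller d̃ increases low-rank approximation error but eases clustering, and a larger d̃ does the opposite.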
Pages: 9