Learning Low-Rank Representations for Model Compression

Cited by: 0
Authors
Zhu, Zezhou [1 ]
Dong, Yuan [1 ]
Zhao, Zhong [2 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Artificial Intelligence, Beijing, Peoples R China
[2] Huawei, Cent Media Technol Inst, Shenzhen, Peoples R China
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
Deep learning; low-rank representation; vector quantization; model compression
DOI
10.1109/IJCNN54540.2023.10191936
CLC Classification Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Vector Quantization (VQ) is an appealing model compression method for obtaining a tiny model with little accuracy loss. While methods for obtaining better codebooks and codes under a fixed clustering dimensionality have been extensively studied, optimization via reducing the subvector dimensionality has not been carefully considered. This paper reports our recent progress on model compression combining dimensionality reduction with vector quantization, proposing Low-Rank Representation Vector Quantization (LR²VQ). LR²VQ joins low-rank representation with subvector clustering to construct a new kind of building block that is optimized through end-to-end training. In our method, the compression ratio is directly controlled by the dimensionality of the subvectors, and the final accuracy is solely determined by the clustering dimensionality d̃. We recognize d̃ as a trade-off between low-rank approximation error and clustering error, and provide both theoretical analysis and experimental observations that enable estimating a proper d̃ before fine-tuning. With a proper d̃, we evaluate LR²VQ with ResNet-18/ResNet-50 on the ImageNet classification dataset, achieving 2.8%/1.0% top-1 accuracy improvements over current state-of-the-art model compression algorithms at 43×/31× compression factors.
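To make the idea concrete, below is a minimal sketch of low-rank projection combined with vector quantization applied to a single weight matrix: subvectors of dimension d are projected to a lower clustering dimensionality d̃ and then clustered with k-means. The function names (compress_lr_vq, decompress), parameter defaults, and the SVD-based projection are illustrative assumptions; the paper's actual LR²VQ learns its low-rank representation through end-to-end training rather than a fixed SVD.

```python
# Hypothetical sketch of low-rank + vector quantization compression of one
# weight matrix. Not the paper's LR2VQ implementation: LR2VQ trains the
# low-rank representation end-to-end, whereas this uses a one-shot SVD.
import numpy as np
from sklearn.cluster import KMeans

def compress_lr_vq(W, d=8, d_tilde=4, k=256, seed=0):
    """Split W into length-d subvectors, project them to d_tilde dimensions
    (low-rank step), then k-means cluster the projections (VQ step)."""
    rows, cols = W.shape
    assert cols % d == 0, "columns must split evenly into subvectors"
    subvecs = W.reshape(-1, d)                  # (rows * cols / d, d)
    # Low-rank step: top-d_tilde right singular vectors define the projection.
    _, _, Vt = np.linalg.svd(subvecs, full_matrices=False)
    P = Vt[:d_tilde].T                          # (d, d_tilde) projection
    Z = subvecs @ P                             # projected subvectors
    # VQ step: cluster in the d_tilde-dimensional space; only the codebook
    # and one code index per subvector need to be stored.
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(Z)
    return km.cluster_centers_, km.labels_, P   # codebook, codes, projection

def decompress(codebook, codes, P, shape):
    """Reconstruct an approximation of W from the codebook, codes, and P."""
    subvecs_hat = codebook[codes] @ P.T         # back to d dimensions
    return subvecs_hat.reshape(shape)

W = np.random.randn(64, 64).astype(np.float32)
codebook, codes, P = compress_lr_vq(W)
W_hat = decompress(codebook, codes, P, W.shape)
print("relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))
```

In this sketch, d controls the compression ratio (each length-d subvector is replaced by a single log2(k)-bit index), while d̃ sets the clustering dimensionality and hence the trade-off the abstract describes: a smaller d̃ increases low-rank approximation error but eases clustering, and a larger d̃ does the opposite.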
Pages: 9