BATUDE: Budget-Aware Neural Network Compression Based on Tucker Decomposition

Cited by: 0
Authors
Yin, Miao [1 ]
Phan, Huy [1 ]
Zang, Xiao [1 ]
Liao, Siyu [2 ,3 ]
Yuan, Bo [1 ]
Affiliations
[1] Rutgers State Univ, Dept Elect & Comp Engn, New Brunswick, NJ 08901 USA
[2] Amazon, Seattle, WA USA
[3] Rutgers State Univ, New Brunswick, NJ USA
Source
THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2022
Funding
U.S. National Science Foundation
Keywords
ALGORITHM;
DOI
Not available
CLC Number
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Model compression is essential for the efficient deployment of deep neural network (DNN) models on resource-constrained devices. Among various model compression approaches, high-order tensor decomposition is particularly attractive because the decomposed model is both compact and fully structured. For this category of approaches, tensor ranks are the most important hyper-parameters, directly determining the architecture and task performance of the compressed DNN models. However, selecting optimal tensor ranks under a desired budget is an NP-hard problem, and state-of-the-art studies suffer from unsatisfactory compression performance and time-consuming search procedures. To systematically address this fundamental problem, in this paper we propose BATUDE, a Budget-Aware TUcker DEcomposition-based compression approach that can efficiently calculate optimal tensor ranks via one-shot training. By integrating the rank selection procedure into the DNN training process with a specified compression budget, the tensor ranks of the DNN models are learned from the data, bringing significant improvements in both compression ratio and classification accuracy for the compressed models. Experimental results on the ImageNet dataset show that our method achieves 0.33% higher top-5 accuracy with 2.52x lower computational cost compared to the uncompressed ResNet-18 model. For ResNet-50, the proposed approach delivers 0.37% and 0.55% top-5 accuracy increases with 2.97x and 2.04x computational cost reductions, respectively, over the uncompressed model.
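
To make the mechanics concrete, the sketch below is a minimal NumPy illustration (not part of the paper) of the Tucker-2 structure the abstract refers to: a conv kernel of shape (C_out, C_in, k, k) is approximated by a small core plus two factor matrices, with the ranks (rank_out, rank_in) playing the role of the budget-determined hyper-parameters. It assumes HOSVD-style truncated factors; BATUDE's actual contribution, learning those ranks from data under a budget during one-shot training, is not reproduced here.

```python
import numpy as np

def mode_unfold(tensor, mode):
    # Mode-n unfolding: move the given axis to the front and flatten the rest.
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def leading_factor(tensor, mode, rank):
    # HOSVD-style factor: leading left singular vectors of the mode-n unfolding.
    u, _, _ = np.linalg.svd(mode_unfold(tensor, mode), full_matrices=False)
    return u[:, :rank]  # shape (dim of `mode`, rank)

def tucker2_conv(weight, rank_out, rank_in):
    """Tucker-2 decomposition of a conv kernel W of shape (C_out, C_in, k, k).

    Approximates W ~= G x_0 U x_1 V, which corresponds to replacing one
    k x k convolution by a 1x1 -> k x k -> 1x1 sequence of smaller convs.
    """
    u = leading_factor(weight, 0, rank_out)  # (C_out, rank_out)
    v = leading_factor(weight, 1, rank_in)   # (C_in, rank_in)
    # Core tensor: project W onto the two factor subspaces.
    core = np.einsum('oikl,or,ip->rpkl', weight, u, v)  # (rank_out, rank_in, k, k)
    return u, v, core

# Toy usage: decompose a 64x32x3x3 kernel and check the reconstruction error.
w = np.random.randn(64, 32, 3, 3)
u, v, g = tucker2_conv(w, rank_out=16, rank_in=8)
w_hat = np.einsum('rpkl,or,ip->oikl', g, u, v)
print('relative error:', np.linalg.norm(w_hat - w) / np.linalg.norm(w))
```

This also shows how ranks map to a computational budget: the decomposed layer costs roughly C_in*rank_in + rank_in*rank_out*k^2 + rank_out*C_out multiply-accumulates per output position instead of C_in*C_out*k^2 for the original convolution, so smaller ranks buy lower cost at the price of approximation error.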
Pages: 8874-8882
Page count: 9
Related Papers
50 records in total
  • [1] WEIGHT REPARAMETRIZATION FOR BUDGET-AWARE NETWORK PRUNING
    Dupont, Robin
    Sahbi, Hichem
    Michel, Guillaume
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 789 - 793
  • [2] Structured Pruning of Neural Networks with Budget-Aware Regularization
    Lemaire, Carl
    Achkar, Andrew
    Jodoin, Pierre-Marc
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 9100 - 9108
  • [3] Deep neural network compression by Tucker decomposition with nonlinear response
    Liu, Ye
    Ng, Michael K.
    KNOWLEDGE-BASED SYSTEMS, 2022, 241
  • [4] Budget-aware Role Based Access Control
    Salim, Farzad
    Reid, Jason
    Dulleck, Uwe
    Dawson, Ed
    COMPUTERS & SECURITY, 2013, 35 : 37 - 50
  • [5] An Accuracy-Preserving Neural Network Compression via Tucker Decomposition
    Liu, Can
    Xie, Kun
    Wen, Jigang
    Xie, Gaogang
    Li, Kenli
    IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING, 2025, 10 (02) : 262 - 273
  • [6] Compression of hyperspectral images based on Tucker decomposition and CP decomposition
    Yang, Lei
    Zhou, Jinsong
    Jing, Juanjuan
    Wei, Lidong
    Li, Yacan
    He, Xiaoying
    Feng, Lei
    Nie, Boyang
    JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 2022, 39 (10) : 1815 - 1822
  • [7] Structured precision skipping: Accelerating convolutional neural networks with budget-aware dynamic precision selection
    Huang, Kai
    Chen, Siang
    Li, Bowen
    Claesen, Luc
    Yao, Hao
    Chen, Junjian
    Jiang, Xiaowen
    Liu, Zhili
    Xiong, Dongliang
    JOURNAL OF SYSTEMS ARCHITECTURE, 2022, 124
  • [8] Neural Network Compression Based on Tensor Ring Decomposition
    Xie, Kun
    Liu, Can
    Wang, Xin
    Li, Xiaocan
    Xie, Gaogang
    Wen, Jigang
    Li, Kenli
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 36 (03) : 1 - 15
  • [9] PAPS: Power budget-Aware Pipeline Scheduling for an Embedded ReRAM-based Accelerator
    Shuai, Changchi
    Qiu, Keni
    SEC'19: PROCEEDINGS OF THE 4TH ACM/IEEE SYMPOSIUM ON EDGE COMPUTING, 2019, : 352 - 353
  • [10] Nested compression of convolutional neural networks with Tucker-2 decomposition
    Zdunek, Rafal
    Gabor, Mateusz
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022