Deep compression of convolutional neural networks with low-rank approximation

Cited by: 11
Authors
Astrid, Marcella [1]
Lee, Seung-Ik [1,2]
Affiliations
[1] Univ Sci & Technol, Dept Comp Software, Daejeon, South Korea
[2] Elect & Telecommun Res Inst, SW Contents Res Lab, Daejeon, South Korea
Keywords
convolutional neural network; CP-decomposition; cyber physical system; model compression; singular value decomposition; tensor power method
DOI
10.4218/etrij.2018-0065
CLC Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
The application of deep neural networks (DNNs) to connect the world with cyber physical systems (CPSs) has attracted much attention. However, DNNs require a large amount of memory and computation, which hinders their use in the relatively low-end smart devices that are widely deployed in CPSs. In this paper, we aim to determine whether DNNs can be efficiently deployed and operated on low-end smart devices. To do this, we develop a method that reduces the memory requirement of DNNs and increases their inference speed, while keeping performance (for example, accuracy) close to the original level. The parameters of a DNN are decomposed using a hybrid of canonical polyadic (CP) decomposition and singular value decomposition, approximated using a tensor power method, and fine-tuned through iterative one-shot hybrid fine-tuning to recover the accuracy lost during decomposition. We evaluate our method on frequently used networks, and present results from extensive experiments on the effects of several fine-tuning methods and decomposition techniques, and on the importance of iterative fine-tuning. We demonstrate the effectiveness of the proposed method by deploying the compressed networks on smartphones.
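For intuition about the technique named in the abstract, the following NumPy sketch shows one way a greedy tensor power method can build a rank-R CP approximation of a 3-way tensor by repeatedly extracting its dominant rank-1 component. This is an illustrative sketch, not the authors' implementation: the paper's actual method is a hybrid of CP and SVD with iterative fine-tuning, and all shapes, ranks, and iteration counts below are assumptions made for the example.

```python
# Minimal sketch (not the authors' code): greedy rank-1 deflation via an
# alternating power iteration, one way to form a rank-R CP approximation.
# Shapes, ranks, and iteration counts are illustrative assumptions.
import numpy as np

def rank1_power_iteration(T, iters=50):
    """Approximate the dominant rank-1 term lam * (a x b x c) of a 3-way tensor."""
    rng = np.random.default_rng(0)
    b = rng.standard_normal(T.shape[1]); b /= np.linalg.norm(b)
    c = rng.standard_normal(T.shape[2]); c /= np.linalg.norm(c)
    for _ in range(iters):
        a = np.einsum('ijk,j,k->i', T, b, c); a /= np.linalg.norm(a)
        b = np.einsum('ijk,i,k->j', T, a, c); b /= np.linalg.norm(b)
        c = np.einsum('ijk,i,j->k', T, a, b); c /= np.linalg.norm(c)
    lam = np.einsum('ijk,i,j,k->', T, a, b, c)  # scale of the rank-1 term
    return lam, a, b, c

def cp_via_power_method(T, rank):
    """Greedy CP approximation: repeatedly extract and subtract rank-1 terms."""
    residual, terms = T.copy(), []
    for _ in range(rank):
        lam, a, b, c = rank1_power_iteration(residual)
        terms.append((lam, a, b, c))
        residual -= lam * np.einsum('i,j,k->ijk', a, b, c)
    return terms

# Example: approximate a 64x64x9 tensor (e.g., a reshaped conv kernel) at rank 8.
T = np.random.default_rng(1).standard_normal((64, 64, 9))
terms = cp_via_power_method(T, rank=8)
approx = sum(lam * np.einsum('i,j,k->ijk', a, b, c) for lam, a, b, c in terms)
print(np.linalg.norm(T - approx) / np.linalg.norm(T))  # relative residual
```

The compression benefit comes from storing the factor vectors instead of the full tensor: rank-8 factors of a 64x64x9 tensor need 8 x (64 + 64 + 9 + 1) values versus 36,864 for the dense tensor.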
Pages: 421-434
Page count: 14
Related Papers
50 records in total
  • [1] Stable Low-Rank CP Decomposition for Compression of Convolutional Neural Networks Based on Sensitivity
    Yang, Chenbin
    Liu, Huiyi
    APPLIED SCIENCES-BASEL, 2024, 14 (04)
  • [2] Sequence Discriminative Training for Low-Rank Deep Neural Networks
    Tachioka, Yuuki
    Watanabe, Shinji
    Le Roux, Jonathan
    Hershey, John R.
    2014 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2014: 572-576
  • [3] Compression of Deep Neural Networks by combining pruning and low rank decomposition
    Goyal, Saurabh
    Choudhury, Anamitra Roy
    Sharma, Vivek
    2019 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2019: 952-958
  • [4] FINE CONTEXT, LOW-RANK, SOFTPLUS DEEP NEURAL NETWORKS FOR MOBILE SPEECH RECOGNITION
    Senior, Andrew
    Lei, Xin
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014
  • [5] Fast and Robust Compression of Deep Convolutional Neural Networks
    Wen, Jia
    Yang, Liu
    Shen, Chenyang
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT II, 2020, 12397: 52-63
  • [6] Recurrent Neural Network Compression Based on Low-Rank Tensor Representation
    Tjandra, Andros
    Sakti, Sakriani
    Nakamura, Satoshi
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2020, E103D (02): 435-449
  • [7] Rank-based pooling for deep convolutional neural networks
    Shi, Zenglin
    Ye, Yangdong
    Wu, Yunpeng
    NEURAL NETWORKS, 2016, 83: 21-31
  • [8] Sparse low rank factorization for deep neural network compression
    Swaminathan, Sridhar
    Garg, Deepak
    Kannan, Rajkumar
    Andres, Frederic
    NEUROCOMPUTING, 2020, 398: 185-196
  • [9] ON A PROBLEM OF WEIGHTED LOW-RANK APPROXIMATION OF MATRICES
    Dutta, Aritra
    Li, Xin
    SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 2017, 38 (02): 530-553
  • [10] Learning Low-Rank Representations for Model Compression
    Zhu, Zezhou
    Dong, Yuan
    Zhao, Zhong
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023