Deep compression of convolutional neural networks with low-rank approximation

Cited by: 11
Authors
Astrid, Marcella [1 ]
Lee, Seung-Ik [1 ,2 ]
Affiliations
[1] Univ Sci & Technol, Dept Comp Software, Daejeon, South Korea
[2] Elect & Telecommun Res Inst, SW Contents Res Lab, Daejeon, South Korea
Keywords
convolutional neural network; CP-decomposition; cyber physical system; model compression; singular value decomposition; tensor power method;
DOI
10.4218/etrij.2018-0065
CLC Classification
TM [Electrical Engineering]; TN [Electronic & Communication Technology];
Discipline Codes
0808; 0809;
Abstract
The application of deep neural networks (DNNs) to connect the world with cyber physical systems (CPSs) has attracted much attention. However, DNNs require a large amount of memory and computational cost, which hinders their use in the relatively low-end smart devices that are widely used in CPSs. In this paper, we aim to determine whether DNNs can be efficiently deployed and operated in low-end smart devices. To do this, we develop a method to reduce the memory requirement of DNNs and increase the inference speed, while maintaining the performance (for example, accuracy) close to the original level. The parameters of DNNs are decomposed using a hybrid of canonical polyadic-singular value decomposition, approximated using a tensor power method, and fine-tuned by performing iterative one-shot hybrid fine-tuning to recover from a decreased accuracy. In this study, we evaluate our method on frequently used networks. We also present results from extensive experiments on the effects of several fine-tuning methods, the importance of iterative fine-tuning, and decomposition techniques. We demonstrate the effectiveness of the proposed method by deploying compressed networks in smartphones.
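As a rough illustration of the low-rank idea the abstract describes, the sketch below compresses a single (hypothetical) dense weight matrix with truncated SVD, replacing one layer by two smaller factors. This is only the general technique, not the authors' CP-SVD hybrid or their tensor power method; the shapes and target rank are arbitrary assumptions.

```python
import numpy as np

# Illustrative sketch only: low-rank compression of one weight matrix via
# truncated SVD. Not the paper's exact CP-SVD hybrid algorithm.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))   # hypothetical layer weights (assumption)

rank = 32                             # target rank, chosen for illustration
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :rank] * s[:rank]            # 256 x 32 factor
B = Vt[:rank, :]                      # 32 x 512 factor
W_approx = A @ B                      # one dense layer -> two smaller ones

orig_params = W.size                  # 256 * 512 = 131072 parameters
comp_params = A.size + B.size         # 256*32 + 32*512 = 24576 parameters
rel_err = np.linalg.norm(W - W_approx) / np.linalg.norm(W)
print(orig_params, comp_params, round(rel_err, 3))
```

After such a factorization, fine-tuning (as the paper emphasizes, iteratively) is what recovers the accuracy lost to the approximation error `rel_err`.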
Pages: 421-434
Page count: 14
Related Papers (50 total)
  • [21] Linear low-rank approximation and nonlinear dimensionality reduction
    Zhang, ZY
    Zha, HY
    SCIENCE IN CHINA SERIES A-MATHEMATICS, 2004, 47 (06): 908-920
  • [22] An effective low-rank compression with a joint rank selection followed by a compression-friendly training
    Eo, Moonjung
    Kang, Suhyun
    Rhee, Wonjong
    NEURAL NETWORKS, 2023, 161 : 165 - 177
  • [23] Deep Convolutional Neural Networks Compression Method Based on Linear Representation of Kernels
    Chen, Ruobing
    Chen, Yefei
    Su, Jianbo
    ELEVENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2018), 2019, 11041
  • [24] TEC-CNN: Toward Efficient Compressing of Convolutional Neural Nets with Low-rank Tensor Decomposition
    Wang, Yifan
    Feng, Liang
    Cai, Fenglin
    Li, Lusi
    Wu, Rui
    Li, Jie
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2025, 21 (02)
  • [25] Randomized low-rank approximation of parameter-dependent matrices
    Kressner, Daniel
    Lam, Hei Yin
    NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 2024, 31 (06)
  • [26] STREAMING LOW-RANK MATRIX APPROXIMATION WITH AN APPLICATION TO SCIENTIFIC SIMULATION
    Tropp, Joel A.
    Yurtsever, Alp
    Udell, Madeleine
    Cevher, Volkan
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2019, 41 (04) : A2430 - A2463
  • [27] Structured Low-Rank Approximation: Optimization on Matrix Manifold Approach
    Saha, T.
    Khare, S.
    INTERNATIONAL JOURNAL OF APPLIED AND COMPUTATIONAL MATHEMATICS, 2021, 7 (6)
  • [28] Fusion of Deep Convolutional Neural Networks
    Suchy, Robert
    Ezekiel, Soundararajan
    Cornacchia, Maria
    2017 IEEE APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP (AIPR), 2017
  • [29] Fpar: filter pruning via attention and rank enhancement for deep convolutional neural networks acceleration
    Chen, Yanming
    Wu, Gang
    Shuai, Mingrui
    Lou, Shubin
    Zhang, Yiwen
    An, Zhulin
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2024, 15 (07) : 2973 - 2985
  • [30] Universality of deep convolutional neural networks
    Zhou, Ding-Xuan
    APPLIED AND COMPUTATIONAL HARMONIC ANALYSIS, 2020, 48 (02) : 787 - 794