Reduced-Complexity Deep Neural Networks Design Using Multi-Level Compression

Cited by: 6
Authors
Liao, Siyu [1]
Xie, Yi [1]
Lin, Xue [2]
Wang, Yanzhi [3]
Zhang, Min [4]
Yuan, Bo [1]
Affiliations
[1] CUNY, New York, NY 10031 USA
[2] Northeastern Univ, Boston, MA 02115 USA
[3] Syracuse Univ, Syracuse, NY 13244 USA
[4] Ford Motor Co, Dearborn, MI 48120 USA
Source
IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING | 2019 / Vol. 4 / No. 2
Keywords
Deep neural network; reduced complexity; compression
DOI
10.1109/TSUSC.2017.2710178
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Deep neural networks (DNNs) have achieved great success in many fields. However, many DNN models are both deep and large, which causes high storage and energy consumption during the training and inference phases. This paper proposes a multi-level compression framework. By applying cross-layer parameter-reducing techniques, ranging from structure compression to weight compression to representation compression, the proposed strategy enables an order-of-magnitude reduction in network size for both training and inference with negligible accuracy loss, yielding DNN models that are both highly efficient and highly accurate. Experiments show that the proposed strategy achieves a compression ratio of around 1.8K for dense matrices and around 30x for the overall model.
Pages: 245-251
Page count: 7