Filter-based deep-compression with global average pooling for convolutional networks

Cited by: 91
Authors
Hsiao, Ting-Yun [1 ]
Chang, Yung-Chang [1 ]
Chou, Hsin-Hung [1 ]
Chiu, Ching-Te [1 ]
Affiliations
[1] Natl Tsing Hua Univ, Hsinchu 30013, Taiwan
Keywords
Deep convolutional model compression; Global average pooling (GAP); Pruning; Truncated SVD; Quantization
DOI
10.1016/j.sysarc.2019.02.008
CLC number
TP3 [Computing technology; computer technology]
Discipline code
0812
Abstract
Deep neural networks are powerful, but deploying them is both memory- and time-consuming due to their large number of parameters and heavy computation. Many studies have compressed models at the parameter level as well as at the bit level. Here, we propose an efficient strategy that targets the layers dominating computation or memory usage. We compress the model by introducing global average pooling, iteratively pruning filters with the proposed order-deciding scheme so that pruning proceeds more efficiently, applying truncated SVD to the fully-connected layer, and performing quantization. Experiments on the VGG16 model show that our approach achieves a 60.9× compression ratio in off-line storage with about 0.848% and 0.1378% loss in top-1 and top-5 classification accuracy, respectively, on the ILSVRC2012 validation dataset. Our approach also shows good compression results on AlexNet and Faster R-CNN.
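To make the truncated-SVD step concrete, below is a minimal PyTorch sketch of factorizing a fully-connected layer into two low-rank layers. The helper name `truncated_svd_fc` and the rank of 256 are illustrative assumptions, not details taken from the paper; the paper's actual rank selection, pruning order, and quantization scheme are described in the full text.

```python
import torch
import torch.nn as nn

def truncated_svd_fc(fc: nn.Linear, rank: int) -> nn.Sequential:
    """Approximate a fully-connected layer with two low-rank layers via truncated SVD.

    The weight W (out x in) is factored as W ~= (U_k * S_k) @ Vt_k, so one
    Linear(in, out) becomes Linear(in, rank) -> Linear(rank, out); the parameter
    count drops from out*in to rank*(in + out).
    """
    W = fc.weight.data                       # shape: (out_features, in_features)
    U, S, Vt = torch.linalg.svd(W, full_matrices=False)
    U_k, S_k, Vt_k = U[:, :rank], S[:rank], Vt[:rank, :]

    first = nn.Linear(fc.in_features, rank, bias=False)
    first.weight.data = Vt_k                 # (rank, in_features)

    second = nn.Linear(rank, fc.out_features, bias=fc.bias is not None)
    second.weight.data = U_k * S_k           # scale each column of U_k by its singular value
    if fc.bias is not None:
        second.bias.data = fc.bias.data.clone()

    return nn.Sequential(first, second)

# Example: compress a VGG16-style 4096 -> 4096 classifier layer at an assumed rank of 256.
fc = nn.Linear(4096, 4096)
low_rank = truncated_svd_fc(fc, rank=256)
x = torch.randn(8, 4096)
print(torch.dist(fc(x), low_rank(x)))        # approximation error of the factorized layer
```

In this sketch the rank trades accuracy for storage: 4096×4096 weights (about 16.8M) shrink to 256×(4096+4096) (about 2.1M), which is the kind of fully-connected reduction the abstract attributes to truncated SVD.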
Pages: 9-18
Page count: 10