QD-Compressor: a Quantization-based Delta Compression Framework for Deep Neural Networks

Cited by: 5
Authors
Zhang, Shuyu [1 ]
Wu, Donglei [1 ]
Jin, Haoyu [1 ]
Zou, Xiangyu [1 ]
Xia, Wen [1 ,2 ,3 ]
Huang, Xiaojia [1 ]
Affiliations
[1] Harbin Institute of Technology, Shenzhen, People's Republic of China
[2] Peng Cheng Laboratory, Cyberspace Security Research Center, Shenzhen, People's Republic of China
[3] Chinese Academy of Sciences, State Key Laboratory of Computer Architecture, Institute of Computing Technology, Beijing, People's Republic of China
Source
2021 IEEE 39th International Conference on Computer Design (ICCD 2021) | 2021
Funding
National Natural Science Foundation of China;
Keywords
Neural networks; Delta compression; Quantization;
DOI
10.1109/ICCD53106.2021.00088
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology];
Discipline code
0812;
Abstract
Deep neural networks (DNNs) have achieved remarkable success in many fields. However, large-scale DNNs introduce storage challenges when snapshots are kept to guard against frequent cluster failures, and generate massive internet traffic when models are dispatched or updated on resource-constrained devices (e.g., IoT devices and mobile phones). Several approaches aim to compress DNNs. The recent work Delta-DNN observes high similarity between neighboring versions of a DNN and therefore compresses their differences to improve the compression ratio. However, we observe that Delta-DNN, which applies a traditional global lossy quantization technique when calculating the differences between two neighboring versions of a DNN, cannot fully exploit the data similarity between them for delta compression. This is because the parameters' value ranges (and thus the delta data in Delta-DNN) vary across layers, which motivates us to propose a local-sensitive quantization scheme: the quantizers adapt to the parameters' local value ranges in each layer. Moreover, instead of quantizing the differences between DNNs as Delta-DNN does, our approach quantizes the DNNs before calculating the differences, which makes the differences more compressible. In addition, we propose an error feedback mechanism to reduce the accuracy loss caused by lossy quantization. Based on these techniques, we design a novel quantization-based delta compressor called QD-Compressor, which calculates lossy differences between epochs of a DNN to reduce both the storage cost of backing up DNN snapshots and the internet traffic of dispatching DNNs to resource-constrained devices. Experiments on several popular DNNs and datasets show that QD-Compressor achieves a compression ratio 2.4x to 31.5x higher than state-of-the-art approaches while well maintaining the model's test accuracy.
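To make the idea in the abstract concrete, below is a minimal sketch of per-layer ("local-sensitive") quantization, quantize-then-diff, and error feedback. It assumes simple uniform quantization; the function names (quantize_layer, delta_snapshot) and parameters (bits, residuals) are illustrative assumptions, not QD-Compressor's actual design or API.

```python
# Hypothetical sketch of per-layer quantize-then-diff with error feedback;
# not the authors' implementation.
import numpy as np

def quantize_layer(w, bits=8):
    """Uniform quantization using this layer's own value range (local-sensitive)."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / (2 ** bits - 1) or 1.0          # avoid divide-by-zero for constant layers
    q = np.round((w - lo) / scale).astype(np.int32)     # integer codes
    deq = q * scale + lo                                 # dequantized values
    return q, deq

def delta_snapshot(curr_layers, prev_q, residuals, bits=8):
    """Quantize the current snapshot layer by layer (with error feedback),
    then emit the integer delta against the previous quantized snapshot."""
    deltas, new_q = {}, {}
    for name, w in curr_layers.items():
        w_fb = w + residuals.get(name, 0.0)              # add back last quantization error
        q, deq = quantize_layer(w_fb, bits)
        residuals[name] = w_fb - deq                     # carry this step's quantization error
        base = prev_q.get(name, np.zeros_like(q))
        deltas[name] = q - base                          # mostly zeros/small -> compressible
        new_q[name] = q
    return deltas, new_q
```

Because consecutive epochs change most parameters only slightly, integer deltas produced this way contain many zeros and small values, which a generic lossless compressor can then shrink effectively; the paper's actual quantizer design and compression backend may differ from this sketch.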
Pages: 542-550
Number of pages: 9
References
30 in total
  • [1] [Anonymous]. In: Proceedings of the 3rd International Conference on Learning Representations (ICLR), 2015.
  • [2] Hinton, Geoffrey; Vinyals, Oriol; Dean, Jeff. Distilling the Knowledge in a Neural Network. 2015.
  • [3] [Anonymous]. In: Proceedings of ICLR, 2019.
  • [4] [Anonymous]. In: Proceedings of ICPP, 2020.
  • [5] Dean, Jeffrey; Barroso, Luiz Andre. The Tail at Scale. Communications of the ACM, 2013, 56(2): 74-80.
  • [6] Di Martino, Catello; Kalbarczyk, Zbigniew; Iyer, Ravishankar K.; Baccanico, Fabio; Fullop, Joseph; Kramer, William. Lessons Learned From the Analysis of System Failures at Petascale: The Case of Blue Waters. In: 2014 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2014: 610-621.
  • [7] Girshick, Ross; Donahue, Jeff; Darrell, Trevor; Malik, Jitendra. Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014: 580-587.
  • [8] Guo, Y. W. Advances in Neural Information Processing Systems, Vol. 29, 2016.
  • [9] Gupta, Saurabh; Patel, Tirthak; Engelmann, Christian; Tiwari, Devesh. Failures in Large Scale Systems: Long-term Measurement, Analysis, and Implications. In: SC'17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2017.
  • [10] He, Kaiming; Zhang, Xiangyu; Ren, Shaoqing; Sun, Jian. Deep Residual Learning for Image Recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778.