EGC: Entropy-based gradient compression for distributed deep learning

Cited by: 10
Authors:
Xiao, Danyang [1,2]
Mei, Yuan [1,2]
Kuang, Di [1,2]
Chen, Mengqiang [1,3]
Guo, Binbin [1,3]
Wu, Weigang [1,2,3]
Affiliations:
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Peoples R China
[2] Guangdong Prov Key Lab Big Data Anal & Proc, Guangzhou, Peoples R China
[3] MoE Key Lab Machine Intelligence & Adv Comp, Guangzhou, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords:
Deep learning; Distributed training; Entropy; Gradient compression; Neural networks; OPTIMIZATION; MODEL; SPMV;
DOI:
10.1016/j.ins.2020.05.121
CLC classification:
TP [automation technology, computer technology]
Discipline code:
0812
Abstract:
With the growing volume of training data and scale of network models, distributed deep learning, in which multiple workers jointly train a single model, is becoming increasingly popular. However, communication among workers has always been a major challenge, because it can incur large time latency and bandwidth consumption. In this paper, we propose an entropy-based gradient compression (EGC) mechanism to reduce communication overhead. EGC selects the gradients to be communicated based on the entropy of the gradient items, which achieves a high compression ratio without sacrificing accuracy. More importantly, EGC is a general and flexible mechanism that can be adopted in different distributed training algorithms. Accordingly, we propose three EGC-based training algorithms for different scenarios: EGC-DSGD for decentralized training, EGC-PS for centralized training, and EGC-FL for federated training. To improve the accuracy of these algorithms, we also adopt associated mechanisms, including automatic learning rate correction, momentum correction, and residual accumulation. We prove the convergence of EGC by analysis and evaluate its performance by experiments. Eight models are trained on popular public datasets (including MNIST, CIFAR-10, Tiny ImageNet and Penn Treebank) for image classification and language modeling tasks. The experimental results show that, compared with existing works, the EGC-based algorithms achieve a gradient compression ratio of roughly 1000x while keeping accuracy similar or even higher. (c) 2020 Elsevier Inc. All rights reserved.
Pages: 118-134
Number of pages: 17
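
The abstract describes the EGC mechanism only at a high level. As an illustration, here is a minimal Python sketch of one plausible reading: the entropy of the gradient-magnitude distribution guides a top-k selection, and unsent entries are accumulated as residuals for later steps. The function egc_compress, the histogram-entropy heuristic, and all parameter values are assumptions made for this sketch, not the authors' published algorithm.

    import numpy as np

    def shannon_entropy(p):
        """Shannon entropy (bits) of a discrete distribution; zero terms skipped."""
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    def egc_compress(grad, residual, num_bins=32, keep_ratio=0.001):
        """Sparsify a flattened gradient before communication (hypothetical sketch).

        grad       -- local gradient vector for the current step
        residual   -- entries withheld (not sent) in earlier steps
        keep_ratio -- target fraction of entries to transmit (~1000x compression)
        Returns (indices, values, new_residual).
        """
        g = grad + residual                      # residual accumulation
        mag = np.abs(g)

        # Histogram the gradient magnitudes and measure the entropy of the
        # resulting distribution: low entropy suggests the information is
        # concentrated in a few large items, so we can sparsify aggressively.
        hist, _ = np.histogram(mag, bins=num_bins)
        h = shannon_entropy(hist / hist.sum())

        # Entropy-modulated top-k selection (an assumed heuristic): transmit
        # somewhat more entries when the magnitude distribution is spread out.
        k = int(len(g) * keep_ratio * (1.0 + h / np.log2(num_bins)))
        k = min(len(g), max(1, k))
        idx = np.argpartition(mag, -k)[-k:]      # indices of the k largest items

        values = g[idx]                          # sparse payload to communicate
        new_residual = g.copy()
        new_residual[idx] = 0.0                  # carry unsent entries forward
        return idx, values, new_residual

In the three algorithms named in the abstract (EGC-DSGD, EGC-PS, EGC-FL), only the (idx, values) pair would need to be exchanged while the residual stays local; a keep_ratio near 0.001 corresponds to the roughly 1000x compression ratio the abstract reports.
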
Related papers
Showing 10 of 50 records
  • [1] Enhancing Gradient Compression for Distributed Deep Learning
    Bai, Zhe
    Yu, Enda
    Dong, Dezun
    Lu, Pingjing
PROCEEDINGS OF THE 8TH ASIA-PACIFIC WORKSHOP ON NETWORKING, APNET 2024, 2024: 171-172
  • [2] Learned Gradient Compression for Distributed Deep Learning
    Abrahamyan, Lusine
    Chen, Yiming
    Bekoulis, Giannis
    Deligiannis, Nikos
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33(12): 7330-7344
  • [3] Standard Deviation Based Adaptive Gradient Compression For Distributed Deep Learning
    Chen, Mengqiang
    Yan, Zijie
    Ren, Jiangtao
    Wu, Weigang
2020 20TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2020), 2020: 529-538
  • [4] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning
    Zhang, Lin
    Zhang, Longteng
    Shi, Shaohuai
    Chu, Xiaowen
    Li, Bo
2023 IEEE 43RD INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS, ICDCS, 2023: 361-371
  • [5] Entropy-based Deep Product Quantization for Visual Search and Deep Feature Compression
    Niu, Benben
    Wei, Ziwei
    He, Yun
2021 INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2021
  • [6] Deep Face Model Compression Using Entropy-Based Filter Selection
    Han, Bingbing
    Zhang, Zhihong
    Xu, Chuanyu
    Wang, Beizhan
    Hu, Guosheng
    Bai, Lu
    Hong, Qingqi
    Hancock, Edwin R.
IMAGE ANALYSIS AND PROCESSING (ICIAP 2017), PT II, 2017, 10485: 127-136
  • [7] Entropy-Based Prioritized Sampling in Deep Q-Learning
    Ramicic, Mirza
    Bonarini, Andrea
2017 2ND INTERNATIONAL CONFERENCE ON IMAGE, VISION AND COMPUTING (ICIVC 2017), 2017: 1068-1072
  • [8] SGC: Similarity-Guided Gradient Compression for Distributed Deep Learning
    Liu, Jingling
    Huang, Jiawei
    Li, Yijun
    Li, Zhaoyi
    Lyu, Wenjun
    Jiang, Wenchao
    Wang, Jianxin
2024 IEEE/ACM 32ND INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE, IWQOS, 2024
  • [9] Tail: An Automated and Lightweight Gradient Compression Framework for Distributed Deep Learning
    Guo, Jinrong
    Hu, Songlin
    Wang, Wang
    Yao, Chunrong
    Han, Jizhong
    Li, Ruixuan
    Lu, Yijun
PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020
  • [10] Deep Learning and Entropy-Based Texture Features for Color Image Classification
    Lhermitte, Emma
    Hilal, Mirvana
    Furlong, Ryan
    O'Brien, Vincent
    Humeau-Heurtier, Anne
ENTROPY, 2022, 24(11)