EGC: Entropy-based gradient compression for distributed deep learning

Cited by: 10
Authors:
Xiao, Danyang [1,2]
Mei, Yuan [1,2]
Kuang, Di [1,2]
Chen, Mengqiang [1,3]
Guo, Binbin [1,3]
Wu, Weigang [1,2,3]
Affiliations:
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou, Peoples R China
[2] Guangdong Prov Key Lab Big Data Anal & Proc, Guangzhou, Peoples R China
[3] MoE Key Lab Machine Intelligence & Adv Comp, Guangzhou, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords:
Deep learning; Distributed training; Entropy; Gradient compression; Neural networks; OPTIMIZATION; MODEL; SPMV;
DOI:
10.1016/j.ins.2020.05.121
CLC classification:
TP [automation technology, computer technology]
Discipline code:
0812
Abstract:
With the growing volume of training data and scale of network models, distributed deep learning, in which multiple workers jointly train a single model, is becoming increasingly popular. However, communication among workers has always been a major challenge, because it can incur large time latency and bandwidth consumption. In this paper, we propose an entropy-based gradient compression (EGC) mechanism to reduce communication overhead. EGC selects the gradients to be communicated based on the entropy of the gradient items, which achieves a high compression ratio without sacrificing accuracy. More importantly, EGC is a general and flexible mechanism that can be adopted in different distributed training algorithms. Accordingly, we propose three EGC-based training algorithms for different scenarios: EGC-DSGD for decentralized training, EGC-PS for centralized training, and EGC-FL for federated training. To improve the accuracy of these algorithms, we also adopt associated mechanisms, including automatic learning rate correction, momentum correction, and residual accumulation. We prove the convergence of EGC by analysis and evaluate its performance by experiments. Eight models are trained on popular public datasets (including MNIST, CIFAR-10, Tiny ImageNet and Penn Treebank) for image classification and language modeling tasks. The experimental results show that, compared with existing works, the EGC-based algorithms achieve a gradient compression ratio of roughly 1000x while keeping accuracy similar or even higher. (c) 2020 Elsevier Inc. All rights reserved.
Pages: 118-134
Number of pages: 17
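
The abstract describes the EGC mechanism only at a high level. As an illustration, here is a minimal Python sketch of one plausible reading: the entropy of the gradient-magnitude distribution guides a top-k selection, and unsent entries are accumulated as residuals for later steps. The function egc_compress, the histogram-entropy heuristic, and all parameter values are assumptions made for this sketch, not the authors' published algorithm.

    import numpy as np

    def shannon_entropy(p):
        """Shannon entropy (bits) of a discrete distribution; zero terms skipped."""
        p = p[p > 0]
        return float(-np.sum(p * np.log2(p)))

    def egc_compress(grad, residual, num_bins=32, keep_ratio=0.001):
        """Sparsify a flattened gradient before communication (hypothetical sketch).

        grad       -- local gradient vector for the current step
        residual   -- entries withheld (not sent) in earlier steps
        keep_ratio -- target fraction of entries to transmit (~1000x compression)
        Returns (indices, values, new_residual).
        """
        g = grad + residual                      # residual accumulation
        mag = np.abs(g)

        # Histogram the gradient magnitudes and measure the entropy of the
        # resulting distribution: low entropy suggests the information is
        # concentrated in a few large items, so we can sparsify aggressively.
        hist, _ = np.histogram(mag, bins=num_bins)
        h = shannon_entropy(hist / hist.sum())

        # Entropy-modulated top-k selection (an assumed heuristic): transmit
        # somewhat more entries when the magnitude distribution is spread out.
        k = int(len(g) * keep_ratio * (1.0 + h / np.log2(num_bins)))
        k = min(len(g), max(1, k))
        idx = np.argpartition(mag, -k)[-k:]      # indices of the k largest items

        values = g[idx]                          # sparse payload to communicate
        new_residual = g.copy()
        new_residual[idx] = 0.0                  # carry unsent entries forward
        return idx, values, new_residual

In the three algorithms named in the abstract (EGC-DSGD, EGC-PS, EGC-FL), only the (idx, values) pair would need to be exchanged while the residual stays local; a keep_ratio near 0.001 corresponds to the roughly 1000x compression ratio the abstract reports.
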
Related papers
Showing 10 of 50 records
  • [1] Enhancing Gradient Compression for Distributed Deep Learning
    Bai, Zhe
    Yu, Enda
    Dong, Dezun
    Lu, Pingjing
PROCEEDINGS OF THE 8TH ASIA-PACIFIC WORKSHOP ON NETWORKING, APNET 2024, 2024: 171-172
  • [2] Learned Gradient Compression for Distributed Deep Learning
    Abrahamyan, Lusine
    Chen, Yiming
    Bekoulis, Giannis
    Deligiannis, Nikos
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33(12): 7330-7344
  • [3] Standard Deviation Based Adaptive Gradient Compression For Distributed Deep Learning
    Chen, Mengqiang
    Yan, Zijie
    Ren, Jiangtao
    Wu, Weigang
2020 20TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND INTERNET COMPUTING (CCGRID 2020), 2020: 529-538
  • [4] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning
    Zhang, Lin
    Zhang, Longteng
    Shi, Shaohuai
    Chu, Xiaowen
    Li, Bo
2023 IEEE 43RD INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS, ICDCS, 2023: 361-371
  • [5] Entropy-based Deep Product Quantization for Visual Search and Deep Feature Compression
    Niu, Benben
    Wei, Ziwei
    He, Yun
2021 INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2021
  • [6] Deep Face Model Compression Using Entropy-Based Filter Selection
    Han, Bingbing
    Zhang, Zhihong
    Xu, Chuanyu
    Wang, Beizhan
    Hu, Guosheng
    Bai, Lu
    Hong, Qingqi
    Hancock, Edwin R.
IMAGE ANALYSIS AND PROCESSING (ICIAP 2017), PT II, 2017, 10485: 127-136
  • [7] Entropy-Based Prioritized Sampling in Deep Q-Learning
    Ramicic, Mirza
    Bonarini, Andrea
2017 2ND INTERNATIONAL CONFERENCE ON IMAGE, VISION AND COMPUTING (ICIVC 2017), 2017: 1068-1072
  • [8] SGC: Similarity-Guided Gradient Compression for Distributed Deep Learning
    Liu, Jingling
    Huang, Jiawei
    Li, Yijun
    Li, Zhaoyi
    Lyu, Wenjun
    Jiang, Wenchao
    Wang, Jianxin
2024 IEEE/ACM 32ND INTERNATIONAL SYMPOSIUM ON QUALITY OF SERVICE, IWQOS, 2024
  • [9] Tail: An Automated and Lightweight Gradient Compression Framework for Distributed Deep Learning
    Guo, Jinrong
    Hu, Songlin
    Wang, Wang
    Yao, Chunrong
    Han, Jizhong
    Li, Ruixuan
    Lu, Yijun
PROCEEDINGS OF THE 2020 57TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 2020
  • [10] Deep Learning and Entropy-Based Texture Features for Color Image Classification
    Lhermitte, Emma
    Hilal, Mirvana
    Furlong, Ryan
    O'Brien, Vincent
    Humeau-Heurtier, Anne
ENTROPY, 2022, 24(11)