Spectral Tensor Layers for Communication-Free Distributed Deep Learning

Cited by: 0
Authors
Liu, Xiao-Yang [1 ,2 ]
Wang, Xiaodong [1 ]
Yuan, Bo [3 ]
Han, Jiashu [1 ]
Affiliations
[1] Columbia Univ, Dept Elect Engn, New York, NY 10027 USA
[2] Rensselaer Polytech Inst, Dept Comp Sci, Troy, NY 12180 USA
[3] Rutgers State Univ, Dept Elect & Comp Engn, Piscataway, NJ 08854 USA
Keywords
Communication-free; deep learning; federated learning (FL); linear transform; multiresolution heterogeneous data; spectral tensor layer; tensor
DOI
10.1109/TNNLS.2024.3394861
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this article, we propose a novel spectral tensor layer for communication-free distributed deep learning. The overall framework is as follows: first, we represent the data in tensor form (instead of vector form) and replace the matrix product in conventional neural networks with the tensor product, which in effect imposes a transform-induced structure, e.g., a block-circulant structure, on the original weight matrices; then, we apply a linear transform along a certain dimension to split the original dataset into multiple spectral subdatasets. As a result, the proposed spectral tensor network consists of parallel branches, where each branch is a conventional neural network trained on one spectral subdataset with zero communication cost. The parallel branches are directly ensembled (i.e., via a weighted sum of their outputs) to produce an overall network with substantially stronger generalization capability than any individual branch. Moreover, compared with traditional networks, the proposed method yields decentralization gains in memory and computation as a byproduct. It is a natural yet elegant solution for heterogeneous data in federated learning (FL), where data at different nodes have different resolutions. Finally, we evaluate the proposed spectral tensor networks on the MNIST, CIFAR-10, ImageNet-1K, and ImageNet-21K datasets to verify that they simultaneously achieve communication-free distributed learning, distributed storage reduction, parallel computation speedup, and learning with multiresolution data.
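The pipeline described in the abstract — transform along one tensor dimension, train one branch per spectral slice with no inter-branch communication, then ensemble by a weighted sum — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: it assumes the DFT as the linear transform, uses ridge least-squares linear classifiers in place of full neural-network branches, and all names, sizes, and the synthetic data are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: N samples, each an (m x d) matrix; the last dimension d
# is the "tube" dimension along which the linear transform is applied.
N, m, d, n_classes = 512, 8, 4, 3
X = rng.standard_normal((N, m, d))
y = rng.integers(0, n_classes, size=N)

# Step 1: a linear transform (here the DFT) along the last dimension
# splits the dataset into d independent spectral subdatasets.
X_spec = np.fft.fft(X, axis=2)  # shape (N, m, d), complex

def train_branch(slice_feats, labels):
    """Train one branch on a single spectral slice.

    Stand-in for a conventional neural network: a ridge-regularized
    least-squares classifier on the real/imag parts of the slice.
    """
    A = np.concatenate([slice_feats.real, slice_feats.imag], axis=1)
    T = np.eye(n_classes)[labels]  # one-hot targets
    return np.linalg.solve(A.T @ A + 1e-3 * np.eye(A.shape[1]), A.T @ T)

# Step 2: branches train independently -- no gradients or parameters
# are exchanged between them (communication-free).
branches = [train_branch(X_spec[:, :, k], y) for k in range(d)]

def predict(X_new):
    """Step 3: ensemble the branch outputs by a weighted sum
    (uniform weights 1/d in this sketch)."""
    Xs = np.fft.fft(X_new, axis=2)
    scores = np.zeros((X_new.shape[0], n_classes))
    for k, W in enumerate(branches):
        A = np.concatenate([Xs[:, :, k].real, Xs[:, :, k].imag], axis=1)
        scores += (A @ W) / d
    return scores.argmax(axis=1)

train_acc = (predict(X) == y).mean()
```

Because each branch sees only its own spectral slice, the branches can live on different nodes and train in parallel; only the final (cheap) output summation brings them together.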
Pages: 7237-7251
Page count: 15