Exploiting Computation Reuse in Cloud-Based Deep Learning via Input Reordering

Cited by: 1
Authors
Guo, Enting [1 ]
Li, Peng [2 ]
Wang, Kun [3 ]
Feng, Huibin [4 ]
Lu, Jingyuan [1 ]
Guo, Song [5 ]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Nanjing, Peoples R China
[2] Univ Aizu, Aizu Wakamatsu, Fukushima, Japan
[3] Univ Calif Los Angeles, Los Angeles, CA USA
[4] Minjiang Univ, Fuzhou, Peoples R China
[5] Hong Kong Polytech Univ, Hong Kong, Peoples R China
Source
ICC 2020 - 2020 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC) | 2020
Keywords
Deep Learning; Computation Reuse; Cloud Computing
DOI
10.1109/icc40277.2020.9148746
CLC Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline Code
0808; 0809
Abstract
Recently, deep learning (DL) has become increasingly important due to its transformative effect on a wide range of applications. During inference, the DL model is deployed on the cloud to answer online queries. One crucial issue in DL inference is energy consumption, which significantly retards computation performance. Therefore, many previous studies reduce energy consumption via similarity-based computation reuse techniques. However, when input data arrives individually from mobile devices, applying these schemes significantly degrades computation performance, because similarity for reuse is difficult to exploit directly in disordered individual inputs. Initial experimental observations show that (1) individual input data also exhibits high similarity for reuse, and (2) the total similarity during the computation process is related to the characteristics of the input data. This motivates us to design a reordering scheme that enhances similarity for computation reuse. Our main approaches are to predict the similarities among input data using statistical theory and to determine the execution sequence accordingly. Based on these approaches, we propose an effective input reordering scheme for computation reuse that saves energy. Evaluation on various benchmarks demonstrates that the reordering scheme significantly outperforms previous schemes; for instance, computation reuse is enhanced by 1.1x and energy consumption is reduced to 40% of that of the traditional computation reuse technique.
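The record does not include the paper's actual statistical similarity predictor, so the following is only an illustrative sketch of the core idea: reorder queued inputs so that consecutive items are maximally similar, which raises the hit rate of any similarity-based computation-reuse cache. Here a simple greedy nearest-neighbor ordering over pairwise cosine similarity stands in for the paper's statistics-based prediction; the function name and approach are assumptions, not the authors' implementation.

```python
import numpy as np

def greedy_reorder(inputs):
    """Greedily order inputs so each item is followed by the most
    similar remaining item (cosine similarity). This is a hypothetical
    stand-in for the paper's statistics-based execution sequencing."""
    X = np.asarray(inputs, dtype=float)
    # Normalize rows so the dot product gives cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = X / np.clip(norms, 1e-12, None)
    sim = U @ U.T                      # pairwise cosine-similarity matrix
    order = [0]                        # start from the first queued input
    remaining = set(range(1, len(X)))
    while remaining:
        last = order[-1]
        # Schedule next the unvisited input most similar to the last one.
        nxt = max(remaining, key=lambda j: sim[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Two near-duplicate pairs arrive interleaved; reordering groups them
# so a reuse cache sees similar inputs back-to-back.
print(greedy_reorder([[1, 0], [0, 1], [0.9, 0.1], [0.1, 0.9]]))
```

With the interleaved queue above, the greedy pass yields `[0, 2, 3, 1]`, placing each near-duplicate next to its partner, which is exactly the property a reuse cache exploits.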
Pages: 6
Related Papers
21 items
[1]  
Abadi M, 2016, ACM SIGPLAN NOTICES, V51, P1, DOI [10.1145/3022670.2976746, 10.1145/2951913.2976746]
[2]  
[Anonymous], 2019, IEEE T COMPUTERS
[3]   Eyeriss: A Spatial Architecture for Energy-Efficient Dataflow for Convolutional Neural Networks [J].
Chen, Yu-Hsin ;
Emer, Joel ;
Sze, Vivienne .
2016 ACM/IEEE 43RD ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2016, :367-379
[4]  
Cormen TH., 1990, Introduction to Algorithms
[5]  
Guo E., P CBD 2019
[6]  
He J., P ICDCS 2018
[7]  
He X., 2019, IEEE COMPUT INTELL M
[8]   Green Resource Allocation Based on Deep Reinforcement Learning in Content-Centric IoT [J].
He, Xiaoming ;
Wang, Kun ;
Huang, Huawei ;
Miyazaki, Toshiaki ;
Wang, Yixuan ;
Guo, Song .
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, 2020, 8 (03) :781-796
[9]   UCNN: Exploiting Computational Reuse in Deep Neural Networks via Weight Repetition [J].
Hegde, Kartik ;
Yu, Jiyong ;
Agrawal, Rohit ;
Yan, Mengjia ;
Pellauer, Michael ;
Fletcher, Christopher W. .
2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2018, :674-687
[10]   ZeNA: Zero-Aware Neural Network Accelerator [J].
Kim, Dongyoung ;
Ahn, Junwhan ;
Yoo, Sungjoo .
IEEE DESIGN & TEST, 2018, 35 (01) :39-46