Hierarchical deployment of deep neural networks based on fog computing inferred acceleration model

Cited: 5
Authors
Jiang, Weijin [1 ,2 ]
Lv, Sijian [1 ]
Affiliations
[1] Hunan Univ Technol & Business, Coll Comp & Informat Engn, Changsha 410205, Peoples R China
[2] Mobile E Business Collaborat Innovat Ctr Hunan Pr, Changsha 410205, Peoples R China
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2021, Vol. 24, Issue 04
Funding
National Natural Science Foundation of China;
Keywords
Fog computing; Branching neural networks; Inferential acceleration; Cloud;
DOI
10.1007/s10586-021-03298-0
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
With the widespread adoption of deep neural network approaches on mobile devices, the drawbacks of traditional cloud-based or purely local deployment are becoming apparent: the high latency of cloud-based neural network inference and the high power consumption of on-device inference both degrade the user experience of neural network applications. To address this problem, this paper proposes a hierarchical deployment and inference acceleration model for deep neural networks based on fog computing. First, the solution space of deep neural network deployment is searched and re-partitioned, and a Solution Space Tree Pruning (SSTP) deployment algorithm is designed to select the appropriate network layers for each tier, reducing the overall inference delay of the network. Next, an algorithm for Maximizing Accuracy based on Guaranteed Latency (MAL) is designed: on the solution-space tree pruned by SSTP, suitable fog computing nodes are selected for mobile devices in different geographic locations, allowing inference tasks to exit early according to the runtime latency and inference accuracy requirements of the actual device terminals. Experimental results show that the proposed fog computing-based inference acceleration model reduces average latency by 44.79% compared to traditional cloud-deployed deep neural network inference, and by 28.75% compared to an edge computing acceleration framework from existing studies. The model satisfies the latency and accuracy requirements of neural network inference across multiple fog computing scenarios, while greatly reducing the cloud-side resource occupation and usage cost incurred under the traditional cloud computing model.
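For intuition only, the sketch below (Python, with entirely hypothetical per-layer timings, tensor sizes, and bandwidth values that do not come from the paper) illustrates the basic flavor of partition-point selection that underlies such hierarchical deployment: choose the layer at which execution is handed from the mobile device to a fog node so that compute time plus transfer time is minimized. It is a simplified stand-in, not the paper's actual SSTP pruning or MAL early-exit selection.

```python
# Illustrative sketch only: not the SSTP/MAL algorithms from the paper.
# All numbers below are hypothetical and chosen purely for demonstration.

def best_partition(device_ms, fog_ms, out_kb, input_kb, uplink_kb_per_s):
    """Return (split, latency_ms): layers [0, split) run on the device,
    layers [split, n) run on a fog node, minimizing end-to-end latency.

    device_ms[i]    -- assumed latency of layer i on the mobile device (ms)
    fog_ms[i]       -- assumed latency of layer i on the fog node (ms)
    out_kb[i]       -- size of layer i's output tensor (KB)
    input_kb        -- size of the raw input (KB), sent if split == 0
    uplink_kb_per_s -- assumed device-to-fog uplink bandwidth (KB/s)
    """
    n = len(device_ms)
    best = None
    for split in range(n + 1):              # split = first layer executed on the fog node
        local = sum(device_ms[:split])      # time spent computing on the device
        remote = sum(fog_ms[split:])        # time spent computing on the fog node
        payload = input_kb if split == 0 else out_kb[split - 1]
        transfer = 0.0 if split == n else payload / uplink_kb_per_s * 1000.0  # ms
        total = local + transfer + remote
        if best is None or total < best[1]:
            best = (split, total)
    return best


if __name__ == "__main__":
    # Hypothetical figures for a small five-layer network.
    device = [12.0, 30.0, 28.0, 9.0, 4.0]      # ms per layer on the device
    fog = [2.0, 5.0, 4.5, 1.5, 0.8]            # ms per layer on the fog node
    sizes = [600.0, 300.0, 150.0, 20.0, 1.0]   # output size of each layer (KB)
    split, latency = best_partition(device, fog, sizes,
                                    input_kb=1200.0, uplink_kb_per_s=1000.0)
    print(f"hand off at layer {split}, estimated latency {latency:.1f} ms")
```

In the paper's setting, this brute-force scan over split points would correspond to searching the solution space, which SSTP prunes, and the MAL stage would further restrict candidate fog nodes using the early-exit accuracy and latency requirements of each device.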
Pages: 2807-2817
Number of Pages: 11
References
24 records in total
[1] Abeshu A. IEEE Communications Magazine, 2018, 56: 169. DOI: 10.1109/MCOM.2018.1700332
[2] Ciobanu, Radu-Ioan; Negru, Catalin; Pop, Florin; Dobre, Ciprian; Mavromoustakis, Constandinos X.; Mastorakis, George. Drop computing: Ad-hoc dynamic collaborative computing. Future Generation Computer Systems, 2019, 92: 889-899.
[3] Elashri, Suzanne; Azim, Akramul. Energy-efficient offloading of real-time tasks using cloud computing. Cluster Computing, 2020, 23(4): 3273-3288.
[4] Fan Qi. Journal of Computer Applications, 2020, 40: 342. DOI: 10.11772/j.issn.1001-9081.2019081406
[5] Gupta, Otkrist; Raskar, Ramesh. Distributed learning of deep neural network over multiple agents. Journal of Network and Computer Applications, 2018, 116: 1-8.
[6] Iandola, Forrest N.; Moskewicz, Matthew W.; Ashraf, Khalid; Keutzer, Kurt. FireCaffe: near-linear acceleration of deep neural network training on compute clusters. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 2592-2600.
[7] Jiang W. IEEE Transactions on Computational Social Systems, 2020.
[8] Jiang W J. China Communications, 2020, 17: 229. DOI: 10.23919/JCC.2020.10.017
[9] Kanev A. 2017 20th Conference of Open Innovations Association, 2017: 118.
[10] Kang, Yiping; Hauswald, Johann; Gao, Cao; Rovinski, Austin; Mudge, Trevor; Mars, Jason; Tang, Lingjia. Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge. ACM SIGPLAN Notices, 2017, 52(4): 615-629.