Joint DNN Partition Deployment and Resource Allocation for Delay-Sensitive Deep Learning Inference in IoT

Cited by: 95
Authors
He, Wenchen [1]
Guo, Shaoyong [1]
Guo, Song [2, 3]
Qiu, Xuesong [1]
Qi, Feng [1, 4]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Hong Kong, Peoples R China
[3] Hong Kong Polytech Univ, Res Inst Sustainable Urban Dev, Hong Kong, Peoples R China
[4] Cyberspace Secur Res Ctr, Peng Cheng Lab, Shenzhen 518066, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Delays; Task analysis; Resource management; Internet of Things; Computational modeling; Partitioning algorithms; Approximation algorithms; Deep learning (DL); delay sensitive; inference; Internet of Things (IoT); mobile-edge computing (MEC); partition deployment; resource allocation; EDGE; SERVICE; CLOUD; INTELLIGENCE; INTERNET; DISCOVERY; MIGRATION; QUALITY;
DOI
10.1109/JIOT.2020.2981338
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Widely deployed Internet-of-Things (IoT) mobile devices (MDs) generate huge volumes of data, which must be analyzed in real time by compute-intensive deep learning (DL) inference tasks to extract accurate information. Owing to its multilayer structure, a deep neural network (DNN) is well suited to the mobile-edge computing (MEC) environment: DL tasks can be offloaded to DNN partitions deployed on MEC servers (MECSs) to speed up inference. In this article, we first model the arrival of DL tasks as a Poisson process and develop a tandem queueing model to evaluate the end-to-end (E2E) inference delay of DL tasks across multiple DNN partitions. To minimize the E2E delay, we formulate a joint optimization problem of partition deployment and resource allocation in MECSs (JPDRA). Since JPDRA is a mixed-integer nonlinear programming (MINLP) problem, we decompose it into a computing resource allocation (CRA) problem with a fixed partition deployment decision and a DNN partition deployment (DPD) problem that optimizes the optimal-delay function obtained from the CRA problem. Next, we design a CRA algorithm based on Markov approximation and a low-complexity DPD algorithm to obtain a near-optimal solution in polynomial time. The simulation results demonstrate that the proposed algorithms are more efficient and can reduce the average E2E delay by 25.7% with better convergence performance.
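As a rough illustration of the tandem queueing idea in the abstract, the sketch below computes the mean end-to-end delay of a chain of DNN partitions under a simplifying assumption that is not taken from the paper: each partition is treated as an M/M/1 queue with exponential service, so its mean sojourn time is 1/(mu - lambda). The function name `tandem_mm1_delay` and the sample rates are hypothetical, and the article's actual delay model may differ.

```python
from typing import Sequence


def tandem_mm1_delay(arrival_rate: float, service_rates: Sequence[float]) -> float:
    """Mean end-to-end delay through M/M/1 queues in series.

    Assumption (not from the paper): Poisson arrivals at rate `arrival_rate`
    and exponential service at each stage, so every stage sees the same
    arrival rate and contributes 1 / (mu - lambda) to the total delay.
    """
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    total = 0.0
    for mu in service_rates:
        if mu <= arrival_rate:
            # The stage cannot keep up with arrivals; its queue grows without bound.
            return float("inf")
        total += 1.0 / (mu - arrival_rate)
    return total


# Two partitions sharing a total service capacity of 10 tasks/s, 2 tasks/s arriving.
balanced = tandem_mm1_delay(2.0, [5.0, 5.0])    # equal split of capacity
unbalanced = tandem_mm1_delay(2.0, [4.0, 6.0])  # skewed split of capacity
print(balanced, unbalanced)
```

Under this simplified model, the balanced split gives the lower delay for two identical stages, which hints at why resource allocation across partitions affects the E2E delay.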
Pages: 9241-9254
Number of pages: 14