Energy-Aware Inference Offloading for DNN-Driven Applications in Mobile Edge Clouds

Cited by: 79
Authors
Xu, Zichuan [1 ,2 ]
Zhao, Liqian [1 ,2 ]
Liang, Weifa [3 ]
Rana, Omer F. [4 ]
Zhou, Pan [5 ]
Xia, Qiufen [2 ,6 ]
Xu, Wenzheng [7 ]
Wu, Guowei [1 ,2 ]
Affiliations
[1] Dalian Univ Technol, Sch Software, Dalian 116024, Peoples R China
[2] Key Lab Ubiquitous Network & Serv Software, Dalian 116024, Liaoning, Peoples R China
[3] Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT 2601, Australia
[4] Cardiff Univ, Cardiff CF10 3AT, Wales
[5] Huazhong Univ Sci & Technol, Sch Cyber Sci & Engn, Hubei Engn Res Ctr Big Data Secur, Wuhan 430074, Peoples R China
[6] Dalian Univ Technol, Int Sch Informat Sci & Engn, Dalian 116024, Peoples R China
[7] Sch Sichuan Univ, Chengdu 610017, Peoples R China
Funding
Australian Research Council; National Natural Science Foundation of China
Keywords
Artificial intelligence; Cloud computing; 5G mobile communication; Base stations; Task analysis; Mobile handsets; Heuristic algorithms; Inference offloading; mobile edge clouds; approximation and online algorithms; ALGORITHMS; IOT;
DOI
10.1109/TPDS.2020.3032443
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
With increasing focus on Artificial Intelligence (AI) applications, Deep Neural Networks (DNNs) have been successfully used in a number of application areas. As the number of layers and neurons in DNNs increases rapidly, significant computational resources are needed to execute a learned DNN model. This ever-increasing resource demand of DNNs is currently met by large-scale data centers with state-of-the-art GPUs. However, the increasing availability of mobile edge computing and 5G technologies provides new possibilities for DNN-driven AI applications, especially where these applications make use of data sets that are distributed across different locations. One fundamental process of a DNN-driven application in mobile edge clouds is the adoption of "inferencing", the process of executing a pre-trained DNN on newly generated image and video data from mobile devices. We investigate offloading DNN inference requests in a 5G-enabled mobile edge cloud (MEC), with the aim of admitting as many inference requests as possible. We propose exact and approximate solutions to the problem of inference offloading in MECs. We also consider dynamic task offloading for inference requests, and devise an online algorithm that can be adapted in real time. The proposed algorithms are evaluated through large-scale simulations and a real-world test-bed implementation. The experimental results demonstrate that the proposed algorithms empirically outperform their theoretical counterparts and other similar heuristics reported in the literature.
Pages: 799-814
Number of pages: 16