An Integrated Cloud-Edge-Device Adaptive Deep Learning Service for Cross-Platform Web

Cited by: 7
Authors
Huang, Yakun [1 ]
Qiao, Xiuquan [1 ]
Tang, Jian [2 ]
Ren, Pei [2 ]
Liu, Ling [3 ]
Pu, Calton [3 ]
Chen, Junliang [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[2] Midea Grp, Shanghai 201702, Peoples R China
[3] Georgia Inst Technol, Coll Comp, Atlanta, GA 30332 USA
Funding
National Key Research and Development Program of China;
Keywords
Computational modeling; Load modeling; Adaptation models; Deep learning; Collaboration; Cloud computing; Context modeling; Mobile computing; cross-platform web; DNN compression; adaptive service; edge computing; AR;
DOI
10.1109/TMC.2021.3122279
CLC classification number
TP [Automation and Computer Technology];
Discipline classification code
0812;
Abstract
Deep learning shows great promise for bringing more intelligence to the cross-platform web. However, insufficient infrastructure, heavy models, and intensive computation limit its use in low-performing web browsers. We propose DeepAdapter, an integrated cloud-edge-device framework that ties the edge, the remote cloud, and the device together through cross-platform web technology to deliver adaptive deep learning services with lower latency, lower mobile energy consumption, and higher system throughput. DeepAdapter consists of context-aware pruning, service updating, and online scheduling. First, the offline pruning module provides a context-aware pruning algorithm that incorporates the latency requirement, the network condition, and the device's computing capability to fit various contexts. Second, the service updating module optimizes the branch-model cache on the edge for massive numbers of mobile users and handles new model-pruning requirements. Third, the online scheduling module matches optimal branch models to mobile users. In addition, a two-stage DRL-based online scheduling method named DeepScheduler handles highly concurrent requests between edge centers and the remote cloud by means of a reward prediction model. Extensive experiments show that DeepAdapter decreases average latency by 1.33x, reduces average mobile energy consumption by 1.4x, and improves system throughput by 2.1x while retaining considerable accuracy.
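The abstract's online scheduling step, matching each user to an optimal pruned branch model given network condition and device capability, can be illustrated with a minimal sketch. Everything below (the `BranchModel` fields, `match_branch`, and the numbers in the model zoo) is a hypothetical simplification, not the paper's actual algorithm: it greedily picks the most accurate branch whose estimated download-plus-compute latency fits the user's budget.

```python
# Hypothetical sketch of DeepAdapter-style branch-model matching.
# Assumption: end-to-end latency ≈ model transfer time + on-device compute time.
from dataclasses import dataclass

@dataclass
class BranchModel:
    name: str
    size_mb: float      # download size of the pruned branch
    gflops: float       # compute cost of one inference
    accuracy: float     # top-1 accuracy retained after pruning

def match_branch(models, bandwidth_mbps, device_gflops, latency_budget_s):
    """Return the highest-accuracy branch meeting the latency budget, or None."""
    feasible = []
    for m in models:
        transfer = m.size_mb * 8 / bandwidth_mbps   # seconds to fetch the model
        compute = m.gflops / device_gflops          # seconds for one inference
        if transfer + compute <= latency_budget_s:
            feasible.append(m)
    return max(feasible, key=lambda m: m.accuracy) if feasible else None

# Illustrative model zoo with made-up sizes and accuracies.
zoo = [
    BranchModel("full",     size_mb=16.0, gflops=4.0, accuracy=0.76),
    BranchModel("pruned50", size_mb=8.0,  gflops=2.0, accuracy=0.73),
    BranchModel("pruned75", size_mb=4.0,  gflops=1.0, accuracy=0.69),
]

best = match_branch(zoo, bandwidth_mbps=20.0, device_gflops=5.0,
                    latency_budget_s=4.0)
# With these numbers the full model misses the budget (6.4 s transfer alone),
# so the scheduler falls back to the most accurate feasible branch, "pruned50".
```

A real scheduler would also account for edge cache hits and request concurrency (the role of DeepScheduler's reward prediction model); this sketch only shows the per-user context-to-branch matching idea.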
Pages: 1950-1967
Page count: 18