VisionScaling: Dynamic Deep Learning Model and Resource Scaling in Mobile Vision Applications

Citations: 3
Authors
Choi, Pyeongjun [1]
Ham, Dongho [1]
Kim, Yeongjin [2]
Kwak, Jeongho [1]
Affiliations
[1] Daegu Gyeongbuk Inst Sci & Technol, Dept Elect Engn & Comp Sci, Daegu 42988, South Korea
[2] Inha Univ, Elect & Comp Engn, Incheon 22212, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Computation offloading; deep learning; dynamic voltage and frequency scaling (DVFS); mobile vision service; model scaling; online convex optimization (OCO); ALLOCATION; OPTIMIZATION;
DOI
10.1109/JIOT.2024.3349512
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
As deep learning technology advances, mobile vision applications such as augmented reality (AR) and autonomous vehicles have become prevalent. The performance of such services depends heavily on the computing capability of diverse mobile devices, dynamic service requests, the stochastic mobile network environment, and the learning models. Existing studies have optimized mobile resource allocation and learning model design independently, each taking the other side's parameters and the computing/network resources as given. However, they cannot reflect realistic mobile environments because they assume that the time-varying wireless channel and service requests follow specific distributions. Without these unrealistic assumptions, we propose VisionScaling, an algorithm that jointly optimizes learning models and processing/network resources while adapting to system dynamics, by leveraging the state-of-the-art online convex optimization (OCO) framework. In every time slot, VisionScaling jointly decides 1) the learning model and the input layer size on the learning side and 2) the GPU clock frequency, the transmission rate, and the computation offloading policy on the resource side. We theoretically show that VisionScaling asymptotically converges to the offline optimal performance while satisfying sublinearity. Moreover, real trace-driven simulations demonstrate that VisionScaling reduces dynamic regret, which captures energy consumption and processed frames per second (PFPS) under a mean average precision (mAP) constraint, by at least 24%. Finally, on a testbed with an Nvidia Jetson TX2 and an edge server equipped with a high-end GPU, VisionScaling attains a 30.8% energy saving and a 39.7% PFPS improvement while satisfying the target mAP.
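To make the notion of dynamic regret in the abstract concrete, the following is a minimal sketch of generic projected online gradient descent on a toy scalar problem. It is not the VisionScaling algorithm: the quadratic loss, the drifting target, the step size, and all function names are assumptions chosen only for illustration. Dynamic regret here is the cumulative gap between the learner's loss and the loss of the best decision in each individual slot.

```python
import math


def project(x, lo=0.0, hi=1.0):
    """Project a scalar decision onto the feasible interval [lo, hi]."""
    return min(max(x, lo), hi)


def online_gradient_descent(targets, step=0.5, x0=0.5):
    """Run projected OGD on the hypothetical loss f_t(x) = (x - c_t)^2.

    The learner commits to x before observing c_t, then takes one
    gradient step. Returns the decisions and the dynamic regret, i.e.
    the summed gap to the per-slot minimizer over [0, 1].
    """
    x = x0
    decisions = []
    regret = 0.0
    for c in targets:
        decisions.append(x)
        loss = (x - c) ** 2
        best = (project(c) - c) ** 2  # loss of the best decision in this slot
        regret += loss - best
        grad = 2.0 * (x - c)
        x = project(x - step * grad)
    return decisions, regret


def drifting_targets(horizon):
    """Toy environment (assumed): one slow sine cycle over the horizon."""
    return [0.5 + 0.4 * math.sin(2 * math.pi * t / horizon)
            for t in range(horizon)]


if __name__ == "__main__":
    for horizon in (100, 1000, 10000):
        _, reg = online_gradient_descent(drifting_targets(horizon))
        print(horizon, reg / horizon)
```

In this toy setting the environment drifts more slowly per slot as the horizon grows, so the time-averaged dynamic regret printed by the script shrinks with the horizon, which is the behavior a sublinear regret bound describes.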
Pages: 15523-15539
Page count: 17